| Open-Vocabulary Video Relation Extraction | Dec 25, 2023 | Action Classification, Action Understanding | Code Available | 1 |
| No More Shortcuts: Realizing the Potential of Temporal Self-Supervision | Dec 20, 2023 | Action Classification, Attribute | Unverified | 0 |
| ST(OR)²: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room | Dec 19, 2023 | Action Classification, Activity Recognition | Unverified | 0 |
| Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living | Nov 30, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| CAST: Cross-Attention in Space and Time for Video Action Recognition | Nov 30, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition | Nov 28, 2023 | Action Classification, Action Recognition | Unverified | 0 |
| Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning | Nov 27, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| ADM-Loc: Actionness Distribution Modeling for Point-supervised Temporal Action Localization | Nov 27, 2023 | Action Classification, Action Detection | Unverified | 0 |
| Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities | Nov 9, 2023 | Action Classification, Audio Classification | Unverified | 0 |
| OmniVec: Learning robust representations with cross modal sharing | Nov 7, 2023 | 3D Point Cloud Classification, Action Classification | Unverified | 0 |
| Asymmetric Masked Distillation for Pre-Training Small Foundation Models | Nov 6, 2023 | Action Classification, Action Recognition | Code Available | 0 |
| After-Stroke Arm Paresis Detection using Kinematic Data | Nov 3, 2023 | Action Classification, Knowledge Distillation | Unverified | 0 |
| Proposal-based Temporal Action Localization with Point-level Supervision | Oct 9, 2023 | Action Classification, Action Localization | Unverified | 0 |
| ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video | Oct 2, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| SkeleTR: Towards Skeleton-based Action Recognition in the Wild | Sep 20, 2023 | Action Classification, Action Detection | Unverified | 0 |
| MOFO: MOtion FOcused Self-Supervision for Video Understanding | Aug 23, 2023 | Action Classification, Action Recognition | Code Available | 0 |
| Progression-Guided Temporal Action Detection in Videos | Aug 18, 2023 | Action Classification, Action Detection | Code Available | 0 |
| ALIP: Adaptive Language-Image Pre-training with Synthetic Caption | Aug 16, 2023 | Action Classification, Image-text Retrieval | Code Available | 1 |
| Temporally-Adaptive Models for Efficient Video Understanding | Aug 10, 2023 | Action Classification, Action Recognition | Code Available | 0 |
| Joint Skeletal and Semantic Embedding Loss for Micro-gesture Classification | Jul 20, 2023 | Action Classification, Classification | Code Available | 0 |
| Actor-agnostic Multi-label Action Recognition with Multi-modal Query | Jul 20, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| What Can Simple Arithmetic Operations Do for Temporal Modeling? | Jul 18, 2023 | Action Classification, Action Recognition | Code Available | 1 |
| Semi Supervised Meta Learning for Spatiotemporal Learning | Jul 9, 2023 | Action Classification, Classification | Unverified | 0 |
| Spiking Two-Stream Methods with Unsupervised STDP-based Learning for Action Recognition | Jun 23, 2023 | Action Classification, Action Recognition | Unverified | 0 |
| Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers | Jun 15, 2023 | Action Classification, Action Recognition | Code Available | 1 |