| Spotting Temporally Precise, Fine-Grained Events in Video | Jul 20, 2022 | Action DetectionAction Spotting | CodeCode Available | 1 |
| Clover: Towards A Unified Video-Language Alignment and Fusion Model | Jul 16, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Is Appearance Free Action Recognition Possible? | Jul 13, 2022 | Action RecognitionOptical Flow Estimation | CodeCode Available | 1 |
| Federated Self-supervised Learning for Video Understanding | Jul 5, 2022 | Action RecognitionFederated Learning | CodeCode Available | 1 |
| ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning | Jun 27, 2022 | Action ClassificationAction Recognition | CodeCode Available | 1 |
| REVECA -- Rich Encoder-decoder framework for Video Event CAptioner | Jun 18, 2022 | DecoderSemantic Segmentation | CodeCode Available | 1 |
| Stand-Alone Inter-Frame Attention in Video Models | Jun 14, 2022 | Action ClassificationAction Recognition | CodeCode Available | 1 |
| A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector | Jun 7, 2022 | Action ClassificationAction Detection | CodeCode Available | 1 |
| From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering | May 30, 2022 | counterfactualDescriptive | CodeCode Available | 1 |
| Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions | May 19, 2022 | Contrastive LearningSelf-Supervised Learning | CodeCode Available | 1 |
| ETAD: Training Action Detection End to End on a Laptop | May 14, 2022 | Action DetectionGPU | CodeCode Available | 1 |
| BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection | May 5, 2022 | Action Detectionobject-detection | CodeCode Available | 1 |
| A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions | Apr 21, 2022 | Action DetectionVideo Understanding | CodeCode Available | 1 |
| Temporal Alignment Networks for Long-term Video | Apr 6, 2022 | Action RecognitionAction Segmentation | CodeCode Available | 1 |
| An Empirical Study of End-to-End Temporal Action Detection | Apr 6, 2022 | Action ClassificationAction Detection | CodeCode Available | 1 |
| Long Movie Clip Classification with State-Space Video Models | Apr 4, 2022 | ClassificationDecoder | CodeCode Available | 1 |
| SPAct: Self-supervised Privacy Preservation for Action Recognition | Mar 29, 2022 | Action ClassificationAction Recognition | CodeCode Available | 1 |
| How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning? | Mar 27, 2022 | Self-Supervised LearningSensitivity | CodeCode Available | 1 |
| Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment | Feb 28, 2022 | 3D Action RecognitionAction Analysis | CodeCode Available | 1 |
| Learning Optical Flow with Adaptive Graph Reasoning | Feb 8, 2022 | Motion EstimationOptical Flow Estimation | CodeCode Available | 1 |
| A Dataset for Medical Instructional Video Classification and Question Answering | Jan 30, 2022 | ClassificationQuestion Answering | CodeCode Available | 1 |
| Video Joint Modelling Based on Hierarchical Transformer for Co-summarization | Dec 27, 2021 | RetrievalSupervised Video Summarization | CodeCode Available | 1 |
| Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation | Dec 16, 2021 | Contrastive LearningRepresentation Learning | CodeCode Available | 1 |
| Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection | Dec 9, 2021 | Boundary DetectionDiversity | CodeCode Available | 1 |
| Prompting Visual-Language Models for Efficient Video Understanding | Dec 8, 2021 | Action RecognitionLanguage Modelling | CodeCode Available | 1 |