| LLaVAction: evaluating and training multi-modal large language models for action recognition | Mar 24, 2025 | Action RecognitionAction Understanding | CodeCode Available | 2 |
| OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding | Jun 11, 2024 | Action UnderstandingDiversity | CodeCode Available | 2 |
| F^3Set: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos | Apr 11, 2025 | Action UnderstandingEvent Detection | CodeCode Available | 1 |
| SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization | Jan 2, 2025 | Action RecognitionAction Understanding | CodeCode Available | 1 |
| Language-Assisted Skeleton Action Understanding for Skeleton-Based Temporal Action Segmentation | Oct 31, 2024 | Action SegmentationAction Understanding | CodeCode Available | 1 |
| EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding | Jun 13, 2024 | Action ClassificationAction Localization | CodeCode Available | 1 |
| FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment | May 11, 2024 | Action Quality AssessmentAction Understanding | CodeCode Available | 1 |
| Sports-QA: A Large-Scale Video Question Answering Benchmark for Complex and Professional Sports | Jan 3, 2024 | Action Understandingcounterfactual | CodeCode Available | 1 |
| FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding | Jan 1, 2024 | Action AnalysisAction Understanding | CodeCode Available | 1 |
| Open-Vocabulary Video Relation Extraction | Dec 25, 2023 | Action ClassificationAction Understanding | CodeCode Available | 1 |