| Pooled Motion Features for First-Person Videos | Dec 19, 2014 | Activity Recognition, Activity Recognition In Videos | Code Available | 0 |
| End-to-End Learning of Motion Representation for Video Understanding | Apr 2, 2018 | Action Recognition, Optical Flow Estimation | Code Available | 0 |
| A Coding Framework and Benchmark towards Low-Bitrate Video Understanding | Feb 6, 2022 | Video Compression, Video Understanding | Code Available | 0 |
| Pairwise Emotional Relationship Recognition in Drama Videos: Dataset and Benchmark | Sep 23, 2021 | Video Understanding | Code Available | 0 |
| EgoVLM: Policy Optimization for Egocentric Video Understanding | Jun 3, 2025 | EgoSchema, Question Answering | Code Available | 0 |
| On the Pitfalls of Batch Normalization for End-to-End Video Learning: A Study on Surgical Workflow Analysis | Mar 15, 2022 | Video Understanding | Code Available | 0 |
| ECO: Efficient Convolutional Network for Online Video Understanding | Apr 24, 2018 | Action Classification, Action Recognition | Code Available | 0 |
| OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions | Nov 24, 2024 | Action Classification, Action Recognition | Code Available | 0 |
| DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture | Apr 18, 2020 | Anomaly Detection, Classification | Code Available | 0 |
| Video Representation Learning and Latent Concept Mining for Large-scale Multi-label Video Classification | Jul 5, 2017 | Attribute, General Classification | Code Available | 0 |
| DramaQA: Character-Centered Video Story Understanding with Hierarchical QA | May 7, 2020 | Question Answering, Video Question Answering | Code Available | 0 |
| Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos | Feb 16, 2024 | Decision Making, Video Understanding | Code Available | 0 |
| NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels | Oct 13, 2021 | Action Classification, Self-Supervised Learning | Code Available | 0 |
| NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification | Nov 12, 2018 | Efficient Neural Network, General Classification | Code Available | 0 |
| Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding | Nov 1, 2019 | Action Detection, Action Recognition | Code Available | 0 |
| Dr^2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning | Jan 8, 2024 | object-detection, Object Detection | Code Available | 0 |
| Multimodal Dialogue State Tracking | Jun 16, 2022 | Dialogue State Tracking, Video Understanding | Code Available | 0 |
| Don't Judge by the Look: Towards Motion Coherent Video Representation | Mar 14, 2024 | Data Augmentation, Object Recognition | Code Available | 0 |
| (Un)likelihood Training for Interpretable Embedding | Jul 1, 2022 | Ad-hoc video search, Decoder | Code Available | 0 |
| Unsupervised Adversarial Visual Level Domain Adaptation for Learning Video Object Detectors from Images | Oct 4, 2018 | Domain Adaptation, Image-to-Image Translation | Code Available | 0 |
| video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models | Jun 22, 2024 | Diversity, Language Modeling | Code Available | 0 |
| X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-modal Knowledge Transfer | Dec 12, 2023 | Action Recognition, Action Segmentation | Code Available | 0 |
| Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding | Apr 20, 2025 | Autonomous Driving, Image Captioning | Code Available | 0 |
| Diagnosing Error in Temporal Action Detectors | Jul 27, 2018 | Action Localization, Diagnostic | Code Available | 0 |
| Multi-attention Networks for Temporal Localization of Video-level Labels | Nov 15, 2019 | Action Recognition, Temporal Action Localization | Code Available | 0 |