| HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding | Jul 9, 2023 | Action Recognition, Action Segmentation | Code Available | 0 | 5 |
| Hallucination Mitigation Prompts Long-term Video Understanding | Jun 17, 2024 | Answer Generation, Hallucination | Code Available | 0 | 5 |
| Video action detection by learning graph-based spatio-temporal interactions | Dec 9, 2019 | Action Detection, Action Localization | Code Available | 0 | 5 |
| Spatio-Temporal Perturbations for Video Attribution | Sep 1, 2021 | Video Understanding | Code Available | 0 | 5 |
| 4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding | Mar 22, 2025 | Benchmarking, Object | Code Available | 0 | 5 |
| SoccerNet 2024 Challenges Results | Sep 16, 2024 | Action Spotting, Dense Video Captioning | Code Available | 0 | 5 |
| Streaming Detection of Queried Event Start | Dec 4, 2024 | Autonomous Driving, parameter-efficient fine-tuning | Code Available | 0 | 5 |
| Situational Scene Graph for Structured Human-centric Situation Understanding | Oct 30, 2024 | Graph Generation, Predicate Classification | Code Available | 0 | 5 |
| Creative Flow+ Dataset | Jun 1, 2019 | 3D Character Animation From A Single Photo, Depth Estimation | Code Available | 0 | 5 |
| ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding | Oct 1, 2024 | Contrastive Learning, Hallucination | Code Available | 0 | 5 |
| Screencast Tutorial Video Understanding | Jun 1, 2020 | object-detection, Object Detection | Code Available | 0 | 5 |
| ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding | May 29, 2025 | Avg, Video Understanding | Code Available | 0 | 5 |
| SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding | Apr 30, 2025 | Video Understanding | Code Available | 0 | 5 |
| Snippet-Aware Transformer With Multiple Action Elements for Skeleton-Based Action Segmentation | May 6, 2024 | Action Segmentation, Skeleton Based Action Segmentation | Code Available | 0 | 5 |
| Gaussian Temporal Awareness Networks for Action Localization | Sep 9, 2019 | Action Localization, object-detection | Code Available | 0 | 5 |
| Relation-aware Hierarchical Attention Framework for Video Question Answering | May 13, 2021 | Question Answering, Relation | Code Available | 0 | 5 |
| Re-ID-AR: Improved Person Re-identification in Video via Joint Weakly Supervised Action Recognition | Nov 1, 2021 | Action Recognition, Person Re-Identification | Code Available | 0 | 5 |
| Representation Flow for Action Recognition | Oct 2, 2018 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Contextual Explainable Video Representation: Human Perception-based Understanding | Dec 12, 2022 | Action Detection, Action Recognition | Code Available | 0 | 5 |
| DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture | Apr 18, 2020 | Anomaly Detection, Classification | Code Available | 0 | 5 |
| Recurrent Space-time Graph Neural Networks | Apr 11, 2019 | Action Recognition, Human-Object Interaction Detection | Code Available | 0 | 5 |
| FriendsQA: A New Large-Scale Deep Video Understanding Dataset with Fine-grained Topic Categorization for Story Videos | Dec 22, 2024 | Language Modelling, Large Language Model | Code Available | 0 | 5 |
| Constrained-size Tensorflow Models for YouTube-8M Video Understanding Challenge | Aug 21, 2018 | Video Understanding | Code Available | 0 | 5 |
| VideoDG: Generalizing Temporal Relations in Videos to Novel Domains | Dec 8, 2019 | Action Recognition, Data Augmentation | Code Available | 0 | 5 |
| SoccerChat: Integrating Multimodal Data for Enhanced Soccer Game Understanding | May 22, 2025 | Action Classification, Automatic Speech Recognition | Code Available | 0 | 5 |