Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Showing 781–790 of 1149 papers

Title	Date	Tasks	Status	Hype
Dr^2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning	Jan 8, 2024	Object Detection	Code Available	0
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding	Jan 1, 2024	Spatio-Temporal Video Grounding, Video Grounding	Unverified	0
Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning	Jan 1, 2024	Object Detection	Unverified	0
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action	Jan 1, 2024	Image Generation, Instruction Following	Unverified	0
Enhanced Motion-Text Alignment for Image-to-Video Transfer Learning	Jan 1, 2024	Transfer Learning, Video Understanding	Unverified	0
Video-GroundingDINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding	Dec 31, 2023	Spatio-Temporal Video Grounding, Video Grounding	Unverified	0
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision	Dec 20, 2023	Action Classification, Attribute	Unverified	0
Text-Conditioned Resampler For Long Form Video Understanding	Dec 19, 2023	EgoSchema, Form	Unverified	0
Learning Object State Changes in Videos: An Open-World Perspective	Dec 19, 2023	Video Understanding	Unverified	0
Artificial intelligence optical hardware empowers high-resolution hyperspectral video understanding at 1.2 Tb/s	Dec 17, 2023	Semantic Segmentation, Video Semantic Segmentation	Unverified	0

No leaderboard results yet.