
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 81–90 of 1149 papers

Title	Date	Tasks	Status	Hype
SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding	Feb 15, 2025	Question Answering, Streaming video understanding	Code Available	2
AIN: The Arabic INclusive Large Multimodal Model	Jan 31, 2025	document understanding, model	Code Available	2
TinyLLaVA-Video: A Simple Framework of Small-scale Large Multimodal Models for Video Understanding	Jan 26, 2025	Video Understanding	Code Available	2
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge	Jan 23, 2025	Scheduling, Streaming video understanding	Code Available	2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding	Jan 21, 2025	Video Understanding	Code Available	2
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?	Jan 9, 2025	Benchmarking, Video Understanding	Code Available	2
Adaptive Keyframe Sampling for Long Video Understanding	Jan 1, 2025	Video Understanding	Code Available	2
Online Video Understanding: OVBench and VideoChat-Online	Dec 31, 2024	Autonomous Driving, Question Answering	Code Available	2
FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models	Dec 30, 2024	Question Answering, Token Reduction	Code Available	2
PruneVid: Visual Token Pruning for Efficient Video Large Language Models	Dec 20, 2024	Video Understanding	Code Available	2

