SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 541–550 of 1149 papers

Title	Date	Tasks	Status	Hype
Koala: Key frame-conditioned long video-LLM	Apr 5, 2024	Action RecognitionQuestion Answering	—Unverified	0
BioVL-QR: Egocentric Biochemical Vision-and-Language Dataset Using Micro QR Codes	Apr 4, 2024	ObjectVideo Understanding	—Unverified	0
OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning	Apr 4, 2024	DescriptiveDiversity	—Unverified	0
LongVLM: Efficient Long Video Understanding via Large Language Models	Apr 4, 2024	Question AnsweringVideo Question Answering	CodeCode Available	2
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens	Apr 4, 2024	Language ModelingLanguage Modelling	CodeCode Available	4
SnAG: Scalable and Accurate Video Grounding	Apr 2, 2024	Video GroundingVideo Understanding	CodeCode Available	4
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	Apr 2, 2024	Highlight DetectionMoment Retrieval	—Unverified	0
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	Mar 31, 2024	Highlight DetectionMoment Retrieval	—Unverified	0
Instrument-tissue Interaction Detection Framework for Surgical Video Understanding	Mar 30, 2024	Video Understanding	—Unverified	0
ST-LLM: Large Language Models Are Effective Temporal Learners	Mar 30, 2024	MVBenchReading Comprehension	CodeCode Available	2

Show:10 25 50

← PrevPage 55 of 115Next →

No leaderboard results yet.