
Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers


Showing 751–775 of 1149 papers

Title	Date	Tasks	Status
Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention	Apr 10, 2024	Action Anticipation, Graph Neural Network	Unverified
Koala: Key frame-conditioned long video-LLM	Apr 5, 2024	Action Recognition, Question Answering	Unverified
BioVL-QR: Egocentric Biochemical Vision-and-Language Dataset Using Micro QR Codes	Apr 4, 2024	Object, Video Understanding	Unverified
OW-VISCapTor: Abstractors for Open-World Video Instance Segmentation and Captioning	Apr 4, 2024	Descriptive, Diversity	Unverified
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	Apr 2, 2024	Highlight Detection, Moment Retrieval	Unverified
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	Mar 31, 2024	Highlight Detection, Moment Retrieval	Unverified
Instrument-tissue Interaction Detection Framework for Surgical Video Understanding	Mar 30, 2024	Video Understanding	Unverified
A Unified Framework for Human-centric Point Cloud Video Understanding	Mar 29, 2024	3D Pose Estimation, Action Recognition	Unverified
Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality	Mar 28, 2024	Data Augmentation, Diversity	Code Available
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding	Mar 24, 2024	Dense Video Captioning, Temporal Localization	Unverified
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding	Mar 21, 2024	Pose Estimation, Video Understanding	Code Available
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding	Mar 18, 2024	EgoSchema, Video Understanding	Unverified
Don't Judge by the Look: Towards Motion Coherent Video Representation	Mar 14, 2024	Data Augmentation, Object Recognition	Code Available
Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions	Mar 11, 2024	counterfactual, Video Editing	Unverified
A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives	Mar 5, 2024	Video Understanding	Unverified
MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies	Mar 3, 2024	Text Generation, Video Understanding	Unverified
Abductive Ego-View Accident Video Understanding for Safe Driving Perception	Mar 1, 2024	Object, object-detection	Unverified
TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning	Feb 29, 2024	Question Answering, Video Understanding	Unverified
LLMs Meet Long Video: Advancing Long Video Question Answering with An Interactive Visual Adapter in LLMs	Feb 21, 2024	Question Answering, Video Question Answering	Unverified
Slot-VLM: SlowFast Slots for Video-Language Modeling	Feb 20, 2024	Language Modeling, Language Modelling	Unverified
VideoPrism: A Foundational Visual Encoder for Video Understanding	Feb 20, 2024	Question Answering, Video Question Answering	Unverified
Dynamics Based Neural Encoding with Inter-Intra Region Connectivity	Feb 19, 2024	Video Understanding	Unverified
Are you Struggling? Dataset and Baselines for Struggle Determination in Assembly Videos	Feb 16, 2024	Decision Making, Video Understanding	Code Available
Memory Consolidation Enables Long-Context Video Understanding	Feb 8, 2024	EgoSchema, Video Understanding	Unverified
A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming	Jan 30, 2024	Video Generation, Video Understanding	Unverified

