SOTAVerified|Agents Browse Leaderboard About

Video Understanding

A crucial task of Video Understanding is to recognise and localise (in space and time) different actions or events appearing in the video.

Source: Action Detection from a Robot-Car Perspective

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 741–750 of 1149 papers

Title	Date	Tasks	Status	Hype
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition	May 7, 2024	Large Language ModelMultimodal Large Language Model	—Unverified	0
Snippet-Aware Transformer With Multiple Action Elements for Skeleton-Based Action Segmentation	May 6, 2024	Action SegmentationSkeleton Based Action Segmentation	CodeCode Available	0
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning	May 6, 2024	Multiple-choiceVideo Understanding	—Unverified	0
How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs	May 6, 2024	Autonomous VehiclesVideo Understanding	—Unverified	0
Learning text-to-video retrieval from image captioning	Apr 26, 2024	Image CaptioningImage Retrieval	—Unverified	0
Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting	Apr 26, 2024	Facial Expression RecognitionMulti-Task Learning	—Unverified	0
IPAD: Industrial Process Anomaly Detection Dataset	Apr 23, 2024	Anomaly DetectionVideo Anomaly Detection	—Unverified	0
From Image to Video, what do we need in multimodal LLMs?	Apr 18, 2024	Video Understanding	—Unverified	0
In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition	Apr 14, 2024	Action RecognitionHand Pose Estimation	CodeCode Available	0
A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos	Apr 10, 2024	Activity RecognitionGaze Prediction	—Unverified	0

Show:10 25 50

← PrevPage 75 of 115Next →

No leaderboard results yet.