SOTAVerified

Video Recognition

Video recognition is the process of obtaining, processing, and analysing data from a visual source, specifically video.

Papers

Showing 1-25 of 307 papers

| Title | Status | Hype |
| --- | --- | --- |
| DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | Code | 0 |
| VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models | Code | 0 |
| Gameplay Highlights Generation | | 0 |
| Fast Adversarial Training with Weak-to-Strong Spatial-Temporal Consistency in the Frequency Domain on Videos | | 0 |
| CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition | | 0 |
| Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering | | 0 |
| BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation | Code | 1 |
| PAVE: Patching and Adapting Video Large Language Models | Code | 1 |
| VTD-CLIP: Video-to-Text Discretization via Prompting CLIP | Code | 0 |
| Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition | | 0 |
| A Simple and Efficient Baseline for Video Action Recognition | | 0 |
| VideoPure: Diffusion-based Adversarial Purification for Video Recognition | Code | 0 |
| Action Detail Matters: Refining Video Recognition with Local Action Queries | | 0 |
| DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments | | 0 |
| Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition | Code | 2 |
| Standardization Trends on Safety and Trustworthiness Technology for Advanced AI | | 0 |
| MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer | Code | 0 |
| Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations | Code | 5 |
| A Novel Audio-Visual Information Fusion System for Mental Disorders Detection | | 0 |
| GenRec: Unifying Video Generation and Recognition with Diffusion Models | Code | 0 |
| OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning | Code | 1 |
| VideoMamba: Spatio-Temporal Selective State Space Model | Code | 1 |
| Purification Of Contaminated Convolutional Neural Networks Via Robust Recovery: An Approach with Theoretical Guarantee in One-Hidden-Layer Case | | 0 |
| PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition | Code | 0 |
| MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD | | 0 |

No leaderboard results yet.