SOTAVerified

Video Recognition

Video recognition is the process of acquiring, processing, and analysing data from a visual source, specifically video.

Papers

Showing 101-150 of 307 papers

Title | Status | Hype
0-MMS: Zero-Shot Multi-Motion Segmentation With A Monocular Event Camera | Code | 1
TAM: Temporal Adaptive Module for Video Recognition | Code | 1
CatNet: Class Incremental 3D ConvNets for Lifelong Egocentric Gesture Recognition | Code | 1
Improved Residual Networks for Image and Video Recognition | Code | 1
Clean-Label Backdoor Attacks on Video Recognition Models | Code | 1
V4D:4D Convolutional Neural Networks for Video-level Representation Learning | Code | 1
Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks | Code | 1
Large Scale Holistic Video Understanding | Code | 1
SlowFast Networks for Video Recognition | Code | 1
TSM: Temporal Shift Module for Efficient Video Understanding | Code | 1
Deep Feature Flow for Video Recognition | Code | 1
Clockwork Convnets for Video Semantic Segmentation | Code | 1
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | Code | 0
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models | Code | 0
Gameplay Highlights Generation | - | 0
Fast Adversarial Training with Weak-to-Strong Spatial-Temporal Consistency in the Frequency Domain on Videos | - | 0
CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition | - | 0
Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering | - | 0
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP | Code | 0
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition | - | 0
A Simple and Efficient Baseline for Video Action Recognition | - | 0
VideoPure: Diffusion-based Adversarial Purification for Video Recognition | Code | 0
Action Detail Matters: Refining Video Recognition with Local Action Queries | - | 0
DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments | - | 0
Standardization Trends on Safety and Trustworthiness Technology for Advanced AI | - | 0
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer | Code | 0
A Novel Audio-Visual Information Fusion System for Mental Disorders Detection | - | 0
GenRec: Unifying Video Generation and Recognition with Diffusion Models | Code | 0
Purification Of Contaminated Convolutional Neural Networks Via Robust Recovery: An Approach with Theoretical Guarantee in One-Hidden-Layer Case | - | 0
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition | Code | 0
MeMSVD: Long-Range Temporal Structure Capturing Using Incremental SVD | - | 0
Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions | - | 0
Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios | - | 0
Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition | - | 0
LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model | - | 0
Don't Judge by the Look: Towards Motion Coherent Video Representation | Code | 0
Percept, Chat, and then Adapt: Multimodal Knowledge Transfer of Foundation Models for Open-World Video Recognition | - | 0
Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition | Code | 0
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Code | 0
Motion Guided Token Compression for Efficient Masked Video Modeling | - | 0
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification | - | 0
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition | Code | 0
LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer | Code | 0
Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition | Code | 0
Object-centric Video Representation for Long-term Action Anticipation | Code | 0
On the Relevance of Temporal Features for Medical Ultrasound Video Recognition | Code | 0
Phase-Specific Augmented Reality Guidance for Microscopic Cataract Surgery Using Long-Short Spatiotemporal Aggregation Transformer | - | 0
Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving | - | 0
Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition | Code | 0
Temporal-Distributed Backdoor Attack Against Video Based Action Recognition | - | 0

No leaderboard results yet.