SOTAVerified

Video Recognition

Video recognition is the process of acquiring, processing, and analysing data from a visual source, specifically video.

Papers

Showing 251–275 of 307 papers

Title | Status | Hype
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition | Code | 0
PosMLP-Video: Spatial and Temporal Relative Position Encoding for Efficient Video Recognition | Code | 0
Heuristic Black-box Adversarial Attacks on Video Recognition Models | Code | 0
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | Code | 0
ST-ABN: Visual Explanation Taking into Account Spatio-temporal Information for Video Recognition | Code | 0
Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition | Code | 0
VidConv: A modernized 2D ConvNet for Efficient Video Recognition | Code | 0
Collaborative Spatio-temporal Feature Learning for Video Action Recognition | Code | 0
TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter | Code | 0
Inter-intra Variant Dual Representations for Self-supervised Video Recognition | Code | 0
Collaborative Spatiotemporal Feature Learning for Video Action Recognition | Code | 0
Adaptive occlusion sensitivity analysis for visually explaining video recognition networks | Code | 0
A^2-Nets: Double Attention Networks | Code | 0
QTTNet: Quantized Tensor Train Neural Networks for 3D Object and Video Recognition | Code | 0
Multi-Modal Multi-Action Video Recognition | Code | 0
Don't Judge by the Look: Towards Motion Coherent Video Representation | Code | 0
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition | Code | 0
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP | Code | 0
DriftNet: Aggressive Driving Behavior Classification using 3D EfficientNet Architecture | Code | 0
VideoPure: Diffusion-based Adversarial Purification for Video Recognition | Code | 0
Revisiting 3D ResNets for Video Recognition | Code | 0
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution | Code | 0
Video Transformer Network | Code | 0
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models | Code | 0
Object-centric Video Representation for Long-term Action Anticipation | Code | 0

No leaderboard results yet.