SOTAVerified

Video Recognition

Video recognition is the process of obtaining, processing, and analysing data from a visual source, specifically video.

Papers

Showing 51–100 of 307 papers

Title | Status | Hype
ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video | Code | 1
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning | Code | 1
Phase-Specific Augmented Reality Guidance for Microscopic Cataract Surgery Using Long-Short Spatiotemporal Aggregation Transformer | - | 0
Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving | - | 0
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers | Code | 1
Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition | Code | 0
Audio-Visual Class-Incremental Learning | Code | 1
Temporal-Distributed Backdoor Attack Against Video Based Action Recognition | - | 0
Audio-Visual Glance Network for Efficient Video Recognition | - | 0
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model | Code | 1
Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | Code | 0
On the Importance of Spatial Relations for Few-shot Action Recognition | - | 0
View while Moving: Efficient Video Recognition in Long-untrimmed Videos | - | 0
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation | Code | 1
What Can Simple Arithmetic Operations Do for Temporal Modeling? | Code | 1
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition | Code | 1
TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter | Code | 0
Enhanced Multimodal Representation Learning with Cross-modal KD | - | 0
A two-way translation system of Chinese sign language based on computer vision | - | 0
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | Code | 0
Spatiotemporal Attention-based Semantic Compression for Real-time Video Recognition | - | 0
Inter-frame Accelerate Attack against Video Interpolation Models | - | 0
Multi-object Video Generation from Single Frame Layouts | - | 0
Implicit Temporal Modeling with Learnable Alignment for Video Recognition | Code | 1
Use Your Head: Improving Long-Tail Video Recognition | Code | 0
Frame Flexible Network | Code | 1
The effectiveness of MAE pre-pretraining for billion-scale pretraining | Code | 1
Efficient Decision-based Black-box Patch Attacks on Video Recognition | - | 0
Video Action Recognition with Attentive Semantic Units | - | 0
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge | Code | 1
Making Vision Transformers Efficient from A Token Sparsification View | Code | 1
MRET: Multi-resolution Transformer for Video Quality Assessment | - | 0
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition | Code | 0
Video4MRI: An Empirical Study on Brain Magnetic Resonance Image Analytics with CNN-based Video Classification Frameworks | - | 0
Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization | Code | 1
Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring | Code | 1
Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos | Code | 0
Tiny Updater: Towards Efficient Neural Network-Driven Software Updating | Code | 0
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models | Code | 2
Efficient Movie Scene Detection using State-Space Transformers | Code | 1
Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition | - | 0
VLG: General Video Recognition with Web Textual Knowledge | Code | 1
SVFormer: Semi-supervised Video Transformer for Action Recognition | Code | 1
Look More but Care Less in Video Recognition | Code | 1
Temporal superimposed crossover module for effective continuous sign language | Code | 0
Cluster and Aggregate: Face Recognition with Large Probe Set | Code | 1
Towards a Unified View on Visual Parameter-Efficient Transfer Learning | Code | 1
REST: REtrieve & Self-Train for generative action recognition | - | 0
AdaFocusV3: On Unified Spatial-temporal Dynamic Video Recognition | Code | 1
Rethinking Resolution in the Context of Efficient Video Recognition | Code | 1

No leaderboard results yet.