SOTAVerified

Video Recognition

Video recognition is the process of obtaining, processing, and analysing data from a visual source, specifically video.

Papers

Showing 51–100 of 307 papers

| Title | Status | Hype |
| --- | --- | --- |
| MAR: Masked Autoencoders for Efficient Action Recognition | Code | 1 |
| In Defense of Image Pre-Training for Spatiotemporal Recognition | Code | 1 |
| Long Movie Clip Classification with State-Space Video Models | Code | 1 |
| Group Contextualization for Video Recognition | Code | 1 |
| Fast Differentiable Matrix Square Root and Inverse Square Root | Code | 1 |
| MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Code | 1 |
| OCSampler: Compressing Videos to One Clip with Single-step Sampling | Code | 1 |
| Glance and Focus Networks for Dynamic Visual Recognition | Code | 1 |
| AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition | Code | 1 |
| DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition | Code | 1 |
| MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | Code | 1 |
| Pooling by Sliced-Wasserstein Embedding | Code | 1 |
| TokenLearner: Adaptive Space-Time Tokenization for Videos | Code | 1 |
| Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning | Code | 1 |
| Efficient Video Transformers with Spatial-Temporal Token Selection | Code | 1 |
| Attacking Video Recognition Models with Bullet-Screen Comments | Code | 1 |
| Temporal-attentive Covariance Pooling Networks for Video Recognition | Code | 1 |
| Boosting the Transferability of Video Adversarial Examples via Temporal Translation | Code | 1 |
| Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition | Code | 1 |
| Dynamic Network Quantization for Efficient Video Inference | Code | 1 |
| Can An Image Classifier Suffice For Action Recognition? | Code | 1 |
| Towards Long-Form Video Understanding | Code | 1 |
| TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? | Code | 1 |
| Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting | Code | 1 |
| PyKale: Knowledge-Aware Machine Learning from Multiple Sources in Python | Code | 1 |
| Space-time Mixing Attention for Video Transformer | Code | 1 |
| Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | Code | 1 |
| DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning | Code | 1 |
| Sharing Pain: Using Pain Domain Transfer for Video Recognition of Low Grade Orthopedic Pain in Horses | Code | 1 |
| AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition | Code | 1 |
| Adaptive Focus for Efficient Video Recognition | Code | 1 |
| VideoLT: Large-scale Long-tailed Video Recognition | Code | 1 |
| FrameExit: Conditional Early Exiting for Efficient Video Recognition | Code | 1 |
| Multiscale Vision Transformers | Code | 1 |
| Visual Semantic Role Labeling for Video Understanding | Code | 1 |
| Learning Versatile Neural Architectures by Propagating Network Codes | Code | 1 |
| MoViNets: Mobile Video Networks for Efficient Video Recognition | Code | 1 |
| PatchNet -- Short-range Template Matching for Efficient Video Processing | Code | 1 |
| Piano Skills Assessment | Code | 1 |
| MVFNet: Multi-View Fusion Network for Efficient Video Recognition | Code | 1 |
| Learning Equivariant Representations | Code | 1 |
| Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition | Code | 1 |
| Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing | Code | 1 |
| Learning Temporally Invariant and Localizable Features via Data Augmentation for Video Recognition | Code | 1 |
| Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework | Code | 1 |
| RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition | Code | 1 |
| Adversarial Bipartite Graph Learning for Video Domain Adaptation | Code | 1 |
| Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation | Code | 1 |
| Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition | Code | 1 |
| Video Panoptic Segmentation | Code | 1 |

No leaderboard results yet.