SOTAVerified

Zero-Shot Action Recognition

Papers

Showing 51-75 of 83 papers

| Title | Status | Hype |
|---|---|---|
| The Role of Video Generation in Enhancing Data-Limited Action Understanding | | 0 |
| Towards Universal Representation for Unseen Action Recognition | | 0 |
| Transductive Universal Transport for Zero-Shot Action Recognition | | 0 |
| Transductive Zero-Shot Action Recognition by Word-Vector Embedding | | 0 |
| Universal Prototype Transport for Zero-Shot Action Recognition and Localization | | 0 |
| VicTR: Video-conditioned Text Representations for Activity Recognition | | 0 |
| VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | | 0 |
| Zero-Shot Action Recognition in Surveillance Videos | | 0 |
| Zero-Shot Action Recognition in Videos: A Survey | | 0 |
| Zero-Shot Action Recognition With Error-Correcting Output Codes | | 0 |
| Multimodal Open-Vocabulary Video Classification via Pre-Trained Vision and Language Models | | 0 |
| Multi-Task Zero-Shot Action Recognition with Prioritised Data Augmentation | | 0 |
| Natural Language Descriptions for Human Activities in Video Streams | | 0 |
| Objects2action: Classifying and localizing actions without any video example | | 0 |
| Reformulating Zero-shot Action Recognition for Multi-label Actions | | 0 |
| REST: REtrieve & Self-Train for generative action recognition | | 0 |
| LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition | Code | 0 |
| Learning a Deep Embedding Model for Zero-Shot Learning | Code | 0 |
| Label-Embedding for Image Classification | Code | 0 |
| Zero-Shot Action Recognition from Diverse Object-Scene Compositions | Code | 0 |
| An embarrassingly simple approach to zero-shot learning | Code | 0 |
| End-to-End Semantic Video Transformer for Zero-Shot Action Recognition | Code | 0 |
| FitCLIP: Refining Large-Scale Pretrained Image-Text Models for Zero-Shot Video Understanding Tasks | Code | 0 |
| Orthogonal Temporal Interpolation for Zero-Shot Video Recognition | Code | 0 |
| InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation | Code | 0 |

Benchmark Results

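| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OTI (ViT-L/14) | Top-1 Accuracy | 92.8 | | Unverified |
| 2 | IMP-MoE-L | Top-1 Accuracy | 91.5 | | Unverified |
| 3 | MOV (ViT-L/14) | Top-1 Accuracy | 87.1 | | Unverified |
| 4 | VideoCoCa | Top-1 Accuracy | 86.6 | | Unverified |
| 5 | BIKE | Top-1 Accuracy | 86.6 | | Unverified |
| 6 | Text4Vis | Top-1 Accuracy | 85.8 | | Unverified |
| 7 | TC-CLIP | Top-1 Accuracy | 85.4 | | Unverified |
| 8 | EVA-CLIP-E/14+ | Top-1 Accuracy | 83.1 | | Unverified |
| 9 | MOV (ViT-B/16) | Top-1 Accuracy | 82.6 | | Unverified |
| 10 | OST | Top-1 Accuracy | 79.7 | | Unverified |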
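
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MOV (ViT-L/14) | Top-1 Accuracy | 64.7 | | Unverified |
| 2 | OTI (ViT-L/14) | Top-1 Accuracy | 64 | | Unverified |
| 3 | BIKE | Top-1 Accuracy | 61.4 | | Unverified |
| 4 | MOV (ViT-B/16) | Top-1 Accuracy | 60.8 | | Unverified |
| 5 | IMP-MoE-L | Top-1 Accuracy | 59.1 | | Unverified |
| 6 | VideoCoCa | Top-1 Accuracy | 58.7 | | Unverified |
| 7 | Text4Vis | Top-1 Accuracy | 58.4 | | Unverified |
| 8 | TC-CLIP | Top-1 Accuracy | 56 | | Unverified |
| 9 | OST | Top-1 Accuracy | 55.9 | | Unverified |
| 10 | MAXI | Top-1 Accuracy | 52.3 | | Unverified |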
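
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TC-CLIP | Top-1 Accuracy | 78.1 | | Unverified |
| 2 | IMP-MoE-L | Top-1 Accuracy | 76.8 | | Unverified |
| 3 | OST | Top-1 Accuracy | 75.1 | | Unverified |
| 4 | MAXI | Top-1 Accuracy | 71.6 | | Unverified |
| 5 | OTI (ViT-L/14) | Top-1 Accuracy | 70.6 | | Unverified |
| 6 | VideoCoCa | Top-1 Accuracy | 70.1 | | Unverified |
| 7 | Text4Vis | Top-1 Accuracy | 68.9 | | Unverified |
| 8 | BIKE | Top-1 Accuracy | 68.5 | | Unverified |
| 9 | X-CLIP | Top-1 Accuracy | 65.2 | | Unverified |
| 10 | LanguageBind | Top-1 Accuracy | 64.1 | | Unverified |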
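
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SPOT | Top-1 Accuracy | 68.7 | | Unverified |
| 2 | CLASTER | Top-1 Accuracy | 68.4 | | Unverified |
| 3 | ER-ZSAR | Top-1 Accuracy | 60.2 | | Unverified |
| 4 | ZSECOC | Top-1 Accuracy | 59.8 | | Unverified |
| 5 | TS-GCN | Top-1 Accuracy | 56.5 | | Unverified |
| 6 | SJE (Attribute) | Top-1 Accuracy | 47.5 | | Unverified |
| 7 | MTE | Top-1 Accuracy | 44.3 | | Unverified |
| 8 | ESZSL | Top-1 Accuracy | 39.6 | | Unverified |
| 9 | SJE (Word Embedding) | Top-1 Accuracy | 28.6 | | Unverified |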
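
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | BIKE | Top-1 Accuracy | 86.2 | | Unverified |
| 2 | Text4Vis | Top-1 Accuracy | 84.6 | | Unverified |
| 3 | LoCATe-GAT | Top-1 Accuracy | 73.8 | | Unverified |
| 4 | ResT | Top-1 Accuracy | 32.5 | | Unverified |
| 5 | E2E | Top-1 Accuracy | 26.6 | | Unverified |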
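
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MSQNet | mAP | 35.59 | | Unverified |
| 2 | VideoCoCa | mAP | 25.8 | | Unverified |
| 3 | MAXI | mAP | 23.8 | | Unverified |
| 4 | CLIP-Hitchhiker (ViT-B/16, 32 frames) | mAP | 21.1 | | Unverified |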

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MSQNet | Accuracy | 75.33 | | Unverified |