SOTAVerified

Video Captioning

Video captioning is the task of automatically generating a caption for a video by understanding the actions and events it contains, which in turn enables efficient text-based retrieval of the video.

Source: NITS-VC System for VATEX Video Captioning Challenge 2020

Papers

Showing 301-350 of 473 papers

Title | Status | Hype
Multimodal Pretraining for Dense Video Captioning | Code | 1
Semi-Supervised Learning for Video Captioning | - | 0
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning | Code | 1
Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications | - | 0
Improved Actor Relation Graph based Group Activity Recognition | Code | 1
TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval | - | 0
Video captioning with stacked attention and semantic hard pull | Code | 0
Video Captioning Using Weak Annotation | - | 0
Hierarchical memory decoder for visual narrating | - | 0
In-Home Daily-Life Captioning Using Radio Signals | - | 0
Poet: Product-oriented Video Captioner for E-commerce | Code | 1
SODA: Story Oriented Dense Video Captioning Evaluation Framework | Code | 1
Learning to Generate Grounded Visual Captions without Localization Supervision | Code | 1
Enriching Video Captions With Contextual Text | Code | 0
Pre-training for Video Captioning Challenge 2020 Summary | - | 0
Active Learning for Video Description With Cluster-Regularized Ensemble Ranking | - | 0
SBAT: Video Captioning with Sparse Boundary-Aware Transformer | - | 0
Learning to Discretely Compose Reasoning Module Networks for Video Captioning | Code | 1
Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation | - | 0
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training | - | 0
SACT: Self-Aware Multi-Space Feature Composition Transformer for Multinomial Attention for Video Captioning | - | 0
Comprehensive Information Integration Modeling Framework for Video Titling | Code | 1
Dense-Captioning Events in Videos: SYSU Submission to ActivityNet Challenge 2020 | Code | 1
Video Moment Localization using Object Evidence and Reverse Captioning | Code | 1
Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning | - | 0
NITS-VC System for VATEX Video Captioning Challenge 2020 | - | 0
Syntax-Aware Action Targeting for Video Captioning | Code | 1
Screencast Tutorial Video Understanding | Code | 0
A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer | Code | 1
Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding | - | 0
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning | Code | 1
A Benchmark for Structured Procedural Knowledge Extraction from Cooking Videos | Code | 1
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | Code | 1
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation | - | 0
Normalized and Geometry-Aware Self-Attention Network for Image Captioning | - | 0
Multi-modal Dense Video Captioning | Code | 1
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning | Code | 1
OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement | - | 0
Hierarchical Memory Decoding for Video Captioning | - | 0
Object Relational Graph with Teacher-Recommended Learning for Video Captioning | - | 0
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation | Code | 1
Spatio-Temporal Ranked-Attention Networks for Video Captioning | - | 0
Delving Deeper into the Decoder for Video Captioning | Code | 1
Vision and Language: from Visual Perception to Content Creation | - | 0
Meaning guided video captioning | Code | 0
Multimodal Machine Translation through Visuals and Speech | - | 0
Non-Autoregressive Coarse-to-Fine Video Captioning | Code | 0
Characterizing the impact of using features extracted from pre-trained models on the quality of video captioning sequence-to-sequence models | - | 0
Empirical Autopsy of Deep Video Captioning Frameworks | - | 0
Multi-attention Networks for Temporal Localization of Video-level Labels | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | mPLUG-2 | CIDEr | 80 | - | Unverified
2 | VAST | CIDEr | 78 | - | Unverified
3 | GIT2 | CIDEr | 75.9 | - | Unverified
4 | VLAB | CIDEr | 74.9 | - | Unverified
5 | COSA | CIDEr | 74.7 | - | Unverified
6 | VALOR | CIDEr | 74 | - | Unverified
7 | MaMMUT (ours) | CIDEr | 73.6 | - | Unverified
8 | VideoCoCa | CIDEr | 73.2 | - | Unverified
9 | RTQ | CIDEr | 69.3 | - | Unverified
10 | HowToCaption | CIDEr | 65.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MaMMUT | CIDEr | 195.6 | - | Unverified
2 | VLAB | CIDEr | 179.8 | - | Unverified
3 | COSA | CIDEr | 178.5 | - | Unverified
4 | VALOR | CIDEr | 178.5 | - | Unverified
5 | mPLUG-2 | CIDEr | 165.8 | - | Unverified
6 | HowToCaption | CIDEr | 154.2 | - | Unverified
7 | HiTeA | CIDEr | 146.9 | - | Unverified
8 | Vid2Seq | CIDEr | 146.2 | - | Unverified
9 | VIOLETv2 | CIDEr | 139.2 | - | Unverified
10 | RTQ | CIDEr | 123.4 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VAST | BLEU-4 | 18.2 | - | Unverified
2 | UniVL + MELTR | BLEU-4 | 17.92 | - | Unverified
3 | UniVL | BLEU-4 | 17.35 | - | Unverified
4 | VideoCoCa | BLEU-4 | 14.2 | - | Unverified
5 | VLM | BLEU-4 | 12.27 | - | Unverified
6 | E2vidD6-MASSvid-BiD | BLEU-4 | 12.04 | - | Unverified
7 | TextKG | BLEU-4 | 11.7 | - | Unverified
8 | COOT | BLEU-4 | 11.3 | - | Unverified
9 | COSA | BLEU-4 | 10.1 | - | Unverified
10 | HowToCaption | BLEU-4 | 8.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VALOR | BLEU-4 | 45.6 | - | Unverified
2 | VAST | BLEU-4 | 45 | - | Unverified
3 | COSA | BLEU-4 | 43.7 | - | Unverified
4 | VideoCoCa | BLEU-4 | 39.7 | - | Unverified
5 | IcoCap (ViT-B/16) | BLEU-4 | 37.4 | - | Unverified
6 | IcoCap (ViT-B/32) | BLEU-4 | 36.9 | - | Unverified
7 | VASTA (Kinetics-backbone) | BLEU-4 | 36.25 | - | Unverified
8 | CoCap (ViT/L14) | BLEU-4 | 35.8 | - | Unverified
9 | ORG-TRL | BLEU-4 | 32.1 | - | Unverified
10 | NITS-VC | BLEU-4 | 20 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VideoCoCa | BLEU4 | 14.7 | - | Unverified
2 | VLTinT (ae-test split) C3D/Ling | BLEU4 | 14.5 | - | Unverified
3 | VLCap (ae-test split) - Appearance + Language | BLEU4 | 13.38 | - | Unverified
4 | COOT (ae-test split) - Only Appearance features | BLEU4 | 10.85 | - | Unverified
5 | MART (ae-test split) - Appearance + Flow | BLEU4 | 10.33 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CEN | CIDEr | 49.87 | - | Unverified
2 | GIT | CIDEr | 32.43 | - | Unverified
3 | SEM-POS | CIDEr | 26.01 | - | Unverified
4 | AKGNN | CIDEr | 25.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CEN | CIDEr | 63.51 | - | Unverified
2 | GIT | CIDEr | 45.63 | - | Unverified
3 | SEM-POS | CIDEr | 37.16 | - | Unverified
4 | AKGNN | CIDEr | 35.08 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SBD_Keyframe | BLEU4 | 41.01 | - | Unverified
2 | V+S-Att-based | BLEU4 | 36.2 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VAST | BLEU-4 | 19.9 | - | Unverified
2 | COSA | BLEU-4 | 18.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GVT | BLEU4 | 17.7 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | VNS-GRU (Cross-Lingual) | BLEU-4 | 58.68 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Shot2Story | CIDEr | 37.4 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Vid2Seq | CIDEr | 120.5 | - | Unverified