SOTAVerified

Video Captioning

Video Captioning is the task of automatically captioning a video by understanding the actions and events in it, which can help retrieve the video efficiently through text.

Source: NITS-VC System for VATEX Video Captioning Challenge 2020

Papers

Showing 251–300 of 473 papers

| Title | Status | Hype |
| --- | --- | --- |
| Motion Guided Region Message Passing for Video Captioning | | 0 |
| Move Forward and Tell: A Progressive Generator of Video Descriptions | | 0 |
| Evaluation of Automatic Video Captioning Using Direct Assessment | | 0 |
| Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization | | 0 |
| End-to-end Generative Pretraining for Multimodal Video Captioning | | 0 |
| A Review of Deep Learning for Video Captioning | | 0 |
| MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish | | 0 |
| A Restricted Visual Turing Test for Deep Scene and Event Understanding | | 0 |
| End-to-end Dense Video Captioning as Sequence Generation | | 0 |
| Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos | | 0 |
| End-to-end Dense Video Captioning as Sequence Generation | | 0 |
| End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering | | 0 |
| Multi-modal Dependency Tree for Video Captioning | | 0 |
| Multi-Modal interpretable automatic video captioning | | 0 |
| Multimodal Machine Learning: Integrating Language, Vision and Speech | | 0 |
| Multimodal Machine Translation through Visuals and Speech | | 0 |
| Multimodal Memory Modelling for Video Captioning | | 0 |
| Encoder-Decoder Based Long Short-Term Memory (LSTM) Model for Video Captioning | | 0 |
| Multimodal Semantic Attention Network for Video Captioning | | 0 |
| Multi-Task Video Captioning with Video and Entailment Generation | | 0 |
| Vatex Video Captioning Challenge 2020: Multi-View Features and Hybrid Reward Strategies for Video Captioning | | 0 |
| MUTT: Metric Unit TesTing for Language Generation Tasks | | 0 |
| VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | | 0 |
| Empirical Autopsy of Deep Video Captioning Frameworks | | 0 |
| NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative | | 0 |
| Nepali Video Captioning using CNN-RNN Architecture | | 0 |
| E-MMAD: Multimodal Advertising Caption Generation Based on Structured Information | | 0 |
| NITS-VC System for VATEX Video Captioning Challenge 2020 | | 0 |
| NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning | | 0 |
| A Recipe for Scaling up Text-to-Video Generation with Text-free Videos | | 0 |
| Normalized and Geometry-Aware Self-Attention Network for Image Captioning | | 0 |
| Not All Words are Equal: Video-specific Information Loss for Video Captioning | | 0 |
| O2NA: An Object-Oriented Non-Autoregressive Approach for Controllable Video Captioning | | 0 |
| Object-aware Aggregation with Bidirectional Temporal Graph for Video Captioning | | 0 |
| OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement | | 0 |
| Object Relational Graph with Teacher-Recommended Learning for Video Captioning | | 0 |
| DVCFlow: Modeling Information Flow Towards Human-like Video Captioning | | 0 |
| Vision and Language: from Visual Perception to Content Creation | | 0 |
| Dual-Level Decoupled Transformer for Video Captioning | | 0 |
| OmniVL:One Foundation Model for Image-Language and Video-Language Tasks | | 0 |
| On Flow Profile Image for Video Representation | | 0 |
| On Scaling Up a Multilingual Vision and Language Model | | 0 |
| Open-book Video Captioning with Retrieve-Copy-Generate Network | | 0 |
| Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers | | 0 |
| An Integrated Approach for Video Captioning and Applications | | 0 |
| Visual-aware Attention Dual-stream Decoder for Video Captioning | | 0 |
| Diverse Video Captioning Through Latent Variable Expansion | | 0 |
| Discourse Analysis for Evaluating Coherence in Video Paragraph Captions | | 0 |
| Directed Domain Fine-Tuning: Tailoring Separate Modalities for Specific Training Tasks | | 0 |
| PIC 4th Challenge: Semantic-Assisted Multi-Feature Encoding and Multi-Head Decoding for Dense Video Captioning | | 0 |

Benchmark Results

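| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | mPLUG-2 | CIDEr | 80 | | Unverified |
| 2 | VAST | CIDEr | 78 | | Unverified |
| 3 | GIT2 | CIDEr | 75.9 | | Unverified |
| 4 | VLAB | CIDEr | 74.9 | | Unverified |
| 5 | COSA | CIDEr | 74.7 | | Unverified |
| 6 | VALOR | CIDEr | 74 | | Unverified |
| 7 | MaMMUT (ours) | CIDEr | 73.6 | | Unverified |
| 8 | VideoCoCa | CIDEr | 73.2 | | Unverified |
| 9 | RTQ | CIDEr | 69.3 | | Unverified |
| 10 | HowToCaption | CIDEr | 65.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MaMMUT | CIDEr | 195.6 | | Unverified |
| 2 | VLAB | CIDEr | 179.8 | | Unverified |
| 3 | COSA | CIDEr | 178.5 | | Unverified |
| 4 | VALOR | CIDEr | 178.5 | | Unverified |
| 5 | mPLUG-2 | CIDEr | 165.8 | | Unverified |
| 6 | HowToCaption | CIDEr | 154.2 | | Unverified |
| 7 | HiTeA | CIDEr | 146.9 | | Unverified |
| 8 | Vid2Seq | CIDEr | 146.2 | | Unverified |
| 9 | VIOLETv2 | CIDEr | 139.2 | | Unverified |
| 10 | RTQ | CIDEr | 123.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VAST | BLEU-4 | 18.2 | | Unverified |
| 2 | UniVL + MELTR | BLEU-4 | 17.92 | | Unverified |
| 3 | UniVL | BLEU-4 | 17.35 | | Unverified |
| 4 | VideoCoCa | BLEU-4 | 14.2 | | Unverified |
| 5 | VLM | BLEU-4 | 12.27 | | Unverified |
| 6 | E2vidD6-MASSvid-BiD | BLEU-4 | 12.04 | | Unverified |
| 7 | TextKG | BLEU-4 | 11.7 | | Unverified |
| 8 | COOT | BLEU-4 | 11.3 | | Unverified |
| 9 | COSA | BLEU-4 | 10.1 | | Unverified |
| 10 | HowToCaption | BLEU-4 | 8.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VALOR | BLEU-4 | 45.6 | | Unverified |
| 2 | VAST | BLEU-4 | 45 | | Unverified |
| 3 | COSA | BLEU-4 | 43.7 | | Unverified |
| 4 | VideoCoCa | BLEU-4 | 39.7 | | Unverified |
| 5 | IcoCap (ViT-B/16) | BLEU-4 | 37.4 | | Unverified |
| 6 | IcoCap (ViT-B/32) | BLEU-4 | 36.9 | | Unverified |
| 7 | VASTA (Kinetics-backbone) | BLEU-4 | 36.25 | | Unverified |
| 8 | CoCap (ViT/L14) | BLEU-4 | 35.8 | | Unverified |
| 9 | ORG-TRL | BLEU-4 | 32.1 | | Unverified |
| 10 | NITS-VC | BLEU-4 | 20 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VideoCoCa | BLEU4 | 14.7 | | Unverified |
| 2 | VLTinT (ae-test split) C3D/Ling | BLEU4 | 14.5 | | Unverified |
| 3 | VLCap (ae-test split) - Appearance + Language | BLEU4 | 13.38 | | Unverified |
| 4 | COOT (ae-test split) - Only Appearance features | BLEU4 | 10.85 | | Unverified |
| 5 | MART (ae-test split) - Appearance + Flow | BLEU4 | 10.33 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CEN | CIDEr | 49.87 | | Unverified |
| 2 | GIT | CIDEr | 32.43 | | Unverified |
| 3 | SEM-POS | CIDEr | 26.01 | | Unverified |
| 4 | AKGNN | CIDEr | 25.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CEN | CIDEr | 63.51 | | Unverified |
| 2 | GIT | CIDEr | 45.63 | | Unverified |
| 3 | SEM-POS | CIDEr | 37.16 | | Unverified |
| 4 | AKGNN | CIDEr | 35.08 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SBD_Keyframe | BLEU4 | 41.01 | | Unverified |
| 2 | V+S-Att-based | BLEU4 | 36.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VAST | BLEU-4 | 19.9 | | Unverified |
| 2 | COSA | BLEU-4 | 18.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GVT | BLEU4 | 17.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VNS-GRU (Cross-Lingual) | BLEU-4 | 58.68 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Shot2Story | CIDEr | 37.4 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Vid2Seq | CIDEr | 120.5 | | Unverified |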