SOTAVerified

Dense Captioning

Papers

Showing 61–69 of 69 papers

| Title | Status | Hype |
| --- | --- | --- |
| Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos | | 0 |
| A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes | | 0 |
| Trimmed Action Recognition, Dense-Captioning Events in Videos, and Spatio-temporal Action Localization with Focus on ActivityNet Challenge 2019 | | 0 |
| UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding | | 0 |
| Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation | | 0 |
| Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | | 0 |
| Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | | 0 |
| FlexCap: Describe Anything in Images in Controllable Detail | | 0 |
| Fooling Vision and Language Models Despite Localization and Attention Mechanism | | 0 |

Benchmark Results

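| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ControlCap | mAP | 18.2 | | Unverified |
| 2 | GRiT (ViT-B) | mAP | 15.5 | | Unverified |
| 3 | CAG-Net | mAP | 10.5 | | Unverified |
| 4 | FCLN | mAP | 5.4 | | Unverified |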