SOTAVerified

Dense Captioning

Papers

Showing 21-30 of 69 papers

Title | Status | Hype
MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes | Code | 1
X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning | Code | 1
Integrating Visuospatial, Linguistic, and Commonsense Structure into Story Visualization | Code | 1
Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization | Code | 1
Dense-Captioning Events in Videos: SYSU Submission to ActivityNet Challenge 2020 | Code | 1
Dense-Captioning Events in Videos | Code | 1
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | | 0
3D Spatial Understanding in MLLMs: Disambiguation and Evaluation | | 0
3D Scene Graph Guided Vision-Language Pre-training | | 0
Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ControlCap | mAP | 18.2 | | Unverified
2 | GRiT (ViT-B) | mAP | 15.5 | | Unverified
3 | CAG-Net | mAP | 10.5 | | Unverified
4 | FCLN | mAP | 5.4 | | Unverified