SOTAVerified

Dense Captioning

Papers

Showing 41–50 of 69 papers

Title | Status | Hype
Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs | | 0
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection | | 0
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | | 0
Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | | 0
FlexCap: Describe Anything in Images in Controllable Detail | | 0
Fooling Vision and Language Models Despite Localization and Attention Mechanism | | 0
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions | | 0
Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving | | 0
Improving Diversity and Reducing Redundancy in Paragraph Captions | | 0
See It All: Contextualized Late Aggregation for 3D Dense Captioning | | 0
Page 5 of 7

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ControlCap | mAP | 18.2 | | Unverified
2 | GRiT (ViT-B) | mAP | 15.5 | | Unverified
3 | CAG-Net | mAP | 10.5 | | Unverified
4 | FCLN | mAP | 5.4 | | Unverified