SOTAVerified

Dense Captioning

Papers

Showing 21–30 of 69 papers

Title | Status | Hype
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes | Code | 2
Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | | 0
FlexCap: Describe Anything in Images in Controllable Detail | | 0
Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning | | 0
A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes | | 0
ControlCap: Controllable Region-level Captioning | Code | 2
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning | Code | 3
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Code | 2
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning | Code | 2
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ControlCap | mAP | 18.2 | | Unverified
2 | GRiT (ViT-B) | mAP | 15.5 | | Unverified
3 | CAG-Net | mAP | 10.5 | | Unverified
4 | FCLN | mAP | 5.4 | | Unverified