| PerLA: Perceptive 3D Language Assistant | Nov 29, 2024 | Dense Captioning, Graph Neural Network | Code Available | 1 |
| 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds | Jan 1, 2022 | 3D dense captioning, Attribute | Unverified | 0 |
| 3D Scene Graph Guided Vision-Language Pre-training | Nov 27, 2024 | 3D dense captioning, 3D visual grounding | Unverified | 0 |
| 3D Spatial Understanding in MLLMs: Disambiguation and Evaluation | Dec 9, 2024 | 3D dense captioning, 3D visual grounding | Unverified | 0 |
| A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes | Mar 12, 2024 | 3D dense captioning, Dense Captioning | Unverified | 0 |
| Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos | Jul 11, 2019 | Dense Captioning, Dense Video Captioning | Unverified | 0 |
| Best Vision Technologies Submission to ActivityNet Challenge 2018-Task: Dense-Captioning Events in Videos | Jun 25, 2018 | Dense Captioning, Optical Flow Estimation | Unverified | 0 |
| Bi-directional Contextual Attention for 3D Dense Captioning | Aug 13, 2024 | 3D dense captioning, Attribute | Unverified | 0 |
| CapDet: Unifying Dense Captioning and Open-World Detection Pretraining | Mar 4, 2023 | Dense Captioning | Unverified | 0 |
| CapOnImage: Context-driven Dense-Captioning on Image | Apr 27, 2022 | Dense Captioning, Diversity | Unverified | 0 |
| Complete 3d relationships extraction modality alignment network for 3d dense captioning | Aug 1, 2024 | 3D dense captioning, 3D Object Detection | Unverified | 0 |
| Context and Attribute Grounded Dense Captioning | Apr 2, 2019 | Attribute, Dense Captioning | Unverified | 0 |
| Contextual Modeling for 3D Dense Captioning on Point Clouds | Oct 8, 2022 | 3D dense captioning, Dense Captioning | Unverified | 0 |
| D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | Dec 2, 2021 | 3D dense captioning, 3D visual grounding | Unverified | 0 |
| Dense Procedure Captioning in Narrated Instructional Videos | Jul 1, 2019 | Dense Captioning | Unverified | 0 |
| Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs | Feb 10, 2022 | Dense Captioning, Image Captioning | Unverified | 0 |
| DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection | Apr 14, 2024 | Dense Captioning, Language Modelling | Unverified | 0 |
| Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | Jun 5, 2025 | cross-modal alignment, Dense Captioning | Unverified | 0 |
| Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | Mar 19, 2024 | Dense Captioning, Image Captioning | Unverified | 0 |
| FlexCap: Describe Anything in Images in Controllable Detail | Mar 18, 2024 | Attribute, Dense Captioning | Unverified | 0 |
| Fooling Vision and Language Models Despite Localization and Attention Mechanism | Sep 25, 2017 | Dense Captioning, Natural Language Understanding | Unverified | 0 |
| Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions | Jul 9, 2024 | Dense Captioning, object-detection | Unverified | 0 |
| Hint-AD: Holistically Aligned Interpretability in End-to-End Autonomous Driving | Sep 10, 2024 | 3D dense captioning, Autonomous Driving | Unverified | 0 |
| Improving Diversity and Reducing Redundancy in Paragraph Captions | Jul 19, 2020 | Decoder, Dense Captioning | Unverified | 0 |
| See It All: Contextualized Late Aggregation for 3D Dense Captioning | Aug 14, 2024 | 3D dense captioning, All | Unverified | 0 |