SOTAVerified

Dense Captioning

Papers

Showing 51-69 of 69 papers

| Title | Status | Hype |
|---|---|---|
| RUC+CMU: System Report for Dense Captioning Events in Videos | | 0 |
| SAVCHOI: Detecting Suspicious Activities using Dense Video Captioning with Human Object Interactions | | 0 |
| Scan2Cap: Context-aware Dense Captioning in RGB-D Scans | | 0 |
| Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning | | 0 |
| See It All: Contextualized Late Aggregation for 3D Dense Captioning | | 0 |
| Semantic-Aware Pretraining for Dense Video Captioning | | 0 |
| Bi-directional Contextual Attention for 3D Dense Captioning | | 0 |
| Best Vision Technologies Submission to ActivityNet Challenge 2018-Task: Dense-Captioning Events in Videos | | 0 |
| Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning | | 0 |
| 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds | | 0 |
| Activitynet 2019 Task 3: Exploring Contexts for Dense Captioning Events in Videos | | 0 |
| A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes | | 0 |
| Trimmed Action Recognition, Dense-Captioning Events in Videos, and Spatio-temporal Action Localization with Focus on ActivityNet Challenge 2019 | | 0 |
| UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding | | 0 |
| Visually Grounded Word Embeddings and Richer Visual Features for Improving Multimodal Neural Machine Translation | | 0 |
| Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | | 0 |
| Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | | 0 |
| FlexCap: Describe Anything in Images in Controllable Detail | | 0 |
| Fooling Vision and Language Models Despite Localization and Attention Mechanism | | 0 |

Benchmark Results

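| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ControlCap | mAP | 18.2 | | Unverified |
| 2 | GRiT (ViT-B) | mAP | 15.5 | | Unverified |
| 3 | CAG-Net | mAP | 10.5 | | Unverified |
| 4 | FCLN | mAP | 5.4 | | Unverified |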