SOTAVerified

Image Paragraph Captioning

Image paragraph captioning involves generating a detailed, multi-sentence description of the content of an image.

Papers

Showing 1–17 of 17 papers

| Title | Status | Hype |
| --- | --- | --- |
| VLIS: Unimodal Language Models Guide Multimodal Language Generation | Code | 1 |
| Bypass Network for Semantics Driven Image Paragraph Captioning | | 0 |
| Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation | | 0 |
| Diverse and Coherent Paragraph Generation from Images | | 0 |
| Dual-CNN: A Convolutional language decoder for paragraph image captioning | | 0 |
| Enhancing image captioning with depth information using a Transformer-based framework | | 0 |
| Hierarchical Scene Graph Encoder-Decoder for Image Paragraph Captioning | | 0 |
| Improving Diversity and Reducing Redundancy in Paragraph Captions | | 0 |
| Interactive Key-Value Memory-augmented Attention for Image Paragraph Captioning | | 0 |
| Look Deeper See Richer: Depth-aware Image Paragraph Captioning | | 0 |
| When an Image Tells a Story: The Role of Visual and Semantic Information for Generating Paragraph Descriptions | | 0 |
| Recurrent Topic-Transition GAN for Visual Paragraph Generation | | 0 |
| Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning | | 0 |
| Training for Diversity in Image Paragraph Captioning | Code | 0 |
| A Hierarchical Approach for Generating Descriptive Image Paragraphs | Code | 0 |
| Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning | Code | 0 |
| Context-Aware Visual Policy Network for Fine-Grained Image Captioning | Code | 0 |

Benchmark Results

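| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | HSGED(SLL) | BLEU-4 | 11.26 | | Unverified |
| 2 | SCST training, w/ rep. penalty | BLEU-4 | 10.58 | | Unverified |
| 3 | IMAP | BLEU-4 | 10.29 | | Unverified |
| 4 | CAE-LSTM | BLEU-4 | 9.67 | | Unverified |
| 5 | Diverse and Coherent Paragraph Generation from Images | BLEU-4 | 9.43 | | Unverified |
| 6 | RTT-GAN (Semi + Fully) | BLEU-4 | 9.21 | | Unverified |
| 7 | Regions-Hierarchical (ours) | BLEU-4 | 8.69 | | Unverified |
| 8 | Dual-CNN | BLEU-4 | 8.6 | | Unverified |
| 9 | Depth-aware Attention Model (DAM) | BLEU-4 | 6.7 | | Unverified |
| 10 | IMG+LNG | BLEU-4 | 4.67 | | Unverified |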