
OPCap: Object-aware Prompting Captioning

2024-11-27

Feiyang Huang


Abstract

In image captioning, object bias (or hallucination) refers to captions that describe objects that are missing from or nonexistent in the image. To mitigate this issue, we propose OPCap, a target-aware prompting strategy. The method first uses an object detector to extract object labels and their spatial information from the image. An attribute predictor then refines the semantic features of the detected objects. The refined features are integrated and fed into the decoder, improving the model's understanding of the image context. Experiments on the COCO and nocaps datasets show that OPCap effectively mitigates hallucination and significantly improves the quality of generated captions.
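
The pipeline described in the abstract (detect objects, attach attributes, pass the combined information to the decoder) can be illustrated with a minimal sketch. The abstract does not say how the refined features are integrated, so this example assumes, purely for illustration, that they are serialized into a text prompt. The names `Detection`, `coarse_position`, and `build_object_prompt`, the score threshold, and the left/center/right position buckets are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical container, not the paper's data structure)."""
    label: str
    # Box as (x_min, y_min, x_max, y_max), assumed normalized to [0, 1].
    box: Tuple[float, float, float, float]
    score: float
    # Attribute words, as an attribute predictor might supply them.
    attributes: List[str] = field(default_factory=list)


def coarse_position(box: Tuple[float, float, float, float]) -> str:
    """Map the horizontal box center to a coarse position word (an assumption)."""
    x_center = (box[0] + box[2]) / 2
    if x_center < 1 / 3:
        return "left"
    if x_center < 2 / 3:
        return "center"
    return "right"


def build_object_prompt(detections: List[Detection], min_score: float = 0.5) -> str:
    """Serialize confident detections into a text prompt for a caption decoder."""
    # Drop low-confidence detections so uncertain objects are not suggested to the decoder.
    kept = [d for d in detections if d.score >= min_score]
    # List the most confident objects first.
    kept.sort(key=lambda d: d.score, reverse=True)
    parts = []
    for d in kept:
        description = " ".join(d.attributes + [d.label])
        parts.append(f"{description} ({coarse_position(d.box)})")
    if not parts:
        return "Objects: none."
    return "Objects: " + "; ".join(parts) + "."


detections = [
    Detection("dog", (0.1, 0.2, 0.3, 0.8), 0.9, ["brown"]),
    Detection("frisbee", (0.7, 0.1, 0.9, 0.3), 0.8, ["red"]),
    Detection("cat", (0.4, 0.4, 0.6, 0.6), 0.2),  # below threshold, dropped
]
print(build_object_prompt(detections))
```

In this sketch the low-confidence detection is filtered out before the prompt is built, which reflects the general idea of conditioning the decoder only on objects the detector actually found. A real system may instead pass learned feature embeddings to the decoder rather than text.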
