| CapText: Large Language Model-based Caption Generation From Image Context and Description | Jun 1, 2023 | Caption Generation, Image to text | Unverified | 0 |
| RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment | May 31, 2023 | Caption Generation, Language Modelling | Unverified | 0 |
| HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning | May 25, 2023 | Caption Generation, Decoder | Unverified | 0 |
| DiffCap: Exploring Continuous Diffusion on Image Captioning | May 20, 2023 | Caption Generation, Diversity | Unverified | 0 |
| Efficient Audio Captioning Transformer with Patchout and Text Guidance | Apr 6, 2023 | Audio captioning, Caption Generation | Unverified | 0 |
| Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models | Apr 5, 2023 | Caption Generation, Image Generation | Unverified | 0 |
| Multi-modal reward for visual relationships-based image captioning | Mar 19, 2023 | Caption Generation, Deep Reinforcement Learning | Unverified | 0 |
| GNNFormer: A Graph-based Framework for Cytopathology Report Generation | Mar 17, 2023 | Caption Generation, Graph Neural Network | Unverified | 0 |
| Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization | Feb 23, 2023 | Abstractive Text Summarization, Caption Generation | Code Available | 0 |
| Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning | Feb 8, 2023 | Caption Generation, Decoder | Unverified | 0 |