Mindstorms in Natural Language-Based Societies of Mind | May 26, 2023 | 3D Generation, Image Captioning | Unverified | 0
HAAV: Hierarchical Aggregation of Augmented Views for Image Captioning | May 25, 2023 | Caption Generation, Decoder | Unverified | 0
Exploring Diverse In-Context Configurations for Image Captioning | May 24, 2023 | Image Captioning, In-Context Learning | Code Available | 1
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought | May 24, 2023 | Image Captioning, Language Modelling | Unverified | 0
An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics | May 24, 2023 | Image Captioning, Negation | Code Available | 0
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models | May 24, 2023 | document understanding, Image Captioning | Code Available | 1
Gender Biases in Automatic Evaluation Metrics for Image Captioning | May 24, 2023 | Fairness, Image Captioning | Code Available | 0
Exploring Affordance and Situated Meaning in Image Captions: A Multimodal Analysis | May 24, 2023 | Image Captioning, Natural Language Understanding | Unverified | 0
Alt-Text with Context: Improving Accessibility for Images on Twitter | May 24, 2023 | Descriptive, Image Captioning | Unverified | 0
Text encoders bottleneck compositionality in contrastive vision-language models | May 24, 2023 | Attribute, Image Captioning | Code Available | 1
PIC-XAI: Post-hoc Image Captioning Explanation using Segmentation | May 23, 2023 | Explainable artificial intelligence, Explainable Artificial Intelligence (XAI) | Code Available | 0
MemeCap: A Dataset for Captioning and Interpreting Memes | May 23, 2023 | Image Captioning, Meme Captioning | Code Available | 1
Text-based Person Search without Parallel Image-Text Data | May 22, 2023 | Image Captioning, Language Modeling | Unverified | 0
What Makes for Good Visual Tokenizers for Large Language Models? | May 20, 2023 | Image Captioning, Object Counting | Code Available | 1
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training | May 20, 2023 | Image Captioning, Sentence | Code Available | 0
DiffCap: Exploring Continuous Diffusion on Image Captioning | May 20, 2023 | Caption Generation, Diversity | Unverified | 0
Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment | May 20, 2023 | Image Captioning, Translation | Unverified | 0
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner | May 19, 2023 | Dense Captioning, Image Captioning | Code Available | 1
Brain Captioning: Decoding human brain activity into images and text | May 19, 2023 | Brain Decoding, Depth Estimation | Code Available | 1
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | May 15, 2023 | 3D Object Detection, Image Captioning | Code Available | 1
Semantic Composition in Visually Grounded Language Models | May 15, 2023 | Image Captioning, Inductive Bias | Unverified | 0
IMAGINATOR: Pre-Trained Image+Text Joint Embeddings using Word-Level Grounding of Images | May 12, 2023 | Hyperparameter Optimization, Image Captioning | Code Available | 0
Simple Token-Level Confidence Improves Caption Correctness | May 11, 2023 | Hallucination, Image Captioning | Unverified | 0
Towards L-System Captioning for Tree Reconstruction | May 10, 2023 | Image Captioning | Unverified | 0
InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation | May 10, 2023 | Benchmarking, Image Captioning | Code Available | 1
WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset | May 9, 2023 | Articles, Image Captioning | Code Available | 3
Vision-Language Models in Remote Sensing: Current Progress and Future Trends | May 9, 2023 | Image Captioning, Image Generation | Code Available | 1
Exploiting Pseudo Image Captions for Multimodal Summarization | May 9, 2023 | Common Sense Reasoning, Contrastive Learning | Unverified | 0
UIT-OpenViIC: A Novel Benchmark for Evaluating Image Captioning in Vietnamese | May 7, 2023 | Image Captioning, Vietnamese Image Captioning | Unverified | 0
The Role of Data Curation in Image Captioning | May 5, 2023 | Few-Shot Learning, Image Captioning | Code Available | 0
A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding | May 5, 2023 | Articles, Image Captioning | Code Available | 0
Image Captioners Sometimes Tell More Than Images They See | May 4, 2023 | Descriptive, Image Captioning | Unverified | 0
Caption Anything: Interactive Image Description with Diverse Multimodal Controls | May 4, 2023 | controllable image captioning, Image Captioning | Code Available | 3
Making the Most of What You Have: Adapting Pre-trained Visual Language Models in the Low-data Regime | May 3, 2023 | Image Captioning, Question Answering | Unverified | 0
Multimodal Data Augmentation for Image Captioning using Diffusion Models | May 3, 2023 | Data Augmentation, Image Captioning | Code Available | 0
Fairness in AI Systems: Mitigating gender bias from language-vision models | May 3, 2023 | Fairness, Image Captioning | Unverified | 0
Transforming Visual Scene Graphs to Image Captions | May 3, 2023 | Attribute, Decoder | Code Available | 1
Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment | Apr 28, 2023 | Data Augmentation, Image Captioning | Unverified | 0
Learning Human-Human Interactions in Images from Weak Textual Supervision | Apr 27, 2023 | Human-Human Interaction Recognition, Image Captioning | Unverified | 0
From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping | Apr 26, 2023 | Decoder, Image Captioning | Code Available | 1
TTIDA: Controllable Generative Data Augmentation via Text-to-Text and Text-to-Image Models | Apr 18, 2023 | Data Augmentation, Diversity | Code Available | 0
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset | Apr 17, 2023 | Audio captioning, Audio-Video Question Answering (AVQA) | Code Available | 2
A-CAP: Anticipation Captioning with Commonsense Knowledge | Apr 13, 2023 | Image Captioning, Language Modeling | Unverified | 0
Advancing Medical Imaging with Language Models: A Journey from N-grams to ChatGPT | Apr 11, 2023 | Diagnostic, Image Captioning | Unverified | 0
Boosting Cross-task Transferability of Adversarial Patches with Visual Relations | Apr 11, 2023 | Image Captioning, Object Recognition | Unverified | 0
ImageCaptioner^2: Image Captioner for Image Captioning Bias Amplification Assessment | Apr 10, 2023 | Image Captioning | Unverified | 0
Model-Agnostic Gender Debiased Image Captioning | Apr 7, 2023 | Image Captioning, model | Code Available | 0
Uncurated Image-Text Datasets: Shedding Light on Demographic Bias | Apr 6, 2023 | Image Captioning, Image Generation | Code Available | 1
Towards Self-Explainability of Deep Neural Networks with Heatmap Captioning and Large-Language Models | Apr 5, 2023 | Explainable Artificial Intelligence (XAI), Image Captioning | Unverified | 0
Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data | Apr 4, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0