LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrival Mar 16, 2024 Caption Generation Image-text Retrieval
— Unverified 0 Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval Mar 15, 2024 AudioCaps Contrastive Learning
— Unverified 0 CLIP the Bias: How Useful is Balancing Data in Multimodal Learning? Mar 7, 2024 Image to text Image-to-Text Retrieval
— Unverified 0 Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval Feb 26, 2024 Retrieval Text Retrieval
— Unverified 0 Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings Feb 26, 2024 Contrastive Learning Multi-Task Learning
— Unverified 0 MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning Feb 21, 2024 Retrieval Text Generation
— Unverified 0 PIRB: A Comprehensive Benchmark of Polish Dense and Hybrid Text Retrieval Methods Feb 20, 2024 Information Retrieval Knowledge Distillation
— Unverified 0 Multimodal Learned Sparse Retrieval for Image Suggestion Feb 12, 2024 Image Captioning Retrieval
— Unverified 0 Video Editing for Video Retrieval Feb 4, 2024 Retrieval Text Retrieval
— Unverified 0 Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning Jan 30, 2024 Diversity Image-text Retrieval
Code Available 0 Enhancing Image-Text Matching with Adaptive Feature Aggregation Jan 18, 2024 Image-text matching Image-text Retrieval
Code Available 0 SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment Jan 4, 2024 Image Captioning image-classification
— Unverified 0 BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving Jan 2, 2024 Autonomous Driving Caption Generation
— Unverified 0 Building Vision-Language Models on Solid Foundations with Masked Distillation Jan 1, 2024 Contrastive Learning Knowledge Distillation
— Unverified 0 OTE: Exploring Accurate Scene Text Recognition Using One Token Jan 1, 2024 Decoder Scene Text Recognition
Code Available 0 Accept the Modality Gap: An Exploration in the Hyperbolic Space Jan 1, 2024 Image to text Image-to-Text Retrieval
— Unverified 0 Filter & Align: Leveraging Human Knowledge to Curate Image-Text Data Dec 11, 2023 Image Captioning Image-text Retrieval
— Unverified 0 Leveraging Generative Language Models for Weakly Supervised Sentence Component Analysis in Video-Language Joint Learning Dec 10, 2023 Language Modeling Language Modelling
— Unverified 0 PEFA: Parameter-Free Adapters for Large-scale Embedding-based Retrieval Models Dec 5, 2023 Retrieval Text Retrieval
— Unverified 0 LightCLIP: Learning Multi-Level Interaction for Lightweight Vision-Language Models Dec 1, 2023 image-classification Image Classification
— Unverified 0 IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers Nov 27, 2023 Caption Generation Image-text Retrieval
— Unverified 0 Invisible Relevance Bias: Text-Image Retrieval Models Prefer AI-Generated Images Nov 23, 2023 Cross-Modal Retrieval Image Retrieval
Code Available 0 Towards Robust Text Retrieval with Progressive Learning Nov 20, 2023 Machine Reading Comprehension Question Answering
Code Available 0 Text Retrieval with Multi-Stage Re-Ranking Models Nov 14, 2023 Language Modeling Language Modelling
Code Available 0 Noisy Pair Corrector for Dense Retrieval Nov 7, 2023 Code Search Retrieval
— Unverified 0 A New Fine-grained Alignment Method for Image-text Matching Nov 3, 2023 Image-text matching Image-text Retrieval
— Unverified 0 FLAP: Fast Language-Audio Pre-training Nov 2, 2023 AudioCaps Contrastive Learning
— Unverified 0 Harvest Video Foundation Models via Efficient Post-Pretraining Oct 30, 2023 Question Answering Text Retrieval
— Unverified 0 MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval Oct 30, 2023 cross-modal alignment Image-text Retrieval
— Unverified 0 End-to-End Autoregressive Retrieval via Bootstrapping for Smart Reply Systems Oct 29, 2023 Diversity Retrieval
— Unverified 0 SILC: Improving Vision Language Pretraining with Self-Distillation Oct 20, 2023 Classification Contrastive Learning
— Unverified 0 Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning Oct 12, 2023 Image Captioning Image-text Retrieval
— Unverified 0 On Using GUI Interaction Data to Improve Text Retrieval-based Bug Localization Oct 12, 2023 Information Retrieval Retrieval
Code Available 0 Direction-Oriented Visual-semantic Embedding Model for Remote Sensing Image-text Retrieval Oct 12, 2023 Cross-Modal Retrieval Image-text Retrieval
— Unverified 0 Policy-Gradient Training of Language Models for Ranking Oct 6, 2023 Decision Making Domain Generalization
— Unverified 0 Constructing Image-Text Pair Dataset from Books Oct 3, 2023 Image-text Retrieval Optical Character Recognition (OCR)
— Unverified 0 Implicit Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis Sep 21, 2023 Cross-Modal Retrieval Image Captioning
Code Available 0 Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval Sep 21, 2023 Domain Adaptation Retrieval
— Unverified 0 Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval Sep 21, 2023 Domain Adaptation Retrieval
— Unverified 0 Enhancing Open-Domain Table Question Answering via Syntax- and Structure-aware Dense Retrieval Sep 19, 2023 Question Answering Retrieval
Code Available 0 Dynamic Visual Semantic Sub-Embeddings and Fast Re-Ranking Sep 15, 2023 Image-text matching Re-Ranking
— Unverified 0 Dual Relation Alignment for Composed Image Retrieval Sep 5, 2023 Image Retrieval Image-text Retrieval
— Unverified 0 MultiWay-Adapater: Adapting large-scale multi-modal models for scalable image-text retrieval Sep 4, 2023 Image-text Retrieval Retrieval
Code Available 0 Contrastive Feature Masking Open-Vocabulary Vision Transformer Sep 2, 2023 Contrastive Learning Image-text Retrieval
— Unverified 0 Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval? Aug 29, 2023 AudioCaps Audio captioning
— Unverified 0 DLIP: Distilling Language-Image Pre-training Aug 24, 2023 Image Captioning Image-text Retrieval
— Unverified 0 EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE Aug 23, 2023 Image-text matching Image-text Retrieval
— Unverified 0 Hybrid Retrieval and Multi-stage Text Ranking Solution at TREC 2022 Deep Learning Track Aug 23, 2023 Document Ranking Language Modeling
— Unverified 0 Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks Aug 13, 2023 Contrastive Learning image-classification
— Unverified 0 Embedding-based Retrieval with LLM for Effective Agriculture Information Extracting from Unstructured Data Aug 6, 2023 Language Modeling Language Modelling
— Unverified 0