A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models | Jan 21, 2025 | RAG Retrieval | Code available (7)
h2oGPT: Democratizing Large Language Models | Jun 13, 2023 | Chatbot Fairness | Code available (6)
BM25S: Orders of magnitude faster lexical search via eager sparse scoring | Jul 4, 2024 | Passage Retrieval Retrieval | Code available (5)
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval | Jul 16, 2024 | Question Answering Retrieval | Code available (5)
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Jan 28, 2022 | Image Captioning Image-text matching | Code available (5)
Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers | Jul 14, 2022 | Retrieval Text Retrieval | Code available (4)
Multi-label Cluster Discrimination for Visual Representation Learning | Jul 24, 2024 | Contrastive Learning Image-text Retrieval | Code available (4)
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Feb 29, 2024 | Retrieval Text Retrieval | Code available (4)
RETSim: Resilient and Efficient Text Similarity | Nov 28, 2023 | Adversarial Text Clustering | Code available (4)
MTEB: Massive Text Embedding Benchmark | Oct 13, 2022 | Benchmarking Information Retrieval | Code available (4)
FG-CLIP: Fine-Grained Visual and Textual Alignment | May 8, 2025 | Image-text Retrieval object-detection | Code available (4)
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | Oct 3, 2023 | Audio Classification Contrastive Learning | Code available (4)
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | May 18, 2023 | 1 Image, 2*2 Stitchi Action Classification | Code available (3)
Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding | Feb 9, 2025 | Image Captioning Image-text Retrieval | Code available (3)
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation | Apr 4, 2023 | Cross-Modal Retrieval Image-text Retrieval | Code available (3)
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models | Feb 8, 2022 | Diagnostic Image Captioning | Code available (3)
Vision-Language Pre-training: Basics, Recent Advances, and Future Trends | Oct 17, 2022 | Few-Shot Learning Image Captioning | Code available (3)
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Mar 31, 2024 | Image-text Retrieval Language Modeling | Code available (3)
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing | Jun 20, 2023 | Cross-Modal Retrieval Image Retrieval | Code available (2)
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding | May 21, 2024 | Property Prediction Question Answering | Code available (2)
RWKV-CLIP: A Robust Vision-Language Representation Learner | Jun 11, 2024 | Image-text Retrieval Representation Learning | Code available (2)
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory | May 29, 2025 | Contrastive Learning Text Retrieval | Code available (2)
Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis | Mar 25, 2025 | Contrastive Learning Image-text Retrieval | Code available (2)
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text | Oct 18, 2022 | Contrastive Learning Image-text Retrieval | Code available (2)
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing | Dec 21, 2022 | Contrastive Learning Drug Design | Code available (2)
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | Feb 11, 2021 | Cross-Modal Retrieval Fine-Grained Image Classification | Code available (2)
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents | Mar 13, 2023 | image-classification Image Classification | Code available (2)
Gramian Multimodal Representation Learning and Alignment | Dec 16, 2024 | Contrastive Learning Representation Learning | Code available (2)
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Jun 19, 2023 | Classification Cross-Modal Retrieval | Code available (2)
Accelerating Transformers with Spectrum-Preserving Token Merging | May 25, 2024 | image-classification Image Classification | Code available (2)
VeCLIP: Improving CLIP Training via Visual-enriched Captions | Oct 11, 2023 | Image-text Retrieval Retrieval | Code available (2)
Egocentric Video-Language Pretraining | Jun 3, 2022 | Action Recognition Contrastive Learning | Code available (2)
A Replication Study of Dense Passage Retriever | Apr 12, 2021 | Open-Domain Question Answering Question Answering | Code available (2)
Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment | Apr 28, 2024 | Cross-Modal Retrieval Image Retrieval | Code available (2)
FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation | Jun 10, 2025 | Image-text Retrieval Question Answering | Code available (2)
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Jan 13, 2025 | Articles Image-text Retrieval | Code available (2)
Dense Text Retrieval based on Pretrained Language Models: A Survey | Nov 27, 2022 | Retrieval Survey | Code available (2)
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models | Apr 17, 2021 | Argument Retrieval Benchmarking | Code available (2)
Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications | Oct 29, 2024 | Image Retrieval RAG | Code available (2)
Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations | Apr 29, 2024 | Retrieval Text Retrieval | Code available (2)
Distillation Enhanced Generative Retrieval | Feb 16, 2024 | Retrieval Text Retrieval | Code available (2)
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Mar 8, 2024 | Image-text Retrieval Retrieval | Code available (2)
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion | Feb 6, 2025 | image-classification Image Classification | Code available (2)
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Mar 22, 2024 | Information Retrieval Retrieval | Code available (2)
Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Oct 19, 2023 | Action Recognition Image-text Retrieval | Code available (2)
GLAP: General contrastive audio-text pretraining across domains and languages | Jun 12, 2025 | AudioCaps Keyword Spotting | Code available (2)
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment | Sep 14, 2022 | Retrieval Text Retrieval | Code available (2)
AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models | Nov 28, 2024 | Audio captioning Audio to Text Retrieval | Code available (2)
Cross-lingual and Multilingual CLIP | Jun 1, 2022 | Contrastive Learning Image-text Retrieval | Code available (2)
DreamLIP: Language-Image Pre-training with Long Captions | Mar 25, 2024 | Contrastive Learning Image-text Retrieval | Code available (2)