SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) for a given query (which could be a question, keywords, or any other relevant text).
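As a concrete illustration of the lexical side of this task (the approach papers such as BM25S accelerate), here is a minimal sketch of Okapi BM25 scoring in pure Python. The documents, query, and parameter defaults (k1=1.5, b=0.75) are illustrative assumptions, not taken from any listed paper.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # document frequency: how many documents contain each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Lucene-style IDF, always non-negative
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "information retrieval ranks documents by relevance to a query",
]
scores = bm25_scores("retrieval of relevant documents", docs)
best = max(range(len(docs)), key=lambda i: scores[i])  # index of top-ranked doc
```

Only the third document shares query terms ("retrieval", "documents"), so it ranks first; real systems add tokenization, stemming, and inverted indexes on top of this scoring core.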

Papers

Showing 1–50 of 671 papers

Title | Status | Hype
A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models | Code | 7
h2oGPT: Democratizing Large Language Models | Code | 6
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval | Code | 5
BM25S: Orders of magnitude faster lexical search via eager sparse scoring | Code | 5
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Code | 5
FG-CLIP: Fine-Grained Visual and Textual Alignment | Code | 4
Multi-label Cluster Discrimination for Visual Representation Learning | Code | 4
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Code | 4
RETSim: Resilient and Efficient Text Similarity | Code | 4
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | Code | 4
MTEB: Massive Text Embedding Benchmark | Code | 4
Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers | Code | 4
Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding | Code | 3
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Code | 3
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | Code | 3
AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation | Code | 3
Vision-Language Pre-training: Basics, Recent Advances, and Future Trends | Code | 3
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models | Code | 3
TableRAG: A Retrieval Augmented Generation Framework for Heterogeneous Document Reasoning | Code | 2
GLAP: General contrastive audio-text pretraining across domains and languages | Code | 2
FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation | Code | 2
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory | Code | 2
Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis | Code | 2
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion | Code | 2
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Code | 2
Where am I? Cross-View Geo-localization with Natural Language Descriptions | Code | 2
Gramian Multimodal Representation Learning and Alignment | Code | 2
AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models | Code | 2
Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications | Code | 2
Towards Vision-Language Geo-Foundation Model: A Survey | Code | 2
RWKV-CLIP: A Robust Vision-Language Representation Learner | Code | 2
Accelerating Transformers with Spectrum-Preserving Token Merging | Code | 2
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding | Code | 2
Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations | Code | 2
Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment | Code | 2
DreamLIP: Language-Image Pre-training with Long Captions | Code | 2
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Code | 2
vid-TLDR: Training Free Token merging for Light-weight Video Transformer | Code | 2
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Code | 2
Distillation Enhanced Generative Retrieval | Code | 2
M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval | Code | 2
Towards 3D Molecule-Text Interpretation in Language Models | Code | 2
Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Code | 2
VeCLIP: Improving CLIP Training via Visual-enriched Captions | Code | 2
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World | Code | 2
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing | Code | 2
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Code | 2
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents | Code | 2
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing | Code | 2
Dense Text Retrieval based on Pretrained Language Models: A Survey | Code | 2
Page 1 of 14

No leaderboard results yet.