SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) for a given query (which could be a question, keywords, or any other relevant text).

Papers

Showing 51–100 of 671 papers

Title | Status | Hype
A Replication Study of Dense Passage Retriever | Code | 2
Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations | Code | 2
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing | Code | 2
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment | Code | 2
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text | Code | 2
M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval | Code | 2
GLAP: General contrastive audio-text pretraining across domains and languages | Code | 2
AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models | Code | 2
Gramian Multimodal Representation Learning and Alignment | Code | 2
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory | Code | 2
Audio Retrieval with WavText5K and CLAP Training | Code | 1
Audio Retrieval with Natural Language Queries: A Benchmark Study | Code | 1
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | Code | 1
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports | Code | 1
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training | Code | 1
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment | Code | 1
COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark | Code | 1
A Survey of Medical Vision-and-Language Applications and Their Techniques | Code | 1
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models | Code | 1
From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping | Code | 1
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval | Code | 1
Dynamic Modality Interaction Modeling for Image-Text Retrieval | Code | 1
FlexiViT: One Model for All Patch Sizes | Code | 1
Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift | Code | 1
mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs | Code | 1
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning | Code | 1
Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling | Code | 1
DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue Dataset | Code | 1
AdvCLIP: Downstream-agnostic Adversarial Examples in Multimodal Contrastive Learning | Code | 1
DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval | Code | 1
Fine-Tuning LLaMA for Multi-Stage Text Retrieval | Code | 1
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Code | 1
Dense Hierarchical Retrieval for Open-Domain Question Answering | Code | 1
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval | Code | 1
FETA: Towards Specializing Foundation Models for Expert Task Applications | Code | 1
A Comprehensive Review of the Video-to-Text Problem | Code | 1
DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases | Code | 1
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval | Code | 1
Bridging Language Gaps in Audio-Text Retrieval | Code | 1
Data-Efficient Multimodal Fusion on a Single GPU | Code | 1
Bridging Video-text Retrieval with Multiple Choice Questions | Code | 1
Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data | Code | 1
A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval | Code | 1
A Dense Representation Framework for Lexical and Semantic Matching | Code | 1
Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training | Code | 1
CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation | Code | 1
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Code | 1
Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval | Code | 1
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval | Code | 1
A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval | Code | 1
Page 2 of 14

No leaderboard results yet.