Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) given a query (which could be a question, keywords, or any relevant text).
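To make the task definition concrete, here is a minimal sketch of query-to-passage ranking. It assumes a plain bag-of-words cosine similarity, which is only an illustrative baseline and not the method of any paper listed on this page. The function names (`tokenize`, `cosine`, `retrieve`) and the sample passages are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, passages, top_k=1):
    """Return the top_k passages ranked by similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    # Stable sort: passages with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


# Hypothetical toy corpus, for illustration only.
passages = [
    "Dense retrievers encode queries and passages into vectors.",
    "The capital of France is Paris.",
    "Contrastive learning aligns images with captions.",
]
best = retrieve("What is the capital of France?", passages)
```

Here `best` holds the single passage that shares the most terms with the query. Dense or cross-modal retrievers replace the term-count vectors with learned embeddings, but the ranking step has the same shape.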

Papers

Showing 51–100 of 671 papers

Title	Date	Tasks	Status	Hype
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text	Oct 18, 2022	Contrastive Learning, Image-text Retrieval	Code Available	2
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment	Sep 14, 2022	Retrieval, Text Retrieval	Code Available	2
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs	Jun 9, 2022	Image Captioning, Image Classification	Code Available	2
Egocentric Video-Language Pretraining	Jun 3, 2022	Action Recognition, Contrastive Learning	Code Available	2
Cross-lingual and Multilingual CLIP	Jun 1, 2022	Contrastive Learning, Image-text Retrieval	Code Available	2
Vision-Language Pre-Training with Triple Contrastive Learning	Feb 21, 2022	Contrastive Learning, cross-modal alignment	Code Available	2
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models	Apr 17, 2021	Argument Retrieval, Benchmarking	Code Available	2
A Replication Study of Dense Passage Retriever	Apr 12, 2021	Open-Domain Question Answering, Question Answering	Code Available	2
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning	Mar 2, 2021	BIG-bench Machine Learning, Image Retrieval	Code Available	2
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision	Feb 11, 2021	Cross-Modal Retrieval, Fine-Grained Image Classification	Code Available	2
DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval	Jun 10, 2025	Image Captioning, Retrieval	Code Available	1
Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models	Jun 10, 2025	Contrastive Learning, Image-text matching	Code Available	1
LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts	May 20, 2025	Caption Generation, Retrieval	Code Available	1
mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs	May 16, 2025	Information Retrieval, Knowledge Graphs	Code Available	1
Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models	Mar 25, 2025	Benchmarking, Image Captioning	Code Available	1
GOAL: Global-local Object Alignment Learning	Mar 22, 2025	Descriptive, Object	Code Available	1
PeerQA: A Scientific Question Answering Dataset from Peer Reviews	Feb 19, 2025	answerability prediction, Answer Generation	Code Available	1
GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search	Dec 30, 2024	RAG, Retrieval	Code Available	1
I0T: Embedding Standardization Method Towards Zero Modality Gap	Dec 18, 2024	Contrastive Learning, Image-text Retrieval	Code Available	1
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval	Dec 17, 2024	Contrastive Learning, Information Retrieval	Code Available	1
A Survey of Medical Vision-and-Language Applications and Their Techniques	Nov 19, 2024	Decision Making, Diagnostic	Code Available	1
Nearest Neighbor Normalization Improves Multimodal Retrieval	Oct 31, 2024	Cross-Modal Retrieval, Image Captioning	Code Available	1
Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval	Oct 9, 2024	Retrieval, Text Retrieval	Code Available	1
ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds	Sep 13, 2024	Audio Classification, Descriptive	Code Available	1
COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark	Aug 5, 2024	Dense Video Captioning, Diversity	Code Available	1
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval	Aug 1, 2024	Attribute, Optical Character Recognition	Code Available	1
Learning Video Context as Interleaved Multimodal Sequences	Jul 31, 2024	Language Modeling, Language Modelling	Code Available	1
Video-Language Alignment via Spatio-Temporal Graph Transformer	Jul 16, 2024	Contrastive Learning, Question Answering	Code Available	1
CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation	Jul 1, 2024	Image-text Retrieval, Question Answering	Code Available	1
SignCLIP: Connecting Text and Sign Language by Contrastive Learning	Jul 1, 2024	Contrastive Learning, Retrieval	Code Available	1
Composing Object Relations and Attributes for Image-Text Matching	Jun 17, 2024	Attribute, Graph Attention	Code Available	1
Bridging Language Gaps in Audio-Text Retrieval	Jun 11, 2024	AudioCaps, Retrieval	Code Available	1
Transcending Fusion: A Multi-Scale Alignment Method for Remote Sensing Image-Text Retrieval	May 29, 2024	cross-modal alignment, Image-text Retrieval	Code Available	1
LDMol: Text-to-Molecule Diffusion Model with Structurally Informative Latent Space	May 28, 2024	Contrastive Learning, Decoder	Code Available	1
Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration	May 26, 2024	Information Retrieval, Retrieval	Code Available	1
PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning	May 16, 2024	Image-text Retrieval, Representation Learning	Code Available	1
Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation	May 16, 2024	AudioCaps, Event Detection	Code Available	1
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering	Mar 26, 2024	Benchmarking, Machine Reading Comprehension	Code Available	1
Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning	Mar 19, 2024	Diagnostic, image-classification	Code Available	1
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory	Mar 19, 2024	Adversarial Text, Diversity	Code Available	1
Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control	Feb 27, 2024	GPU, Image Retrieval	Code Available	1
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration	Feb 18, 2024	Multi-hop Question Answering, Question Answering	Code Available	1
Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization	Dec 30, 2023	Answer Generation, Contrastive Learning	Code Available	1
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks	Dec 21, 2023	Image Retrieval, Image-to-Text Retrieval	Code Available	1
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval	Dec 19, 2023	Few-Shot Learning, Retrieval	Code Available	1
Data-Efficient Multimodal Fusion on a Single GPU	Dec 15, 2023	GPU, Image Retrieval	Code Available	1
RGNet: A Unified Clip Retrieval and Grounding Network for Long Videos	Dec 11, 2023	Natural Language Moment Retrieval, Natural Language Queries	Code Available	1
Predictive Chemistry Augmented with Text Retrieval	Dec 8, 2023	molecular representation, Retrieval	Code Available	1
Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding	Nov 30, 2023	Attribute, Compositional Zero-Shot Learning	Code Available	1
MLLMs-Augmented Visual-Language Representation Learning	Nov 30, 2023	Image-text Retrieval, Representation Learning	Code Available	1

No leaderboard results yet.