SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) for a given query (which could be a question, keywords, or any other relevant text).
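As a rough illustration of the task definition, the sketch below ranks a few passages against a query using bag-of-words cosine similarity. It is a toy baseline under stated assumptions, not the method of any paper listed here: the helper names (`tokenize`, `cosine`, `retrieve`) and the sample passages are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, passages, top_k=1):
    """Return the top_k passages ranked by cosine similarity to the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(p))), p) for p in passages]
    # Stable sort: passages with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_k]]


# Hypothetical sample data, for illustration only.
passages = [
    "The Eiffel Tower is located in Paris.",
    "Python is a popular programming language.",
    "Dense retrievers encode queries and passages as vectors.",
]
best = retrieve("Where is the Eiffel Tower?", passages)
```

Here `best` holds the single highest-scoring passage. Dense and multimodal retrievers replace the term-count vectors with learned embeddings, but the ranking step has the same overall shape.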

Papers

Showing 151–175 of 671 papers

Title | Status | Hype
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval | Code | 1
Learning Semantic Relationship Among Instances for Image-Text Matching | Code | 1
LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval | Code | 1
Fine-Grained Image-Text Matching by Cross-Modal Hard Aligning Network | Code | 1
Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift | Code | 1
FlexiViT: One Model for All Patch Sizes | Code | 1
DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue Dataset | Code | 1
ComCLIP: Training-Free Compositional Image and Text Matching | Code | 1
Seeing What You Miss: Vision-Language Pre-training with Semantic Completion Learning | Code | 1
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning | Code | 1
VTC: Improving Video-Text Retrieval with User Comments | Code | 1
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model | Code | 1
Mixed-modality Representation Learning and Pre-training for Joint Table-and-Text Retrieval in OpenQA | Code | 1
Nonparametric Decoding for Generative Retrieval | Code | 1
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model | Code | 1
DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases | Code | 1
Audio Retrieval with WavText5K and CLAP Training | Code | 1
Mr. Right: Multimodal Retrieval on Representation of ImaGe witH Text | Code | 1
FETA: Towards Specializing Foundation Models for Expert Task Applications | Code | 1
Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval | Code | 1
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment | Code | 1
Contrastive Audio-Language Learning for Music | Code | 1
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval | Code | 1
A Dense Representation Framework for Lexical and Semantic Matching | Code | 1
MixGen: A New Multi-Modal Data Augmentation | Code | 1

No leaderboard results yet.