SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) given a query (which could be a question, keywords, or any relevant text).
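
As a rough illustration of the task definition above (not a method taken from any of the listed papers), the sketch below ranks candidate passages against a query using cosine similarity over plain word counts. The function names `tokenize`, `cosine`, and `retrieve` and the sample passages are hypothetical, chosen only for this example; practical systems typically replace the word-count vectors with learned embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, passages, top_k=1):
    """Return the top_k (score, passage) pairs, highest score first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy corpus, for illustration only.
passages = [
    "The Eiffel Tower is located in Paris.",
    "Dense retrievers encode queries and passages into vectors.",
    "Bananas are rich in potassium.",
]
best = retrieve("Where is the Eiffel Tower?", passages)
print(best[0][1])
```

Here the query shares the words "is", "the", "Eiffel", and "Tower" with the first passage and no words with the other two, so the first passage is ranked highest.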

Papers

Showing 101-125 of 671 papers

Title | Status | Hype
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP | Code | 1
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval | Code | 1
Extending Multi-modal Contrastive Representations | Code | 1
FETA: Towards Specializing Foundation Models for Expert Task Applications | Code | 1
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions | Code | 1
Fine-Grained Image-Text Matching by Cross-Modal Hard Aligning Network | Code | 1
Bridging Video-text Retrieval with Multiple Choice Questions | Code | 1
Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration | Code | 1
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner | Code | 1
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning | Code | 1
Equivariant Similarity for Vision-Language Foundation Models | Code | 1
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment | Code | 1
ComCLIP: Training-Free Compositional Image and Text Matching | Code | 1
COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark | Code | 1
A Deep Local and Global Scene-Graph Matching for Image-Text Retrieval | Code | 1
Composing Object Relations and Attributes for Image-Text Matching | Code | 1
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model | Code | 1
Consensus-Aware Visual-Semantic Embedding for Image-Text Matching | Code | 1
A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports | Code | 1
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers | Code | 1
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Code | 1
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval | Code | 1
Nonparametric Decoding for Generative Retrieval | Code | 1
Audio Retrieval with Natural Language Queries: A Benchmark Study | Code | 1
ESA: External Space Attention Aggregation for Image-Text Retrieval | Code | 1

No leaderboard results yet.