SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) given a query (which could be a question, keywords, or any relevant text).

Papers

Showing 176–200 of 671 papers

Title | Status | Hype
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model | Code | 1
Cross-Modal Retrieval with Partially Mismatched Pairs | Code | 1
Cross-Modal Retrieval for Motion and Text via DropTriple Loss | Code | 1
Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Code | 1
GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search | Code | 1
A Comprehensive Review of the Video-to-Text Problem | Code | 1
Mixed-modality Representation Learning and Pre-training for Joint Table-and-Text Retrieval in OpenQA | Code | 1
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval | Code | 1
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions | Code | 1
Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data | Code | 1
More Robust Dense Retrieval with Contrastive Dual Learning | Code | 1
A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval | Code | 1
Generative Multi-hop Retrieval | Code | 1
Cross-modal Contrastive Learning for Speech Translation | Code | 1
Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling | Code | 1
From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping | Code | 1
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations | Code | 1
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers | Code | 1
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training | Code | 1
Fine-Tuning LLaMA for Multi-Stage Text Retrieval | Code | 1
ALIP: Adaptive Language-Image Pre-training with Synthetic Caption | Code | 1
Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark | Code | 1
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval | Code | 1
CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback | Code | 1
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning | Code | 1

No leaderboard results yet.