SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) given a query (which could be a question, keywords, or any relevant text).
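
As a rough illustration of the task, the sketch below ranks a handful of passages against a query using TF-IDF weights and cosine similarity. This is only one simple lexical baseline, not the method of any paper listed here. The function names (`tokenize`, `retrieve`), the smoothed IDF formula, and the sample passages are all assumptions made for the example.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCT).lower() for tok in text.split())
    return [t for t in tokens if t]


def retrieve(query, passages, top_k=1):
    """Return the top_k (passage, score) pairs by TF-IDF cosine similarity."""
    docs = [Counter(tokenize(p)) for p in passages]
    n_docs = len(docs)
    # Document frequency: how many passages contain each term.
    df = Counter(term for doc in docs for term in doc)

    def idf(term):
        # Smoothed IDF (an assumption of this sketch) so unseen terms stay finite.
        return math.log((1 + n_docs) / (1 + df[term])) + 1.0

    def vectorize(counts):
        return {term: count * idf(term) for term, count in counts.items()}

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0
        return dot / (norm_a * norm_b)

    q_vec = vectorize(Counter(tokenize(query)))
    scored = [(cosine(q_vec, vectorize(doc)), i) for i, doc in enumerate(docs)]
    # Highest score first; ties broken by original passage order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(passages[i], score) for score, i in scored[:top_k]]


# Hypothetical sample data for illustration only.
passages = [
    "Dense retrieval encodes queries and passages into vectors.",
    "Paris is the capital of France.",
    "Token merging speeds up vision transformers.",
]
best_passage, best_score = retrieve("What is the capital of France?", passages)[0]
print(best_passage)
```

Here the query can be a question, a few keywords, or any other text; the dense and multimodal systems in the papers below typically replace the TF-IDF vectors with learned embeddings while keeping the same rank-by-similarity structure.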

Papers

Showing 26-50 of 671 papers

Title | Status | Hype
Where am I? Cross-View Geo-localization with Natural Language Descriptions | Code | 2
Gramian Multimodal Representation Learning and Alignment | Code | 2
AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models | Code | 2
Beyond Text: Optimizing RAG with Multimodal Inputs for Industrial Applications | Code | 2
Towards Vision-Language Geo-Foundation Model: A Survey | Code | 2
RWKV-CLIP: A Robust Vision-Language Representation Learner | Code | 2
Accelerating Transformers with Spectrum-Preserving Token Merging | Code | 2
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding | Code | 2
Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations | Code | 2
Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment | Code | 2
DreamLIP: Language-Image Pre-training with Long Captions | Code | 2
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Code | 2
vid-TLDR: Training Free Token merging for Light-weight Video Transformer | Code | 2
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Code | 2
Distillation Enhanced Generative Retrieval | Code | 2
M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval | Code | 2
Towards 3D Molecule-Text Interpretation in Language Models | Code | 2
Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Code | 2
VeCLIP: Improving CLIP Training via Visual-enriched Captions | Code | 2
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World | Code | 2
RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing | Code | 2
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing | Code | 2
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents | Code | 2
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing | Code | 2
Dense Text Retrieval based on Pretrained Language Models: A Survey | Code | 2

No leaderboard results yet.