SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) given a query (which could be a question, keywords, or any relevant text).

Papers

Showing 401–425 of 671 papers

Title | Status | Hype
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval | | 0
Learning Context-Adapted Video-Text Retrieval by Attending to User Comments | | 0
Learning Joint Visual Semantic Matching Embeddings for Language-guided Retrieval | | 0
Learning Multi-Modal Nonlinear Embeddings: Performance Bounds and an Algorithm | | 0
Learning to embed semantic similarity for joint image-text retrieval | | 0
Learning with Noisy Correspondence | | 0
Lecture Presentations Multimodal Dataset: Towards Understanding Multimodality in Educational Videos | | 0
Leveraging Generative Language Models for Weakly Supervised Sentence Component Analysis in Video-Language Joint Learning | | 0
Multimodal Adversarial Defense for Vision-Language Models by Leveraging One-To-Many Relationships | | 0
Lifelong learning for text retrieval and recognition in historical handwritten document collections | | 0
LightCLIP: Learning Multi-Level Interaction for Lightweight Vision-Language Models | | 0
Linq-Embed-Mistral Technical Report | | 0
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning | | 0
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models | | 0
LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval | | 0
LSTM-based Selective Dense Text Retrieval Guided by Sparse Lexical Retrieval | | 0
LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrival | | 0
LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders | | 0
M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP | | 0
Mamba Retriever: Utilizing Mamba for Effective and Efficient Dense Retrieval | | 0
MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning | | 0
Masked Contrastive Pre-Training for Efficient Video-Text Retrieval | | 0
Mask to reconstruct: Cooperative Semantics Completion for Video-text Retrieval | | 0
MASS: Overcoming Language Bias in Image-Text Matching | | 0
Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval | | 0

No leaderboard results yet.