SOTAVerified

Text Retrieval

Text Retrieval is the task of finding the most relevant text result (such as an answer, paragraph, or passage) for a given query, which could be a question, keywords, or any other relevant text.
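As a rough illustration of the task definition above, the sketch below ranks a few candidate passages against a query using cosine similarity over raw word counts. This is only a toy baseline chosen for brevity; it is not taken from any of the listed papers. The function names (`tokenize`, `cosine`, `retrieve`) and the sample passages are assumptions made for this example.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCT) for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, passages, top_k=1):
    """Return the top_k (passage, score) pairs, highest score first."""
    q_vec = Counter(tokenize(query))
    scored = [(cosine(q_vec, Counter(tokenize(p))), i) for i, p in enumerate(passages)]
    # Sort by descending score; ties keep the original passage order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(passages[i], score) for score, i in scored[:top_k]]


# Hypothetical sample data, for illustration only.
passages = [
    "The Eiffel Tower is located in Paris.",
    "Photosynthesis converts light into chemical energy.",
    "Mount Everest is the highest mountain on Earth.",
]
best = retrieve("where is the eiffel tower", passages, top_k=1)
print(best[0][0])
```

Here the query is a keyword-style question and the result is the passage sharing the most weighted vocabulary with it. Practical systems typically replace the word-count vectors with term weighting or learned embeddings, but the query-in, ranked-text-out interface stays the same.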

Papers

Showing 26-50 of 671 papers

Title | Status | Hype
QBD-RankedDataGen: Generating Custom Ranked Datasets for Improving Query-By-Document Search Using LLM-Reranking with Reduced Human Effort | - | 0
AGATE: Stealthy Black-box Watermarking for Multimodal Model Copyright Protection | - | 0
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs | - | 0
Towards Understanding Camera Motions in Any Video | - | 0
SemCORE: A Semantic-Enhanced Generative Cross-Modal Retrieval Framework with MLLMs | - | 0
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation | - | 0
FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations | - | 0
Bridging Queries and Tables through Entities in Table Retrieval | - | 0
LV-MAE: Learning Long Video Representations through Masked-Embedding Autoencoders | - | 0
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval | - | 0
M2D2: Exploring General-purpose Audio-Language Representations Beyond CLAP | Code | 0
Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Code | 1
SeLIP: Similarity Enhanced Contrastive Language Image Pretraining for Multi-modal Head MRI | - | 0
Med3DVLM: An Efficient Vision-Language Model for 3D Medical Image Analysis | Code | 2
GOAL: Global-local Object Alignment Learning | Code | 1
Anatomy-Aware Conditional Image-Text Retrieval | - | 0
Bridging Classical and Quantum String Matching: A Computational Reformulation of Bit-Parallelism | - | 0
Variance-Aware Loss Scheduling for Multimodal Alignment in Low-Data Settings | - | 0
Tailoring Table Retrieval from a Field-aware Hybrid Matching Perspective | - | 0
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning | - | 0
V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts | - | 0
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations | - | 0
ABC: Achieving Better Control of Multimodal Embeddings using VLMs | - | 0
How Vital is the Jurisprudential Relevance: Law Article Intervened Legal Case Retrieval and Matching | - | 0
Progressive Local Alignment for Medical Multimodal Pre-training | - | 0

No leaderboard results yet.