SOTAVerified

Text Matching

Matching a target text to a source text based on their meaning.

Papers

Showing 51-100 of 364 papers

| Title | Status | Hype |
| --- | --- | --- |
| Advanced Multimodal Deep Learning Architecture for Image-Text Matching |  | 0 |
| Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching |  | 0 |
| Robust Interaction-Based Relevance Modeling for Online e-Commerce Search | Code | 0 |
| FactGenius: Combining Zero-Shot Prompting and Fuzzy Relation Mining to Improve Fact Verification with Knowledge Graphs | Code | 0 |
| Hybrid-Learning Video Moment Retrieval across Multi-Domain Labels |  | 0 |
| LLMs and Memorization: On Quality and Specificity of Copyright Compliance | Code | 0 |
| DEMO: A Statistical Perspective for Efficient Image-Text Matching |  | 0 |
| Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation | Code | 1 |
| Content-Based Image Retrieval for Multi-Class Volumetric Radiology Images: A Benchmark Study |  | 0 |
| CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering |  | 0 |
| RETTA: Retrieval-Enhanced Test-Time Adaptation for Zero-Shot Video Captioning |  | 0 |
| COM3D: Leveraging Cross-View Correspondence and Cross-Modal Mining for 3D Retrieval |  | 0 |
| Breaking Through the Noisy Correspondence: A Robust Model for Image-Text Matching |  | 0 |
| Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching | Code | 1 |
| Modeling Selective Feature Attention for Representation-based Siamese Text Matching | Code | 0 |
| FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction |  | 0 |
| Narrative Action Evaluation with Prompt-Guided Multimodal Interaction | Code | 1 |
| Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval | Code | 2 |
| Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems |  | 0 |
| SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining |  | 0 |
| Constructing Multilingual Visual-Text Datasets Revealing Visual Multilingual Ability of Vision Language Models |  | 0 |
| Are LLMs Effective Backbones for Fine-tuning? An Experimental Investigation of Supervised LLMs on Chinese Short Text Matching |  | 0 |
| FSMR: A Feature Swapping Multi-modal Reasoning Approach with Joint Textual and Visual Clues |  | 0 |
| PointCloud-Text Matching: Benchmark Datasets and a Baseline |  | 0 |
| RadCLIP: Enhancing Radiologic Image Analysis through Contrastive Language-Image Pre-training | Code | 1 |
| MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets | Code | 0 |
| Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach for Relation Classification |  | 0 |
| Image-Text Matching with Multi-View Attention |  | 0 |
| Multi-Intent Attribute-Aware Text Matching in Searching |  | 0 |
| ColorSwap: A Color and Word Order Dataset for Multimodal Evaluation | Code | 1 |
| MouSi: Poly-Visual-Expert Vision-Language Models | Code | 2 |
| Beyond Image-Text Matching: Verb Understanding in Multimodal Transformers Using Guided Masking | Code | 0 |
| Enhancing Image-Text Matching with Adaptive Feature Aggregation | Code | 0 |
| GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition |  | 0 |
| Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech |  | 0 |
| CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs | Code | 0 |
| SRTube: Video-Language Pre-Training with Action-Centric Video Tube Features and Semantic Role Labeling |  | 0 |
| Backdoor Attack on Unpaired Medical Image-Text Foundation Models: A Pilot Study on MedCLIP | Code | 0 |
| A Dual-way Enhanced Framework from Text Matching Point of View for Multimodal Entity Linking | Code | 0 |
| OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization |  | 0 |
| CILF-CIAE: CLIP-driven Image-Language Fusion for Correcting Inverse Age Estimation |  | 0 |
| Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models | Code | 1 |
| Tracing Influence at Scale: A Contrastive Learning Approach to Linking Public Comments and Regulator Responses |  | 0 |
| MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts | Code | 1 |
| Active Mining Sample Pair Semantics for Image-text Matching |  | 0 |
| A New Fine-grained Alignment Method for Image-text Matching |  | 0 |
| Text Augmented Spatial-aware Zero-shot Referring Image Segmentation |  | 0 |
| Cross-modal Active Complementary Learning with Self-refining Correspondence | Code | 1 |
| CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting |  | 0 |
| Learning Comprehensive Representations with Richer Self for Text-to-Image Person Re-Identification |  | 0 |

No leaderboard results yet.