Med-gte-hybrid: A contextual embedding transformer model for extracting actionable information from clinical texts | Feb 21, 2025 | Contrastive Learning, Decision Making | — Unverified | 0
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features | Feb 20, 2025 | Fairness, Image-text Retrieval | Code Available | 0
ATRI: Mitigating Multilingual Audio Text Retrieval Inconsistencies by Reducing Data Distribution Errors | Feb 20, 2025 | AudioCaps, Contrastive Learning | Code Available | 0
PeerQA: A Scientific Question Answering Dataset from Peer Reviews | Feb 19, 2025 | answerability prediction, Answer Generation | Code Available | 1
LSTM-based Selective Dense Text Retrieval Guided by Sparse Lexical Retrieval | Feb 15, 2025 | Retrieval, Text Retrieval | — Unverified | 0
Fine-tuning Multimodal Transformers on Edge: A Parallel Split Learning Approach | Feb 10, 2025 | Federated Learning, Image-text Retrieval | — Unverified | 0
Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding | Feb 9, 2025 | Image Captioning, Image-text Retrieval | Code Available | 3
DCFormer: Efficient 3D Vision-Language Modeling with Decomposed Convolutions | Feb 7, 2025 | Anomaly Detection, Image-text Retrieval | — Unverified | 0
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion | Feb 6, 2025 | image-classification, Image Classification | Code Available | 2
Expertized Caption Auto-Enhancement for Video-Text Retrieval | Feb 5, 2025 | Caption Generation, Retrieval | Code Available | 0
Scientometric Analysis of the German IR Community within TREC & CLEF | Feb 5, 2025 | Information Retrieval, Retrieval | — Unverified | 0
Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes | Jan 23, 2025 | Emotion Classification, Image Captioning | Code Available | 0
A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models | Jan 21, 2025 | RAG, Retrieval | Code Available | 7
MASS: Overcoming Language Bias in Image-Text Matching | Jan 20, 2025 | Image-text matching, Image-text Retrieval | — Unverified | 0
TSVC: Tripartite Learning with Semantic Variation Consistency for Robust Image-Text Retrieval | Jan 19, 2025 | Cross-Modal Retrieval, Image-text Retrieval | — Unverified | 0
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature | Jan 13, 2025 | Articles, Image-text Retrieval | Code Available | 2
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training | Jan 1, 2025 | Image-text Retrieval, Image to text | — Unverified | 0
V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts | Jan 1, 2025 | Contrastive Learning, Text Retrieval | — Unverified | 0
Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation | Jan 1, 2025 | image-classification, Image Classification | — Unverified | 0
CLIP is Almost All You Need: Towards Parameter-Efficient Scene Text Retrieval without OCR | Jan 1, 2025 | All, Optical Character Recognition | — Unverified | 0
Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment | Jan 1, 2025 | Relation, Retrieval | — Unverified | 0
CaReBench: A Fine-Grained Benchmark for Video Captioning and Retrieval | Dec 31, 2024 | Retrieval, Text Retrieval | — Unverified | 0
The Text Classification Pipeline: Starting Shallow going Deeper | Dec 30, 2024 | Classification, Information Retrieval | — Unverified | 0
GASLITEing the Retrieval: Exploring Vulnerabilities in Dense Embedding-based Search | Dec 30, 2024 | RAG, Retrieval | Code Available | 1
Optimizing Multi-Stage Language Models for Effective Text Retrieval | Dec 26, 2024 | Retrieval, Text Retrieval | — Unverified | 0
Multi-Head Attention Driven Dynamic Visual-Semantic Embedding for Enhanced Image-Text Matching | Dec 26, 2024 | Image-text matching, Text Matching | — Unverified | 0
Reversed in Time: A Novel Temporal-Emphasized Benchmark for Cross-Modal Video-Text Retrieval | Dec 26, 2024 | Image-text Retrieval, Information Retrieval | Code Available | 0
Where am I? Cross-View Geo-localization with Natural Language Descriptions | Dec 22, 2024 | geo-localization, Image Retrieval | Code Available | 2
PolySmart @ TRECVid 2024 Medical Video Question Answering | Dec 20, 2024 | Question Answering, Retrieval | — Unverified | 0
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval | Dec 19, 2024 | Knowledge Graphs, RAG | — Unverified | 0
Multimodal Hypothetical Summary for Retrieval-based Multi-image Question Answering | Dec 19, 2024 | Contrastive Learning, Language Modeling | Code Available | 0
I0T: Embedding Standardization Method Towards Zero Modality Gap | Dec 18, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 1
CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval | Dec 17, 2024 | Contrastive Learning, Information Retrieval | Code Available | 1
Establishing a Foundation for Tetun Ad-Hoc Text Retrieval: Stemming, Indexing, Retrieval, and Ranking | Dec 16, 2024 | Information Retrieval, Retrieval | — Unverified | 0
Gramian Multimodal Representation Learning and Alignment | Dec 16, 2024 | Contrastive Learning, Representation Learning | Code Available | 2
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images | Dec 11, 2024 | Contrastive Learning, Cross-Modal Information Retrieval | — Unverified | 0
Barking Up The Syntactic Tree: Enhancing VLM Training with Syntactic Losses | Dec 11, 2024 | Image-text Retrieval, Question Answering | — Unverified | 0
Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning | Dec 10, 2024 | Contrastive Learning, Image-text Retrieval | — Unverified | 0
VladVA: Discriminative Fine-tuning of LVLMs | Dec 5, 2024 | Image-text Retrieval, Representation Learning | — Unverified | 0
Linq-Embed-Mistral Technical Report | Dec 4, 2024 | Retrieval, Text Retrieval | — Unverified | 0
Adaptive Two-Phase Finetuning LLMs for Japanese Legal Text Retrieval | Dec 3, 2024 | Retrieval, Text Retrieval | — Unverified | 0
DIR: Retrieval-Augmented Image Captioning with Comprehensive Understanding | Dec 2, 2024 | Caption Generation, Domain Generalization | — Unverified | 0
Approximate Fiber Product: A Preliminary Algebraic-Geometric Perspective on Multimodal Embedding Alignment | Nov 30, 2024 | Image-text Retrieval, Representation Learning | — Unverified | 0
CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives | Nov 29, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 0
AudioSetCaps: An Enriched Audio-Caption Dataset using Automated Generation Pipeline with Large Audio and Language Models | Nov 28, 2024 | Audio captioning, Audio to Text Retrieval | Code Available | 2
Knowledge Transfer Across Modalities with Natural Language Supervision | Nov 23, 2024 | Image-text Retrieval, Novel Concepts | — Unverified | 0
Cross-Modal Pre-Aligned Method with Global and Local Information for Remote-Sensing Image and Text Retrieval | Nov 22, 2024 | Image Retrieval, Reranking | — Unverified | 0
Uni-Mlip: Unified Self-supervision for Medical Vision Language Pre-training | Nov 20, 2024 | Contrastive Learning, image-classification | — Unverified | 0
A Comparative Study of Text Retrieval Models on DaReCzech | Nov 19, 2024 | Information Retrieval, Machine Translation | — Unverified | 0
A Survey of Medical Vision-and-Language Applications and Their Techniques | Nov 19, 2024 | Decision Making, Diagnostic | Code Available | 1