UrbanCross: Enhancing Satellite Image-Text Retrieval with Cross-Domain Adaptation | Apr 22, 2024 | Diversity, Domain Adaptation | Unverified | 0
MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction | Apr 19, 2024 | Image Reconstruction, Text Retrieval | Unverified | 0
FecTek: Enhancing Term Weight in Lexicon-Based Retrieval with Feature Context and Term-level Knowledge | Apr 18, 2024 | Contrastive Learning, Retrieval | Unverified | 0
TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model | Apr 14, 2024 | Language Modeling, Language Modelling | Unverified | 0
Learning with Noisy Correspondence | Apr 13, 2024 | Cross-Modal Retrieval, Cross-modal retrieval with noisy correspondence | Unverified | 0
HaVTR: Improving Video-Text Retrieval Through Augmentation Using Large Foundation Models | Apr 7, 2024 | Hallucination, Representation Learning | Unverified | 0
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement | Apr 6, 2024 | Image-text Retrieval, object-detection | Unverified | 0
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Mar 31, 2024 | Image-text Retrieval, Language Modeling | Code Available | 3
Shallow Cross-Encoders for Low-Latency Retrieval | Mar 29, 2024 | CPU, GPU | Code Available | 0
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Mar 26, 2024 | Benchmarking, Machine Reading Comprehension | Code Available | 1
Denoising Table-Text Retrieval for Open-Domain Question Answering | Mar 26, 2024 | Denoising, Open-Domain Question Answering | Code Available | 0
DreamLIP: Language-Image Pre-training with Long Captions | Mar 25, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 2
Improving Retrieval for RAG based Question Answering Models on Financial Documents | Mar 23, 2024 | Chunking, Question Answering | Unverified | 0
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Mar 22, 2024 | Information Retrieval, Retrieval | Code Available | 2
vid-TLDR: Training Free Token merging for Light-weight Video Transformer | Mar 20, 2024 | Action Recognition, Computational Efficiency | Code Available | 2
Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning | Mar 19, 2024 | Diagnostic, image-classification | Code Available | 1
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory | Mar 19, 2024 | Adversarial Text, Diversity | Code Available | 1
LuoJiaHOG: A Hierarchy Oriented Geo-aware Image Caption Dataset for Remote Sensing Image-Text Retrival | Mar 16, 2024 | Caption Generation, Image-text Retrieval | Unverified | 0
Improving Adversarial Transferability of Vision-Language Pre-training Models through Collaborative Multimodal Interaction | Mar 16, 2024 | Adversarial Robustness, Image-text Retrieval | Unverified | 0
Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval | Mar 16, 2024 | Image Retrieval, Retrieval | Unverified | 0
Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval | Mar 15, 2024 | AudioCaps, Contrastive Learning | Unverified | 0
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Mar 8, 2024 | Image-text Retrieval, Retrieval | Code Available | 2
CLIP the Bias: How Useful is Balancing Data in Multimodal Learning? | Mar 7, 2024 | Image to text, Image-to-Text Retrieval | Unverified | 0
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Feb 29, 2024 | Retrieval, Text Retrieval | Code Available | 4
Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control | Feb 27, 2024 | GPU, Image Retrieval | Code Available | 1
Unifying Latent and Lexicon Representations for Effective Video-Text Retrieval | Feb 26, 2024 | Retrieval, Text Retrieval | Unverified | 0
Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings | Feb 26, 2024 | Contrastive Learning, Multi-Task Learning | Unverified | 0
MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning | Feb 21, 2024 | Retrieval, Text Generation | Unverified | 0
PIRB: A Comprehensive Benchmark of Polish Dense and Hybrid Text Retrieval Methods | Feb 20, 2024 | Information Retrieval, Knowledge Distillation | Unverified | 0
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration | Feb 18, 2024 | Multi-hop Question Answering, Question Answering | Code Available | 1
Distillation Enhanced Generative Retrieval | Feb 16, 2024 | Retrieval, Text Retrieval | Code Available | 2
Multimodal Learned Sparse Retrieval for Image Suggestion | Feb 12, 2024 | Image Captioning, Retrieval | Unverified | 0
Video Editing for Video Retrieval | Feb 4, 2024 | Retrieval, Text Retrieval | Unverified | 0
M2-RAAP: A Multi-Modal Recipe for Advancing Adaptation-based Pre-training towards Effective and Efficient Zero-shot Video-text Retrieval | Jan 31, 2024 | Retrieval, Text Retrieval | Code Available | 2
Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning | Jan 30, 2024 | Diversity, Image-text Retrieval | Code Available | 0
Towards 3D Molecule-Text Interpretation in Language Models | Jan 25, 2024 | Instruction Following, Language Modeling | Code Available | 2
Enhancing Image-Text Matching with Adaptive Feature Aggregation | Jan 18, 2024 | Image-text matching, Image-text Retrieval | Code Available | 0
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment | Jan 4, 2024 | Image Captioning, image-classification | Unverified | 0
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving | Jan 2, 2024 | Autonomous Driving, Caption Generation | Unverified | 0
Accept the Modality Gap: An Exploration in the Hyperbolic Space | Jan 1, 2024 | Image to text, Image-to-Text Retrieval | Unverified | 0
OTE: Exploring Accurate Scene Text Recognition Using One Token | Jan 1, 2024 | Decoder, Scene Text Recognition | Code Available | 0
Building Vision-Language Models on Solid Foundations with Masked Distillation | Jan 1, 2024 | Contrastive Learning, Knowledge Distillation | Unverified | 0
Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization | Dec 30, 2023 | Answer Generation, Contrastive Learning | Code Available | 1
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | Dec 21, 2023 | Image Retrieval, Image-to-Text Retrieval | Code Available | 1
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval | Dec 19, 2023 | Few-Shot Learning, Retrieval | Code Available | 1
Data-Efficient Multimodal Fusion on a Single GPU | Dec 15, 2023 | GPU, Image Retrieval | Code Available | 1
Filter & Align: Leveraging Human Knowledge to Curate Image-Text Data | Dec 11, 2023 | Image Captioning, Image-text Retrieval | Unverified | 0
RGNet: A Unified Clip Retrieval and Grounding Network for Long Videos | Dec 11, 2023 | Natural Language Moment Retrieval, Natural Language Queries | Code Available | 1
Leveraging Generative Language Models for Weakly Supervised Sentence Component Analysis in Video-Language Joint Learning | Dec 10, 2023 | Language Modeling, Language Modelling | Unverified | 0
Predictive Chemistry Augmented with Text Retrieval | Dec 8, 2023 | molecular representation, Retrieval | Code Available | 1