| Learning From Noisy Correspondence With Tri-Partition for Cross-Modal Matching | Sep 22, 2023 | Cross-modal retrieval with noisy correspondence, Memorization | Unverified | 0 |
| Dynamic Visual Semantic Sub-Embeddings and Fast Re-Ranking | Sep 15, 2023 | Image-text matching, Re-Ranking | Unverified | 0 |
| Improving Multimodal Classification of Social Media Posts by Leveraging Image-Text Auxiliary Tasks | Sep 14, 2023 | Image-text matching, Sarcasm Detection | Code Available | 0 |
| Towards Better Multi-modal Keyphrase Generation via Visual Entity Enhancement and Multi-granularity Image Noise Filtering | Sep 9, 2023 | Image Captioning, Image-text matching | Code Available | 0 |
| GLS-CSC: A Simple but Effective Strategy to Mitigate Chinese STM Models' Over-Reliance on Superficial Clue | Sep 8, 2023 | Semantic Similarity, Semantic Textual Similarity | Unverified | 0 |
| Prompt-based Effective Input Reformulation for Legal Case Retrieval | Sep 6, 2023 | Retrieval, Text Matching | Code Available | 0 |
| 3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation | Aug 31, 2023 | Navigate, Referring Expression | Code Available | 1 |
| ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation | Aug 31, 2023 | Image-text matching, Language Modeling | Unverified | 0 |
| Text Matching Improves Sequential Recommendation by Reducing Popularity Biases | Aug 27, 2023 | Recommendation Systems, Sequential Recommendation | Code Available | 1 |
| Uniformly Distributed Category Prototype-Guided Vision-Language Framework for Long-Tail Recognition | Aug 24, 2023 | Attribute, Image-text matching | Unverified | 0 |
| EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | Aug 23, 2023 | Image-text matching, Image-text Retrieval | Unverified | 0 |
| Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models | Aug 18, 2023 | Image-text matching, Object Localization | Unverified | 0 |
| KETM:A Knowledge-Enhanced Text Matching method | Aug 11, 2023 | Common Sense Reasoning, Question Answering | Code Available | 1 |
| Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models | Aug 11, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| InfeRE: Step-by-Step Regex Generation via Chain of Inference | Aug 8, 2023 | Text Matching | Code Available | 0 |
| Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination | Aug 8, 2023 | Image-text matching, Representation Learning | Code Available | 1 |
| 3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment | Aug 8, 2023 | 3D Question Answering (3D-QA), Dense Captioning | Code Available | 2 |
| Grounded Image Text Matching with Mismatched Relation Reasoning | Aug 2, 2023 | Image-text matching, Relation | Unverified | 0 |
| A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models | Jul 24, 2023 | Image Generation, Image-text matching | Code Available | 2 |
| Advancing Visual Grounding with Scene Knowledge: Benchmark and Method | Jul 21, 2023 | Image-text matching, Text Matching | Code Available | 1 |
| UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding | Jul 3, 2023 | Image-text matching, Sentence | Code Available | 1 |
| Improving Text Matching in E-Commerce Search with A Rationalizable, Intervenable and Fast Entity-Based Relevance Model | Jul 1, 2023 | Text Matching | Unverified | 0 |
| Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark | Jun 5, 2023 | Attribute, Image-text matching | Code Available | 1 |
| Revisiting the Role of Language Priors in Vision-Language Models | Jun 2, 2023 | Image-text matching, Image-text Retrieval | Code Available | 1 |
| Improved Probabilistic Image-Text Representations | May 29, 2023 | Data Augmentation, Image-text matching | Code Available | 1 |
| Are Diffusion Models Vision-And-Language Reasoners? | May 25, 2023 | Denoising, Image Generation | Code Available | 1 |
| UniTRec: A Unified Text-to-Text Transformer and Joint Contrastive Learning Framework for Text-based Recommendation | May 25, 2023 | Contrastive Learning, Text Matching | Code Available | 1 |
| PESCO: Prompt-enhanced Self Contrastive Learning for Zero-shot Text Classification | May 24, 2023 | Classification, Contrastive Learning | Unverified | 0 |
| Fusion-in-T5: Unifying Document Ranking Signals for Improved Information Retrieval | May 24, 2023 | Document Ranking, Information Retrieval | Code Available | 0 |
| LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation | May 18, 2023 | Attribute, Image Generation | Code Available | 1 |
| MALM: Mask Augmentation based Local Matching for Food-Recipe Retrieval | May 18, 2023 | Image-text matching, Retrieval | Code Available | 0 |
| Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners | May 18, 2023 | Image Generation, Image-text matching | Code Available | 1 |
| Probing the Role of Positional Information in Vision-Language Models | May 17, 2023 | Contrastive Learning, Image-text matching | Unverified | 0 |
| Scene Text Recognition with Image-Text Matching-guided Dictionary | May 8, 2023 | Image-text matching, Language Modeling | Unverified | 0 |
| Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations | May 6, 2023 | Image-text matching, Text Matching | Code Available | 1 |
| Vision Meets Definitions: Unsupervised Visual Word Sense Disambiguation Incorporating Gloss Information | May 2, 2023 | Bayesian Inference, Image-text matching | Code Available | 0 |
| RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models | Apr 21, 2023 | Cross-Modal Retrieval, Image-text matching | Code Available | 0 |
| Integrity and Junkiness Failure Handling for Embedding-based Retrieval: A Case Study in Social Network Search | Apr 18, 2023 | Retrieval, Text Matching | Unverified | 0 |
| Verbs in Action: Improving verb understanding in video-language models | Apr 13, 2023 | Contrastive Learning, Question Answering | Code Available | 0 |
| The Short Text Matching Model Enhanced with Knowledge via Contrastive Learning | Apr 8, 2023 | Contrastive Learning, Sentence | Unverified | 0 |
| Multi-Modal Representation Learning with Text-Driven Soft Masks | Apr 3, 2023 | Contrastive Learning, Data Augmentation | Unverified | 0 |
| Probabilistic Prompt Learning for Dense Prediction | Apr 3, 2023 | Attribute, Prediction | Unverified | 0 |
| A Measurement-Based Quantum-Like Language Model for Text Matching | Apr 1, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Multimodal Image-Text Matching Improves Retrieval-based Chest X-Ray Report Generation | Mar 29, 2023 | Image Captioning, Image-text matching | Code Available | 1 |
| Integrating Language Guidance Into Image-Text Matching for Correcting False Negatives | Mar 24, 2023 | Cross-modal retrieval with noisy correspondence, Image-text matching | Code Available | 0 |
| Plug-and-Play Regulators for Image-Text Matching | Mar 23, 2023 | Cross-Modal Retrieval, Image Retrieval | Code Available | 1 |
| Increasing Textual Context Size Boosts Medical Image-Text Matching | Mar 23, 2023 | Image-text matching, Text Matching | Code Available | 0 |
| BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency | Mar 22, 2023 | Cross-modal retrieval with noisy correspondence, Image-text matching | Code Available | 1 |
| Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval | Mar 22, 2023 | Image-text matching, Language Modeling | Code Available | 2 |
| Refined Vision-Language Modeling for Fine-grained Multi-modal Pre-training | Mar 9, 2023 | Image-text matching, Language Modeling | Unverified | 0 |