Scaling Transformer to 1M tokens and beyond with RMT Apr 19, 2023 Language Modeling Language Modelling
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset Apr 17, 2023 Audio captioning Audio-Video Question Answering (AVQA)
Unicom: Universal and Compact Representation Learning for Image Retrieval Apr 12, 2023 Image Classification Image Retrieval
ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model Apr 3, 2023 Denoising Diversity
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection Mar 24, 2023 Highlight Detection Moment Retrieval
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation Mar 22, 2023 Code Completion Language Modeling
Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval Mar 22, 2023 Image-text matching Language Modeling
CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition Mar 20, 2023 Retrieval Scene Understanding
PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents Mar 13, 2023 image-classification Image Classification
OpenICL: An Open-Source Framework for In-context Learning Mar 6, 2023 In-Context Learning Language Modeling
UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers Mar 1, 2023 Domain Adaptation Information Retrieval
BEVPlace: Learning LiDAR-based Place Recognition using Bird's Eye View Images Feb 28, 2023 Retrieval
Contour Context: Abstract Structural Distribution for 3D LiDAR Loop Detection and Metric Pose Estimation Feb 13, 2023 Loop Closure Detection Pose Estimation
In-Context Retrieval-Augmented Language Models Jan 31, 2023 Language Modeling Language Modelling
Grounding Language Models to Images for Multimodal Inputs and Outputs Jan 31, 2023 Image Retrieval In-Context Learning
PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development Jan 23, 2023 Question Answering Reading Comprehension
InPars-v2: Large Language Models as Efficient Dataset Generators for Information Retrieval Jan 4, 2023 Information Retrieval Retrieval
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? Dec 31, 2022 Data Augmentation Retrieval
Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing Dec 21, 2022 Contrastive Learning Drug Design
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions Dec 20, 2022 Hallucination Question Answering
Precise Zero-Shot Dense Retrieval without Relevance Labels Dec 20, 2022 Fact Verification Instruction Following
Semantic-Conditional Diffusion Networks for Image Captioning Dec 6, 2022 Cross-Modal Retrieval Decoder
Melody transcription via generative pre-training Dec 4, 2022 Chord Recognition Information Retrieval
Dense Text Retrieval based on Pretrained Language Models: A Survey Nov 27, 2022 Retrieval Survey
Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark Nov 24, 2022 2D Object Detection Image Retrieval
RetroMAE v2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models Nov 16, 2022 Dimensionality Reduction Information Retrieval
MMDialog: A Large-scale Multi-turn Dialogue Dataset Towards Multi-modal Open-domain Conversation Nov 10, 2022 Multimodal Intent Recognition Retrieval
Body Part-Based Representation Learning for Occluded Person Re-Identification Nov 7, 2022 Human Parsing Occluded Person Re-Identification
When Language Model Meets Private Library Oct 31, 2022 Code Generation Language Modeling
Retrieval Oriented Masking Pre-training Language Model for Dense Passage Retrieval Oct 27, 2022 Language Modeling Language Modelling
PoseScript: Linking 3D Human Poses and Natural Language Oct 21, 2022 Cross-Modal Retrieval Image Captioning
MuGER^2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering Oct 19, 2022 Navigate Question Answering
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text Oct 18, 2022 Contrastive Learning Image-text Retrieval
Making a MIRACL: Multilingual Information Retrieval Across a Continuum of Languages Oct 18, 2022 Information Retrieval Retrieval
Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning Oct 12, 2022 Contrastive Learning Form
Retrieval Augmented Visual Question Answering with Outside Knowledge Oct 7, 2022 Answer Generation Diagnostic
Content-Based Search for Deep Generative Models Oct 6, 2022 Contrastive Learning Image and Sketch based Model Retrieval
When and why vision-language models behave like bags-of-words, and what to do about it? Oct 4, 2022 Contrastive Learning Retrieval
Contrastive Audio-Visual Masked Autoencoder Oct 2, 2022 Audio Classification Audio Tagging
Diffusion Posterior Sampling for General Noisy Inverse Problems Sep 29, 2022 Deblurring Retrieval
Multilingual Search with Subword TF-IDF Sep 28, 2022 Information Retrieval Retrieval
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Representation Alignment Sep 14, 2022 Retrieval Text Retrieval
Flow-Guided Transformer for Video Inpainting Aug 14, 2022 Retrieval Video Inpainting
Simplified State Space Layers for Sequence Modeling Aug 9, 2022 Computational Efficiency ListOps
Atlas: Few-shot Learning with Retrieval Augmented Language Models Aug 5, 2022 Fact Checking Few-Shot Learning
Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification Jul 19, 2022 Retrieval Transfer Learning
Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022 Jul 4, 2022 Language Modeling Language Modelling
Comprehending and Ordering Semantics for Image Captioning Jun 14, 2022 Cross-Modal Retrieval Image Captioning
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs Jun 9, 2022 Image Captioning Image Classification
Revealing Single Frame Bias for Video-and-Language Learning Jun 7, 2022 Action Recognition Fine-grained Action Recognition