CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models | Sep 24, 2021 | Visual Grounding | Code Available | 1
Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation | Sep 17, 2021 | Dialogue Generation Visual Grounding | Code Available | 1
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering | Sep 13, 2021 | Data Augmentation Question Answering | Code Available | 0
Panoptic Narrative Grounding | Sep 10, 2021 | Natural Language Visual Grounding Panoptic Segmentation | Code Available | 1
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models | Sep 8, 2021 | Concept-To-Text Generation Specificity | Unverified | 0
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter | Aug 25, 2021 | Blocking Object | Unverified | 0
A Better Loss for Visual-Textual Grounding | Aug 11, 2021 | Sentence Visual Grounding | Code Available | 0
TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding | Aug 5, 2021 | 3D visual grounding Relation | Unverified | 0
Attending Self-Attention: A Case Study of Visually Grounded Supervision in Vision-and-Language Transformers | Aug 1, 2021 | Language Modeling Language Modelling | Unverified | 0
Word2Pix: Word to Pixel Cross Attention Transformer in Visual Grounding | Jul 31, 2021 | Decoder Sentence | Unverified | 0
LanguageRefer: Spatial-Language Model for 3D Visual Grounding | Jul 7, 2021 | 3D visual grounding Language Modeling | Unverified | 0
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer | Jul 6, 2021 | Image Retrieval Knowledge Distillation | Code Available | 1
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs | Jun 28, 2021 | Question Answering Task 2 | Unverified | 0
AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training | Jun 19, 2021 | Visual Grounding | Unverified | 0
Semantic sentence similarity: size does not always matter | Jun 16, 2021 | Grounded language learning Image Retrieval | Unverified | 0
Attention-Based Keyword Localisation in Speech using Visual Grounding | Jun 16, 2021 | Visual Grounding | Unverified | 0
Referring Transformer: A One-step Approach to Multi-task Visual Grounding | Jun 6, 2021 | Decoder Referring Expression | Code Available | 1
Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation | May 24, 2021 | Referring Expression Referring Expression Comprehension | Code Available | 0
SAT: 2D Semantics Assisted Training for 3D Visual Grounding | May 24, 2021 | 3D visual grounding Object | Code Available | 1
Connecting What to Say With Where to Look by Modeling Human Attention Traces | May 12, 2021 | Caption Generation Image Captioning | Code Available | 1
MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | Apr 26, 2021 | Generalized Referring Expression Comprehension Phrase Grounding | Code Available | 1
TransVG: End-to-End Visual Grounding with Transformers | Apr 17, 2021 | Referring Expression Comprehension Visual Grounding | Code Available | 1
Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding | Apr 9, 2021 | Descriptive Object | Code Available | 1
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation | Apr 5, 2021 | Object Visual Grounding | Code Available | 1
Visual Grounding Strategies for Text-Only Natural Language Processing | Mar 25, 2021 | Image Retrieval Language Modeling | Unverified | 0
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding | Mar 24, 2021 | Object Relation | Code Available | 1
Scene-Intuitive Agent for Remote Embodied Visual Grounding | Mar 24, 2021 | cross-modal alignment Navigate | Unverified | 0
Decoupled Spatial Temporal Graphs for Generic Visual Grounding | Mar 18, 2021 | Contrastive Learning Visual Grounding | Unverified | 0
Few-Shot Visual Grounding for Natural Human-Robot Interaction | Mar 17, 2021 | Visual Grounding | Unverified | 0
Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images | Mar 14, 2021 | 3D visual grounding Object | Code Available | 1
OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding | Mar 13, 2021 | Referring Expression Referring Expression Segmentation | Code Available | 1
InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring | Mar 1, 2021 | 3D visual grounding Attribute | Code Available | 1
Composing Pick-and-Place Tasks By Grounding Language | Feb 16, 2021 | Natural Language Visual Grounding Robotic Grasping | Code Available | 0
Answer Questions with Right Image Regions: A Visual Attention Regularization Approach | Feb 3, 2021 | Question Answering Visual Grounding | Code Available | 0
Transformers in Vision: A Survey | Jan 4, 2021 | Action Recognition Activity Recognition | Unverified | 0
3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds | Jan 1, 2021 | Object Object Proposal Generation | Unverified | 0
Explainable Video Entailment With Grounded Visual Evidence | Jan 1, 2021 | Visual Grounding | Unverified | 0
Panoptic Narrative Grounding | Jan 1, 2021 | Natural Language Visual Grounding Panoptic Segmentation | Code Available | 1
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units | Dec 31, 2020 | Image Captioning Speech Synthesis | Code Available | 1
CASTing Your Model: Learning to Localize Improves Self-Supervised Representations | Dec 8, 2020 | Self-Supervised Learning Visual Grounding | Unverified | 0
Class-agnostic Object Detection | Nov 28, 2020 | Benchmarking Class-agnostic Object Detection | Unverified | 0
Text-to-Image Generation Grounded by Fine-Grained User Attention | Nov 7, 2020 | Image Generation Position | Code Available | 1
Learning to ground medical text in a 3D human atlas | Nov 1, 2020 | Phrase Grounding Visual Grounding | Code Available | 0
SOrT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency | Oct 20, 2020 | Question Answering Visual Grounding | Code Available | 0
Neural Twins Talk | Sep 26, 2020 | Image Captioning Sentence | Code Available | 0
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers | Sep 23, 2020 | Image Captioning Image Generation | Code Available | 1
Commands 4 Autonomous Vehicles (C4AV) Workshop Summary | Sep 18, 2020 | Autonomous Vehicles Referring Expression Comprehension | Unverified | 0
Cosine meets Softmax: A tough-to-beat baseline for visual grounding | Sep 13, 2020 | Autonomous Driving Metric Learning | Code Available | 0
AttnGrounder: Talking to Cars with Attention | Sep 11, 2020 | Referring Expression Comprehension Visual Grounding | Code Available | 0
Improving One-stage Visual Grounding by Recursive Sub-query Construction | Aug 3, 2020 | Sentence Sentence Embedding | Code Available | 1