Cost-Effective Language Driven Image Editing with LX-DRIM | Oct 1, 2022 | Visual Grounding | Code Available
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding | Sep 28, 2022 | Decoder Visual Grounding | Unverified
Introspective Learning : A Two-Stage Approach for Inference in Neural Networks | Sep 17, 2022 | Active Learning Decision Making | Code Available
Visual Grounding of Inter-lingual Word-Embeddings | Sep 8, 2022 | Visual Grounding Word Embeddings | Unverified
VLMAE: Vision-Language Masked Autoencoder | Aug 19, 2022 | Image-text Retrieval Language Modeling | Unverified
SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding | Jul 27, 2022 | Visual Grounding | Code Available
Toward Explainable and Fine-Grained 3D Grounding through Referring Textual Phrases | Jul 5, 2022 | Object Representation Learning | Unverified
RoViST: Learning Robust Metrics for Visual Storytelling | Jul 1, 2022 | Sentence Text Generation | Code Available
How direct is the link between words and images? | Jun 30, 2022 | Visual Grounding Word Embeddings | Unverified
Tell Me the Evidence? Dual Visual-Linguistic Interaction for Answer Grounding | Jun 21, 2022 | Decoder Question Answering | Unverified
Bear the Query in Mind: Visual Grounding with Query-conditioned Convolution | Jun 18, 2022 | Visual Grounding | Unverified
Language with Vision: a Study on Grounded Word and Sentence Embeddings | Jun 17, 2022 | Sentence Sentence Embeddings | Code Available
Guiding Visual Question Answering with Attention Priors | May 25, 2022 | Question Answering Visual Grounding | Unverified
Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution | May 24, 2022 | Domain Adaptation Visual Grounding | Unverified
Weakly-supervised segmentation of referring expressions | May 10, 2022 | Image Segmentation Referring Expression | Unverified
RoViST: Learning Robust Metrics for Visual Storytelling | May 8, 2022 | Sentence Text Generation | Code Available
Flexible Visual Grounding | May 1, 2022 | Articles Visual Grounding | Code Available
Attention as Grounding: Exploring Textual and Cross-Modal Attention on Entities and Relations in Language-and-Vision Transformer | May 1, 2022 | Text Generation Visual Grounding | Code Available
To Find Waldo You Need Contextual Cues: Debiasing Who’s Waldo | May 1, 2022 | Benchmarking Person-centric Visual Grounding | Code Available
FindIt: Generalized Localization with Natural Language Queries | Mar 31, 2022 | Natural Language Queries Object | Unverified
To Find Waldo You Need Contextual Cues: Debiasing Who's Waldo | Mar 30, 2022 | Benchmarking Person-centric Visual Grounding | Code Available
Suspected Object Matters: Rethinking Model's Prediction for One-stage Visual Grounding | Mar 10, 2022 | Object Visual Grounding | Unverified
Seeing the advantage: visually grounding word embeddings to better capture human semantic knowledge | Feb 21, 2022 | Grounded language learning Image Retrieval | Unverified
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework | Feb 7, 2022 | Image Captioning image-classification | Code Available
3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds | Jan 1, 2022 | 3D dense captioning Attribute | Unverified
Deconfounded Visual Grounding | Dec 31, 2021 | Referring Expression Visual Grounding | Code Available
RoViST: Learning Robust Metrics for Visual Storytelling | Dec 17, 2021 | Sentence Text Generation | Unverified
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | Dec 2, 2021 | 3D dense captioning 3D visual grounding | Unverified
Less is More: Generating Grounded Navigation Instructions from Landmarks | Nov 25, 2021 | Decoder Instruction Following | Unverified
Zero-Shot Visual Grounding of Referring Utterances in Dialogue | Nov 16, 2021 | Descriptive Visual Grounding | Unverified
Attention as Grounding: Exploring Textual and Cross-Modal Attention on Entities and Relations in Language-and-Vision Transformer | Oct 16, 2021 | Text Generation Visual Grounding | Unverified
Efficient Multi-Modal Embeddings from Structured Data | Oct 6, 2021 | Semantic Similarity Semantic Textual Similarity | Unverified
Discovering the Unknown Knowns: Turning Implicit Knowledge in the Dataset into Explicit Training Examples for Visual Question Answering | Sep 13, 2021 | Data Augmentation Question Answering | Code Available
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models | Sep 8, 2021 | Concept-To-Text Generation Specificity | Unverified
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter | Aug 25, 2021 | Blocking Object | Unverified
A Better Loss for Visual-Textual Grounding | Aug 11, 2021 | Sentence Visual Grounding | Code Available
TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding | Aug 5, 2021 | 3D visual grounding Relation | Unverified
Attending Self-Attention: A Case Study of Visually Grounded Supervision in Vision-and-Language Transformers | Aug 1, 2021 | Language Modeling Language Modelling | Unverified
Word2Pix: Word to Pixel Cross Attention Transformer in Visual Grounding | Jul 31, 2021 | Decoder Sentence | Unverified
LanguageRefer: Spatial-Language Model for 3D Visual Grounding | Jul 7, 2021 | 3D visual grounding Language Modeling | Unverified
Adventurer's Treasure Hunt: A Transparent System for Visually Grounded Compositional Visual Question Answering based on Scene Graphs | Jun 28, 2021 | Question Answering Task 2 | Unverified
AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training | Jun 19, 2021 | Visual Grounding | Unverified
Attention-Based Keyword Localisation in Speech using Visual Grounding | Jun 16, 2021 | Visual Grounding | Unverified
Semantic sentence similarity: size does not always matter | Jun 16, 2021 | Grounded language learning Image Retrieval | Unverified
Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation | May 24, 2021 | Referring Expression Referring Expression Comprehension | Code Available
Visual Grounding Strategies for Text-Only Natural Language Processing | Mar 25, 2021 | Image Retrieval Language Modeling | Unverified
Scene-Intuitive Agent for Remote Embodied Visual Grounding | Mar 24, 2021 | cross-modal alignment Navigate | Unverified
Decoupled Spatial Temporal Graphs for Generic Visual Grounding | Mar 18, 2021 | Contrastive Learning Visual Grounding | Unverified
Few-Shot Visual Grounding for Natural Human-Robot Interaction | Mar 17, 2021 | Visual Grounding | Unverified
Composing Pick-and-Place Tasks By Grounding Language | Feb 16, 2021 | Natural Language Visual Grounding Robotic Grasping | Code Available