SOTAVerified

Referring Expression

The referring expression task places a bounding box around the instance in an image that corresponds to the provided description.

Papers

Showing 251-300 of 364 papers

| Title | Status | Hype |
|---|---|---|
| Scene-Intuitive Agent for Remote Embodied Visual Grounding | | 0 |
| See-Through-Text Grouping for Referring Image Segmentation | | 0 |
| SegLLM: Multi-round Reasoning Segmentation | | 0 |
| Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | | 0 |
| Semi-automatic definite description annotation: a first report | | 0 |
| SemScribe: Natural Language Generation for Medical Reports | | 0 |
| Specificity measures and reference | | 0 |
| Squib: Effects of Cognitive Effort on the Resolution of Overspecified Descriptions | | 0 |
| Statistical NLG for Generating the Content and Form of Referring Expressions | | 0 |
| SUGAR: Pre-training 3D Visual Representations for Robotics | | 0 |
| Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input | | 0 |
| Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks | | 0 |
| Synthetic Visual Genome | | 0 |
| Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding | | 0 |
| Text Augmented Spatial-aware Zero-shot Referring Image Segmentation | | 0 |
| Text-driven Affordance Learning from Egocentric Vision | | 0 |
| The Methodius Corpus of Rhetorical Discourse Structures and Generated Texts | | 0 |
| The Pipeline Model for Resolution of Anaphoric Reference and Resolution of Entity Reference | | 0 |
| The Solution for the 5th GCAIAC Zero-shot Referring Expression Comprehension Challenge | | 0 |
| The WebNLG Challenge: Generating Text from RDF Data | | 0 |
| Toward Forgetting-Sensitive Referring Expression Generation for Integrated Robot Architectures | | 0 |
| Towards Situated Dialogue: Revisiting Referring Expression Generation | | 0 |
| Trainable Referring Expression Generation using Overspecification Preferences | | 0 |
| Transcrib3D: 3D Referring Expression Resolution through Large Language Models | | 0 |
| Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks | | 0 |
| UNITER: Learning UNiversal Image-TExt Representations | | 0 |
| Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching | | 0 |
| Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos | | 0 |
| Using Lexical Alignment and Referring Ability to Address Data Sparsity in Situated Dialog Reference Resolution | | 0 |
| Using Referring Expression Generation to Model Literary Style | | 0 |
| Utilizing Every Image Object for Semi-supervised Phrase Grounding | | 0 |
| Variational Context: Exploiting Visual and Textual Context for Grounding Referring Expressions | | 0 |
| Video Referring Expression Comprehension via Transformer with Content-aware Query | | 0 |
| Video Referring Expression Comprehension via Transformer with Content-conditioned Query | | 0 |
| Viewpoint-Aware Visual Grounding in 3D Scenes | | 0 |
| Visual Question Answering based on Local-Scene-Aware Referring Expression Generation | | 0 |
| VLN BERT: A Recurrent Vision-and-Language BERT for Navigation | | 0 |
| VL-NMS: Breaking Proposal Bottlenecks in Two-Stage Visual-Language Matching | | 0 |
| VQD: Visual Query Detection in Natural Scenes | | 0 |
| WaterVG: Waterway Visual Grounding based on Text-Guided Vision and mmWave Radar | | 0 |
| Weakly-supervised segmentation of referring expressions | | 0 |
| What can Neural Referential Form Selectors Learn? | | 0 |
| ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | | 0 |
| Recurrent Instance Segmentation using Sequences of Referring Expressions | | 0 |
| RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension | | 0 |
| RefCrowd: Grounding the Target in Crowd with Referring Expressions | | 0 |
| RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions | | 0 |
| Reference production in human-computer interaction: Issues for Corpus-based Referring Expression Generation | | 0 |
| Refer-iTTS: A System for Referring in Spoken Installments to Objects in Real-World Images | | 0 |
| Reasoning About Pragmatics with Neural Listeners and Speakers | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Random | Acc@0.5m | 14.6 | | Unverified |