SOTAVerified

Referring Expression

The referring expression task places a bounding box around the instance in the provided image that matches the provided description.

Papers

Showing 301–325 of 364 papers

Title | Status | Hype
Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension | - | 0
Referring Expression Generation and Comprehension via Attributes | - | 0
Reasoning About Pragmatics with Neural Listeners and Speakers | Code | 0
Pento-DIARef: A Diagnostic Dataset for Learning the Incremental Algorithm for Referring Expression Generation from Examples | Code | 0
CLEVR-Ref+: Diagnosing Visual Reasoning with Referring Expressions | Code | 0
CK-Transformer: Commonsense Knowledge Enhanced Transformers for Referring Expression Comprehension | Code | 0
NeuralREG: An end-to-end approach to referring expression generation | Code | 0
Grounding Language in Multi-Perspective Referential Communication | Code | 0
Referring Expression Comprehension Using Language Adaptive Inference | Code | 0
Modulating Bottom-Up and Top-Down Visual Processing via Language-Conditional Filters | Code | 0
Modeling Context Between Objects for Referring Expression Understanding | Code | 0
Giving Commands to a Self-Driving Car: How to Deal with Uncertain Situations? | Code | 0
Enhancing Interpretability and Interactivity in Robot Manipulation: A Neurosymbolic Approach | Code | 0
Referring Expression Generation in Visually Grounded Dialogue with Discourse-aware Comprehension Guiding | Code | 0
Using Syntax to Ground Referring Expressions in Natural Images | Code | 0
Referring Expression Generation Using Entity Profiles | Code | 0
Generation and Comprehension of Unambiguous Object Descriptions | Code | 0
Yes, this Way! Learning to Ground Referring Expressions into Actions with Intra-episodic Feedback from Supportive Teachers | Code | 0
Referring Expression Object Segmentation with Caption-Aware Consistency | Code | 0
A Real-time Global Inference Network for One-stage Referring Expression Comprehension | Code | 0
Improving Quality and Efficiency in Plan-based Neural Data-to-Text Generation | Code | 0
Adversarial Robustness for Visual Grounding of Multimodal Large Language Models | Code | 0
WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation | Code | 0
MB-ORES: A Multi-Branch Object Reasoner for Visual Grounding in Remote Sensing | Code | 0
Exploring Modulated Detection Transformer as a Tool for Action Recognition in Videos | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Random | Acc@0.5m | 14.6 | - | Unverified