SOTAVerified

Referring Expression

The referring expression task places a bounding box around the instance in an image that corresponds to the provided description.
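As a rough illustration of the task's output, the sketch below checks a predicted box against an annotated one. It assumes axis-aligned boxes in (x1, y1, x2, y2) pixel coordinates and uses the common convention that a prediction counts as correct when its intersection-over-union (IoU) with the ground truth is at least 0.5. This page does not state that criterion, and the names `Box`, `iou`, and `is_correct` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (assumed layout: x1, y1, x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes; 0.0 when they do not overlap."""
    inter = Box(
        max(a.x1, b.x1), max(a.y1, b.y1),
        min(a.x2, b.x2), min(a.y2, b.y2),
    ).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def is_correct(pred: Box, gt: Box, threshold: float = 0.5) -> bool:
    """Assumed criterion: the prediction is a hit when IoU >= threshold."""
    return iou(pred, gt) >= threshold


# Hypothetical prediction and annotation for a single description.
predicted = Box(10, 10, 110, 110)
ground_truth = Box(20, 20, 120, 120)
print(is_correct(predicted, ground_truth))
```

Averaging `is_correct` over a dataset would give an accuracy figure under this assumed criterion; benchmarks that use other criteria (for example distance thresholds in 3D scenes) need a different check.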

Papers

Showing 101–150 of 364 papers

Title | Status | Hype
IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis | Code | 1
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs | Code | 1
NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning | Code | 1
RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes | Code | 1
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations | Code | 1
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Code | 1
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension | | 0
Decoupling Pragmatics: Discriminative Decoding for Referring Expression Generation | | 0
Bi-Directional Relationship Inferring Network for Referring Image Segmentation | | 0
Beyond Object Categories: Multi-Attribute Reference Understanding for Visual Grounding | | 0
A Commercial Perspective on Reference | | 0
Decoding Strategies for Neural Referring Expression Generation | | 0
Mask-aware Text-to-Image Retrieval: Referring Expression Segmentation Meets Cross-modal Retrieval | | 0
Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models | | 0
Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models | | 0
A Unified Mutual Supervision Framework for Referring Expression Segmentation and Generation | | 0
Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation | | 0
Creating Training Corpora for NLG Micro-Planners | | 0
Make Graph-based Referring Expression Comprehension Great Again through Expression-guided Dynamic Gating and Regression | | 0
MaskInversion: Localized Embeddings via Optimization of Explainability Maps | | 0
GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane | | 0
Goal-driven text descriptions for images | | 0
G-TUNA: a corpus of referring expressions in German, including duration information | | 0
Harlequin: Color-driven Generation of Synthetic Data for Referring Expression Comprehension | | 0
CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding | | 0
Augmenting Robot Knowledge Consultants with Distributed Short Term Memory | | 0
A case study on context-bound referring expression generation | | 0
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge | | 0
Getting to "Hearer-old": Charting Referring Expressions Across Time | | 0
Corpus-based Referring Expressions Generation | | 0
Geração de Expressões de Referência usando Relações Espaciais (Referring Expression Generation Using Spatial Relations) [in Portuguese] | | 0
GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing | | 0
Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension | | 0
Assessing Neural Referential Form Selectors on a Realistic Multilingual Dataset | | 0
M²IST: Multi-Modal Interactive Side-Tuning for Efficient Referring Expression Comprehension | | 0
Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training | | 0
ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments | | 0
Generating Texts with Integer Linear Programming | | 0
Generating Quantified Referring Expressions through Attention-Driven Incremental Perception | | 0
3DResT: A Strong Baseline for Semi-Supervised 3D Referring Expression Segmentation | | 0
Generalizable Entity Grounding via Assistance of Large Language Model | | 0
Fuzzy Logic for Vagueness Management in Referring Expression Generation | | 0
Constructing Distributions of Variation in Referring Expression Type from Corpora for Model Evaluation | | 0
Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning | | 0
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes | | 0
CoNAN: A Complementary Neighboring-based Attention Network for Referring Expression Generation | | 0
Look Hear: Gaze Prediction for Speech-directed Human Attention | | 0
MAGNet: Multi-Region Attention-Assisted Grounding of Natural Language Queries at Phrase Level | | 0
FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis | | 0
Computational Interpretations of Recency for the Choice of Referring Expressions in Discourse | | 0
Page 3 of 8

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Random | Acc@0.5m | 14.6 | | Unverified