SOTAVerified

Referring Expression

The referring expression task places a bounding box around the instance in a given image that corresponds to a provided description.
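As a rough illustration of how a predicted box can be compared against an annotated one, the sketch below computes intersection-over-union (IoU) and counts a prediction as correct when the IoU reaches a threshold. This is not taken from this page: the names `Box`, `iou`, and `accuracy_at_iou` are hypothetical, and the 0.5 threshold is an assumed, commonly used convention. It does not implement the `Acc@0.5m` metric listed under Benchmark Results.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box: (x1, y1) is the top-left corner, (x2, y2) the bottom-right."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Clamp to zero so an inverted (empty) box has no area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, in [0, 1]."""
    inter = Box(
        max(a.x1, b.x1), max(a.y1, b.y1),
        min(a.x2, b.x2), min(a.y2, b.y2),
    ).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def accuracy_at_iou(
    predictions: Sequence[Box],
    targets: Sequence[Box],
    threshold: float = 0.5,  # assumed convention, not stated on this page
) -> float:
    """Fraction of predictions whose IoU with the paired target meets the threshold."""
    if not targets:
        return 0.0
    hits = sum(iou(p, t) >= threshold for p, t in zip(predictions, targets))
    return hits / len(targets)
```

For example, a prediction that exactly matches its target has an IoU of 1.0, while two equally sized boxes that share half their width have an IoU of 1/3 and would be counted as a miss at the assumed 0.5 threshold.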

Papers

Showing 76-100 of 364 papers

Title | Status | Hype
Graph-Structured Referring Expression Reasoning in The Wild | Code | 1
Image Segmentation Using Text and Image Prompts | Code | 1
CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation | Code | 1
PixFoundation: Are We Heading in the Right Direction with Pixel-level Vision Foundation Models? | Code | 1
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Code | 1
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition | Code | 1
Described Object Detection: Liberating Object Detection with Flexible Expressions | Code | 1
3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation | Code | 1
Exploring Contextual Attribute Density in Referring Expression Counting | Code | 1
Exploring Contextual Attribute Density in Referring Expression Counting | Code | 1
Exploring Fine-Grained Image-Text Alignment for Referring Remote Sensing Image Segmentation | Code | 1
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs | Code | 1
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs | Code | 1
Kosmos-2: Grounding Multimodal Large Language Models to the World | Code | 1
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation | Code | 1
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation | Code | 1
FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension | Code | 1
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Code | 1
Layout-aware Dreamer for Embodied Referring Expression Grounding | Code | 1
Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding | Code | 1
The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts | Code | 1
March in Chat: Interactive Prompting for Remote Embodied Referring Expression | Code | 1
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension | Code | 1
A Fast and Accurate One-Stage Approach to Visual Grounding | Code | 1
VL-BERT: Pre-training of Generic Visual-Linguistic Representations | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Random | Acc@0.5m | 14.6 | - | Unverified