SOTAVerified

Referring Expression

Given an image and a description, referring expression comprehension places a bounding box around the instance that the description refers to.

Papers

Showing 51-75 of 364 papers

Title | Status | Hype
Human-centric Spatio-Temporal Video Grounding With Visual Transformers | Code | 1
CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation | Code | 1
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation | Code | 1
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations | Code | 1
A Unified Framework for 3D Point Cloud Visual Grounding | Code | 1
Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation | Code | 1
Relationship-Embedded Representation Learning for Grounding Referring Expressions | Code | 1
Airbert: In-domain Pretraining for Vision-and-Language Navigation | Code | 1
RefEgo: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D | Code | 1
Refer360°: A Referring Expression Recognition Dataset in 360° Images | Code | 1
Kosmos-2: Grounding Multimodal Large Language Models to the World | Code | 1
Multi-branch Collaborative Learning Network for 3D Visual Grounding | Code | 1
3D-GRES: Generalized 3D Referring Expression Segmentation | Code | 1
Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception | Code | 1
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | Code | 1
Discriminative Triad Matching and Reconstruction for Weakly Referring Expression Grounding | Code | 1
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension | Code | 1
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition | Code | 1
Exploring Contextual Attribute Density in Referring Expression Counting | Code | 1
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension | Code | 1
March in Chat: Interactive Prompting for Remote Embodied Referring Expression | Code | 1
FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension | Code | 1
Modeling Context in Referring Expressions | Code | 1
Exploring Fine-Grained Image-Text Alignment for Referring Remote Sensing Image Segmentation | Code | 1
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Random | Acc@0.5m | 14.6 | - | Unverified