SOTAVerified

Referring Expression Comprehension

Papers

Showing 26-50 of 167 papers

Title | Status | Hype
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone | Code | 1
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Code | 1
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Code | 1
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension | Code | 1
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding | Code | 1
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | Code | 1
RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes | Code | 1
RefEgo: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D | Code | 1
SeqTR: A Simple yet Universal Network for Visual Grounding | Code | 1
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration | Code | 1
Large-Scale Adversarial Training for Vision-and-Language Representation Learning | Code | 1
NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations | Code | 1
FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension | Code | 1
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation | Code | 1
Kosmos-2: Grounding Multimodal Large Language Models to the World | Code | 1
A Fast and Accurate One-Stage Approach to Visual Grounding | Code | 1
Multi-branch Collaborative Learning Network for 3D Visual Grounding | Code | 1
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | Code | 1
A Unified Framework for 3D Point Cloud Visual Grounding | Code | 1
GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs | Code | 1
Correspondence Matters for Video Referring Expression Comprehension | Code | 1
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition | Code | 1
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension | Code | 1
Explainable Neural Computation via Stack Neural Module Networks | Code | 1
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations | Code | 1

No leaderboard results yet.