SOTAVerified

Referring Expression Comprehension

Papers

Showing 51-75 of 167 papers

Title | Status | Hype
RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes | Code | 1
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | Code | 1
RefEgo: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D | Code | 1
NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations | Code | 1
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension | Code | 1
An Open and Comprehensive Pipeline for Unified Object Grounding and Detection | Code | 1
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations | Code | 1
InstructDET: Diversifying Referring Object Detection with Generalized Instructions | Code | 1
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension | Code | 1
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | Code | 1
DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding | Code | 1
Multi-branch Collaborative Learning Network for 3D Visual Grounding | Code | 1
Referring Transformer: A One-step Approach to Multi-task Visual Grounding | Code | 1
Talk2Car: Taking Control of Your Self-Driving Car | Code | 1
Large-Scale Adversarial Training for Vision-and-Language Representation Learning | Code | 1
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration | Code | 1
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models | Code | 1
Learning to Evaluate Performance of Multi-modal Semantic Localization | Code | 1
UNITER: UNiversal Image-TExt Representation Learning | Code | 1
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Code | 1
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension | | 0
ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments | | 0
Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos | | 0
Modular Graph Attention Network for Complex Visual Relational Reasoning | | 0
MUTATT: Visual-Textual Mutual Guidance for Referring Expression Comprehension | | 0
Page 3 of 7

No leaderboard results yet.