SOTAVerified

Referring Expression Comprehension

Papers

Showing 51–75 of 167 papers

Title | Status | Hype
Compositional Attention Networks for Machine Reasoning | Code | 1
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds | Code | 1
Referring Transformer: A One-step Approach to Multi-task Visual Grounding | Code | 1
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | Code | 1
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models | Code | 1
RefDrone: A Challenging Benchmark for Referring Expression Comprehension in Drone Scenes | Code | 1
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations | Code | 1
InstructDET: Diversifying Referring Object Detection with Generalized Instructions | Code | 1
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration | Code | 1
Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions | Code | 1
LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension | Code | 1
An Open and Comprehensive Pipeline for Unified Object Grounding and Detection | Code | 1
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension | Code | 1
NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations | Code | 1
Large-Scale Adversarial Training for Vision-and-Language Representation Learning | Code | 1
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints | Code | 1
RefEgo: Referring Expression Comprehension Dataset from First-Person Perception of Ego4D | Code | 1
Learning to Evaluate Performance of Multi-modal Semantic Localization | Code | 1
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Code | 1
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation | Code | 1
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension | | 0
ArraMon: A Joint Navigation-Assembly Instruction Interpretation Task in Dynamic Environments | | 0
Modular Graph Attention Network for Complex Visual Relational Reasoning | | 0
Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos | | 0
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension | | 0

No leaderboard results yet.