SOTAVerified

Referring Expression Comprehension

Papers

Showing 11-20 of 167 papers

Title | Status | Hype
Towards Visual Grounding: A Survey | Code | 3
General Object Foundation Model for Images and Videos at Scale | Code | 3
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Code | 3
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | Code | 3
Universal Instance Perception as Object Discovery and Retrieval | Code | 3
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion | Code | 2
GREC: Generalized Referring Expression Comprehension | Code | 2
TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models | Code | 2
Elysium: Exploring Object-level Perception in Videos via MLLM | Code | 2
MDETR - Modulated Detection for End-to-End Multi-Modal Understanding | Code | 2

No leaderboard results yet.