SOTAVerified

Visual Commonsense Reasoning

Papers

Showing 41–50 of 65 papers

| Title | Status | Hype |
| --- | --- | --- |
| Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks | | 0 |
| Attention Mechanism based Cognition-level Scene Understanding | | 0 |
| VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers | Code | 0 |
| Joint Answering and Explanation for Visual Commonsense Reasoning | Code | 0 |
| CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks | | 0 |
| MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound | | 0 |
| SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning | | 0 |
| Interpretable Visual Understanding with Cognitive Attention Network | Code | 0 |
| Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory | Code | 0 |
| Premise-based Multimodal Reasoning: Conditional Inference on Joint Textual and Visual Clues | | 0 |

No leaderboard results yet.