visual instruction following

Papers

Showing 11–20 of 24 papers

Title	Date	Tasks	Status	Hype	Score
Text as Images: Can Multimodal Large Language Models Follow Printed Instructions in Pixels?	Nov 29, 2023	In-Context Learning, Instruction Following	Code Available	1	5
MpoxVLM: A Vision-Language Model for Diagnosing Skin Lesions from Mpox Virus Infection	Nov 16, 2024	Diagnostic, Instruction Following	Code Available	0	5
Instruction Clarification Requests in Multimodal Collaborative Dialogue Games: Tasks, and an Analysis of the CoDraw Dataset	Feb 28, 2023	Instruction Following, visual instruction following	Code Available	0	5
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions	Nov 21, 2023	Descriptive, MME	Code Available	0	5
Joint Embeddings for Graph Instruction Tuning	May 31, 2024	Instruction Following, visual instruction following	Unverified	0	0
Do we Really Need Visual Instructions? Towards Visual Instruction-Free Fine-tuning for Large Vision-Language Models	Feb 17, 2025	Instruction Following, visual instruction following	Unverified	0	0
FaceGPT: Self-supervised Learning to Chat about 3D Human Faces	Jun 11, 2024	3D Face Reconstruction, Face Model	Unverified	0	0
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning	May 16, 2024	Decision Making, Instruction Following	Unverified	0	0
Space-LLaVA: a Vision-Language Model Adapted to Extraterrestrial Applications	Aug 12, 2024	Instruction Following, Language Modeling	Unverified	0	0
LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition	Jul 9, 2024	Instruction Following, Representation Learning	Unverified	0	0

No leaderboard results yet.