Visual Prompting

Visual Prompting applies the prompting paradigm from NLP to computer vision. Instead of training on a fully labeled dataset, a user supplies a few visual prompts to turn an unlabeled dataset into a deployed model, which shortens development time for both individual projects and enterprise solutions.

Papers

Showing 51–75 of 127 papers

Title	Date	Tasks	Status
Towards Open-World Grasping with Large Vision-Language Models	Jun 26, 2024	Robotic Grasping, Visual Grounding	Unverified
Towards Robust and Accurate Visual Prompting	Nov 18, 2023	Adversarial Robustness, Transfer Learning	Unverified
T-Rex: Counting by Visual Prompting	Nov 22, 2023	Object, Object Counting	Unverified
Tumor segmentation on whole slide images: training or prompting?	Feb 21, 2024	Computational Efficiency, Segmentation	Unverified
Zoomer: Adaptive Image Focus Optimization for Black-box MLLM	Apr 30, 2025	Image Captioning, Object Recognition	Unverified
3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V	Dec 15, 2023	3D Object Detection, object-detection	Unverified
3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o	Mar 17, 2025	Logical Reasoning, Prompt Engineering	Unverified
A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis	May 29, 2025	Diagnostic, Visual Prompting	Unverified
Affordance-Guided Reinforcement Learning via Visual Prompting	Jul 14, 2024	reinforcement-learning, Reinforcement Learning	Unverified
Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model	May 16, 2024	Image Inpainting, In-Context Learning	Unverified
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling	Feb 4, 2025	Object, Visual Prompting	Unverified
Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models	Apr 30, 2025	Hallucination, Object	Unverified
BLINK: Multimodal Large Language Models Can See but Not Perceive	Apr 18, 2024	Depth Estimation, Multiple-choice	Unverified
Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM	Jul 31, 2024	In-Context Learning, Layout Design	Unverified
Cycle-Consistency Uncertainty Estimation for Visual Prompting based One-Shot Defect Segmentation	Sep 21, 2024	Defect Detection, Visual Prompting	Unverified
DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement	Jul 11, 2024	Object Rearrangement, Visual Prompting	Unverified
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models	May 29, 2025	Visual Prompting	Unverified
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery	Apr 17, 2025	Large Language Model, Multi-Task Learning	Unverified
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting	Sep 19, 2024	Decoder, Object	Unverified
Explore until Confident: Efficient Exploration for Embodied Question Answering	Mar 23, 2024	Conformal Prediction, Efficient Exploration	Unverified
Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following	Jun 6, 2024	In-Context Learning, Visual Prompting	Unverified
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms	Oct 24, 2024	Diversity, Language Modeling	Unverified
From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework	Feb 12, 2025	Code Generation, RAG	Unverified
FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training	Oct 10, 2022	Few-Shot Object Detection, object-detection	Unverified
FVP: Fourier Visual Prompting for Source-Free Unsupervised Domain Adaptation of Medical Image Segmentation	Apr 26, 2023	Domain Adaptation, Image Segmentation	Unverified


No leaderboard results yet.