Visual Prompting

Visual Prompting applies the idea of prompting, which proved effective for text in NLP, to computer vision. A user supplies a few visual prompts to turn an unlabeled dataset into a deployed model quickly, which reduces development time for both individual projects and enterprise solutions.

Papers

Showing 76–100 of 127 papers

Title	Date	Tasks	Status
Cycle-Consistency Uncertainty Estimation for Visual Prompting based One-Shot Defect Segmentation	Sep 21, 2024	Defect Detection, Visual Prompting	Unverified
DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement	Jul 11, 2024	Object Rearrangement, Visual Prompting	Unverified
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models	May 29, 2025	Visual Prompting	Unverified
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery	Apr 17, 2025	Large Language Model, Multi-Task Learning	Unverified
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting	Sep 19, 2024	Decoder, Object	Unverified
Explore until Confident: Efficient Exploration for Embodied Question Answering	Mar 23, 2024	Conformal Prediction, Efficient Exploration	Unverified
Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following	Jun 6, 2024	In-Context Learning, Visual Prompting	Unverified
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms	Oct 24, 2024	Diversity, Language Modeling	Unverified
From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework	Feb 12, 2025	Code Generation, RAG	Unverified
FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training	Oct 10, 2022	Few-Shot Object Detection, object-detection	Unverified
FVP: Fourier Visual Prompting for Source-Free Unsupervised Domain Adaptation of Medical Image Segmentation	Apr 26, 2023	Domain Adaptation, Image Segmentation	Unverified
Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering	May 30, 2025	Language Modeling, Language Modelling	Unverified
GSON: A Group-based Social Navigation Framework with Large Multimodal Model	Sep 26, 2024	Autonomous Vehicles, Motion Planning	Unverified
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation	Aug 2, 2023	Image Manipulation, Pose Transfer	Unverified
Is Temporal Prompting All We Need For Limited Labeled Action Recognition?	Apr 2, 2025	Action Recognition, All	Unverified
KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation	Mar 13, 2025	Object, Visual Prompting	Unverified
LaViP:Language-Grounded Visual Prompts	Dec 18, 2023	Few-Shot Learning, Transfer Learning	Unverified
Learning Expressive Prompting With Residuals for Vision Transformers	Mar 27, 2023	Few-Shot Learning, image-classification	Unverified
Learning Visual Prompts for Guiding the Attention of Vision Transformers	Jun 5, 2024	Visual Prompting	Unverified
MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention	Jan 7, 2025	Classification, Fine-Grained Image Classification	Unverified
Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation	Apr 1, 2024	Image Segmentation, Medical Image Segmentation	Unverified
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models	Nov 27, 2024	Person Search, Visual Prompting	Unverified
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting	Mar 5, 2024	In-Context Learning, Object Rearrangement	Unverified
MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks	May 13, 2024	image-classification, Image Classification	Unverified
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation	Apr 20, 2025	3D Instance Segmentation, 3D Open-Vocabulary Instance Segmentation	Unverified

