Visual Prompting

Visual Prompting applies the prompting paradigm, which proved successful for text in NLP, to computer vision. Instead of labeling a full dataset before training, a practitioner supplies a few visual prompts to turn an unlabeled dataset into a deployed model, which can reduce development time for both individual projects and enterprise solutions.

Papers

Title	Date	Tasks	Status	Hype
BLINK: Multimodal Large Language Models Can See but Not Perceive	Apr 18, 2024	Depth Estimation, Multiple-choice	Unverified	0
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach	Apr 17, 2024	Decoder, Generalized Few-Shot Semantic Segmentation	Code Available	1
Exploring the Transferability of Visual Prompting for Multimodal Large Language Models	Apr 17, 2024	Hallucination, Multimodal Reasoning	Code Available	1
Finding Visual Task Vectors	Apr 8, 2024	Visual Prompting	Code Available	1
Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation	Apr 1, 2024	Image Segmentation, Medical Image Segmentation	Unverified	0
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want	Mar 29, 2024	Instruction Following, Language Modelling	Code Available	2
Explore until Confident: Efficient Exploration for Embodied Question Answering	Mar 23, 2024	Conformal Prediction, Efficient Exploration	Unverified	0
On the low-shot transferability of [V]-Mamba	Mar 15, 2024	Few-Shot Learning, Mamba	Unverified	0
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting	Mar 5, 2024	In-Context Learning, Object Rearrangement	Unverified	0
Tumor segmentation on whole slide images: training or prompting?	Feb 21, 2024	Computational Efficiency, Segmentation	Unverified	0
Scaffolding Coordinates to Promote Vision-Language Coordination in Large Multi-Modal Models	Feb 19, 2024	Visual Prompting	Code Available	1
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs	Feb 12, 2024	Instruction Following, Logical Reasoning	Unverified	0
Tune-An-Ellipse: CLIP Has Potential to Find What You Want	Jan 1, 2024	Object, Referring Expression	Code Available	1
Generative Multimodal Models are In-Context Learners	Dec 20, 2023	In-Context Learning, Personalized Image Generation	Code Available	3
LaViP:Language-Grounded Visual Prompts	Dec 18, 2023	Few-Shot Learning, Transfer Learning	Unverified	0
3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V	Dec 15, 2023	3D Object Detection, object-detection	Unverified	0
Tokenize Anything via Prompting	Dec 14, 2023	Decoder, Visual Prompting	Code Available	2
EZ-CLIP: Efficient Zeroshot Video Action Recognition	Dec 13, 2023	Action Recognition, GPU	Code Available	1
ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet	Dec 5, 2023	Image Generation, Person Re-Identification	Code Available	1
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective	Dec 3, 2023	Image Classification, Visual Prompting	Code Available	1
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts	Dec 1, 2023	Visual Commonsense Reasoning, Visual Prompting	Code Available	0
T-Rex: Counting by Visual Prompting	Nov 22, 2023	Object, Object Counting	Unverified	0
Visual In-Context Prompting	Nov 22, 2023	Decoder, Segmentation	Code Available	4
GeoSAM: Fine-tuning SAM with Multi-Modal Prompts for Mobility Infrastructure Segmentation	Nov 19, 2023	Image Segmentation, Large Language Model	Code Available	1
Towards Robust and Accurate Visual Prompting	Nov 18, 2023	Adversarial Robustness, Transfer Learning	Unverified	0
