SOTAVerified

Visual Prompting

Visual Prompting carries the idea of prompting, which drove recent advances in NLP, over to computer vision. A few visual prompts are used to turn an unlabeled dataset into a deployed model, which reduces development time for both individual projects and enterprise solutions.

Papers

Showing 26-50 of 127 papers

| Title | Status | Hype |
| --- | --- | --- |
| EarthMarker: A Visual Prompting Multi-modal Large Language Model for Remote Sensing | Code | 1 |
| Selective Visual Prompting in Vision Mamba | Code | 1 |
| Scaffolding Coordinates to Promote Vision-Language Coordination in Large Multi-Modal Models | Code | 1 |
| Open-Vocabulary Action Localization with Iterative Visual Prompting | Code | 1 |
| Tune-An-Ellipse: CLIP Has Potential to Find What You Want | Code | 1 |
| Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach | Code | 1 |
| Finding Visual Task Vectors | Code | 1 |
| Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning | Code | 1 |
| LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation | Code | 1 |
| Visual Instruction Inversion: Image Editing via Visual Prompting | Code | 1 |
| EZ-CLIP: Efficient Zeroshot Video Action Recognition | Code | 1 |
| By My Eyes: Grounding Multimodal Large Language Models with Sensor Data via Visual Prompting | Code | 1 |
| Fine-Grained Visual Prompting | Code | 1 |
| ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet | Code | 1 |
| Exploring the Transferability of Visual Prompting for Multimodal Large Language Models | Code | 1 |
| OT-VP: Optimal Transport-guided Visual Prompting for Test-Time Adaptation | Code | 1 |
| Improving Visual Object Tracking through Visual Prompting | Code | 1 |
| GeoSAM: Fine-tuning SAM with Multi-Modal Prompts for Mobility Infrastructure Segmentation | Code | 1 |
| UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer | Code | 1 |
| Vision Graph Prompting via Semantic Low-Rank Decomposition | Code | 1 |
| Visual Prompting for Adversarial Robustness | Code | 1 |
| Towards Universal Text-driven CT Image Segmentation | Code | 0 |
| Towards Online Multi-Modal Social Interaction Understanding | Code | 0 |
| UICrit: Enhancing Automated Design Evaluation with a UICritique Dataset | Code | 0 |
| Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting | Code | 0 |

No leaderboard results yet.