SOTAVerified

Visual Prompting

Visual Prompting is the task of adapting computer vision models to new problems through prompts, inspired by the success of text prompting in NLP. A few visual prompts are used to turn an unlabeled dataset into a deployed model, reducing development time for both individual projects and enterprise solutions.

Papers

Showing 26-50 of 127 papers

| Title | Status | Hype |
| --- | --- | --- |
| By My Eyes: Grounding Multimodal Large Language Models with Sensor Data via Visual Prompting | Code | 1 |
| Dynamic Domains, Dynamic Solutions: DPCore for Continual Test-Time Adaptation | Code | 1 |
| OT-VP: Optimal Transport-guided Visual Prompting for Test-Time Adaptation | Code | 1 |
| Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach | Code | 1 |
| Exploring the Transferability of Visual Prompting for Multimodal Large Language Models | Code | 1 |
| Finding Visual Task Vectors | Code | 1 |
| Scaffolding Coordinates to Promote Vision-Language Coordination in Large Multi-Modal Models | Code | 1 |
| Tune-An-Ellipse: CLIP Has Potential to Find What You Want | Code | 1 |
| EZ-CLIP: Efficient Zeroshot Video Action Recognition | Code | 1 |
| ViscoNet: Bridging and Harmonizing Visual and Textual Conditioning for ControlNet | Code | 1 |
| Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective | Code | 1 |
| GeoSAM: Fine-tuning SAM with Multi-Modal Prompts for Mobility Infrastructure Segmentation | Code | 1 |
| AutoVP: An Automated Visual Prompting Framework and Benchmark | Code | 1 |
| Visual Instruction Inversion: Image Editing via Visual Prompting | Code | 1 |
| Fine-Grained Visual Prompting | Code | 1 |
| UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer | Code | 1 |
| BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning | Code | 1 |
| Diversity-Aware Meta Visual Prompting | Code | 1 |
| Text-Visual Prompting for Efficient 2D Temporal Video Grounding | Code | 1 |
| Understanding and Improving Visual Prompting: A Label-Mapping Perspective | Code | 1 |
| Visual Prompting for Adversarial Robustness | Code | 1 |
| Stepwise Decomposition and Dual-stream Focus: A Novel Approach for Training-free Camouflaged Object Segmentation | Code | 0 |
| RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought | | 0 |
| Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering | | 0 |
| DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models | | 0 |

No leaderboard results yet.