SOTAVerified

Visual Prompting

Visual Prompting applies the prompting paradigm that succeeded in NLP to computer vision. A few visual prompts are used to turn an unlabeled dataset into a deployed model, which reduces development time for both individual projects and enterprise solutions.

Papers

Showing 51-100 of 127 papers

Title | Status | Hype
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models | | 0
VP Lab: a PEFT-Enabled Visual Prompting Laboratory for Semantic Segmentation | | 0
Zoomer: Adaptive Image Focus Optimization for Black-box MLLM | | 0
Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models | | 0
RadSAM: Segmenting 3D radiological images with a 2D promptable model | | 0
Visual and textual prompts for enhancing emotion recognition in video | | 0
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation | | 0
Visual Prompting for One-shot Controllable Video Editing without Inversion | | 0
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery | | 0
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval | | 0
Is Temporal Prompting All We Need For Limited Labeled Action Recognition? | | 0
Towards Online Multi-Modal Social Interaction Understanding | Code | 0
VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis | | 0
3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o | | 0
KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation | | 0
Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity | Code | 0
Towards Universal Text-driven CT Image Segmentation | Code | 0
The Role of Background Information in Reducing Object Hallucination in Vision-Language Models: Insights from Cutoff API Prompting | | 0
From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework | | 0
Personalization Toolkit: Training Free Personalization of Large Vision Language Models | | 0
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling | | 0
IP-Prompter: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting | Code | 0
MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention | | 0
Query Efficient Black-Box Visual Prompting with Subspace Learning | | 0
Visual Prompting with Iterative Refinement for Design Critique Generation | | 0
Test-time Correction with Human Feedback: An Online 3D Detection System via Visual Prompting | | 0
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models | | 0
Prompting the Unseen: Detecting Hidden Backdoors in Black-Box Models | | 0
WeatherGFM: Learning A Weather Generalist Foundation Model via In-context Learning | | 0
Benchmarking Human and Automated Prompting in the Segment Anything Model | Code | 0
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | | 0
Visual Prompting in LLMs for Enhancing Emotion Recognition | | 0
GSON: A Group-based Social Navigation Framework with Large Multimodal Model | | 0
Cycle-Consistency Uncertainty Estimation for Visual Prompting based One-Shot Defect Segmentation | | 0
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting | | 0
Visual Prompting in Multimodal Large Language Models: A Survey | | 0
When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective | Code | 0
Rethinking Sparse Lexical Representations for Image Retrieval in the Age of Rising Multi-Modal Large Language Models | | 0
Targeted Visual Prompting for Medical Visual Question Answering | Code | 0
Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model | | 0
Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM | | 0
Affordance-Guided Reinforcement Learning via Visual Prompting | | 0
UICrit: Enhancing Automated Design Evaluation with a UICritique Dataset | Code | 0
DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement | | 0
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | | 0
Robust Adaptation of Foundation Models with Black-Box Visual Prompting | | 0
Towards Open-World Grasping with Large Vision-Language Models | | 0
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics | | 0
Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following | | 0
Learning Visual Prompts for Guiding the Attention of Vision Transformers | | 0

No leaderboard results yet.