SOTAVerified

Visual Prompting

Visual Prompting is the task of streamlining computer vision workflows with prompts, inspired by the success of text prompting in NLP. A few visual prompts are used to turn an unlabeled dataset into a deployed model quickly, which reduces development time for both individual projects and enterprise solutions.

Papers

Showing 76–100 of 127 papers

Title | Status | Hype
Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering | | 0
GSON: A Group-based Social Navigation Framework with Large Multimodal Model | | 0
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation | | 0
Is Temporal Prompting All We Need For Limited Labeled Action Recognition? | | 0
KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation | | 0
LaViP:Language-Grounded Visual Prompts | | 0
Learning Expressive Prompting With Residuals for Vision Transformers | | 0
Learning Visual Prompts for Guiding the Attention of Vision Transformers | | 0
MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention | | 0
Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation | | 0
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models | | 0
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting | | 0
MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks | | 0
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation | | 0
On the low-shot transferability of [V]-Mamba | | 0
Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting | | 0
Personalization Toolkit: Training Free Personalization of Large Vision Language Models | | 0
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs | | 0
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval | | 0
Prompting the Unseen: Detecting Hidden Backdoors in Black-Box Models | | 0
Query Efficient Black-Box Visual Prompting with Subspace Learning | | 0
RadSAM: Segmenting 3D radiological images with a 2D promptable model | | 0
Rethinking Sparse Lexical Representations for Image Retrieval in the Age of Rising Multi-Modal Large Language Models | | 0
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | | 0
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics | | 0

No leaderboard results yet.