Title | Date | Tags | Status
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models | May 29, 2025 | Visual Prompting | Unverified
VP Lab: a PEFT-Enabled Visual Prompting Laboratory for Semantic Segmentation | May 21, 2025 | parameter-efficient fine-tuning Semantic Segmentation | Unverified
Zoomer: Adaptive Image Focus Optimization for Black-box MLLM | Apr 30, 2025 | Image Captioning Object Recognition | Unverified
Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models | Apr 30, 2025 | Hallucination Object | Unverified
RadSAM: Segmenting 3D radiological images with a 2D promptable model | Apr 29, 2025 | Image Segmentation Medical Image Segmentation | Unverified
Visual and textual prompts for enhancing emotion recognition in video | Apr 24, 2025 | Emotion Recognition Video Emotion Recognition | Unverified
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation | Apr 20, 2025 | 3D Instance Segmentation 3D Open-Vocabulary Instance Segmentation | Unverified
Visual Prompting for One-shot Controllable Video Editing without Inversion | Apr 19, 2025 | Video Editing Visual Prompting | Unverified
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery | Apr 17, 2025 | Large Language Model Multi-Task Learning | Unverified
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval | Apr 2, 2025 | Image Retrieval Retrieval | Unverified
Is Temporal Prompting All We Need For Limited Labeled Action Recognition? | Apr 2, 2025 | Action Recognition All | Unverified
Towards Online Multi-Modal Social Interaction Understanding | Mar 25, 2025 | Visual Prompting | Code Available
VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis | Mar 20, 2025 | parameter-efficient fine-tuning Visual Prompting | Unverified
3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o | Mar 17, 2025 | Logical Reasoning Prompt Engineering | Unverified
KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation | Mar 13, 2025 | Object Visual Prompting | Unverified
Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity | Mar 8, 2025 | Depth Estimation Scene Understanding | Code Available
Towards Universal Text-driven CT Image Segmentation | Mar 8, 2025 | Computed Tomography (CT) Contrastive Learning | Code Available
The Role of Background Information in Reducing Object Hallucination in Vision-Language Models: Insights from Cutoff API Prompting | Feb 21, 2025 | Hallucination Object | Unverified
From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework | Feb 12, 2025 | Code Generation RAG | Unverified
Personalization Toolkit: Training Free Personalization of Large Vision Language Models | Feb 4, 2025 | RAG Retrieval | Unverified
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling | Feb 4, 2025 | Object Visual Prompting | Unverified
IP-Prompter: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting | Jan 26, 2025 | Diffusion Personalization Diffusion Personalization Tuning Free | Code Available
MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention | Jan 7, 2025 | Classification Fine-Grained Image Classification | Unverified
Query Efficient Black-Box Visual Prompting with Subspace Learning | Jan 1, 2025 | Prompt Learning Visual Prompting | Unverified
Visual Prompting with Iterative Refinement for Design Critique Generation | Dec 22, 2024 | Attribute Visual Prompting | Unverified
Test-time Correction with Human Feedback: An Online 3D Detection System via Visual Prompting | Dec 10, 2024 | Autonomous Driving Visual Prompting | Unverified
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models | Nov 27, 2024 | Person Search Visual Prompting | Unverified
Prompting the Unseen: Detecting Hidden Backdoors in Black-Box Models | Nov 14, 2024 | Visual Prompting | Unverified
WeatherGFM: Learning A Weather Generalist Foundation Model via In-context Learning | Nov 8, 2024 | In-Context Learning Question Answering | Unverified
Benchmarking Human and Automated Prompting in the Segment Anything Model | Oct 29, 2024 | Benchmarking Image Segmentation | Code Available
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | Oct 24, 2024 | Diversity Language Modeling | Unverified
Visual Prompting in LLMs for Enhancing Emotion Recognition | Oct 3, 2024 | Emotion Recognition Visual Prompting | Unverified
GSON: A Group-based Social Navigation Framework with Large Multimodal Model | Sep 26, 2024 | Autonomous Vehicles Motion Planning | Unverified
Cycle-Consistency Uncertainty Estimation for Visual Prompting based One-Shot Defect Segmentation | Sep 21, 2024 | Defect Detection Visual Prompting | Unverified
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting | Sep 19, 2024 | Decoder Object | Unverified
Visual Prompting in Multimodal Large Language Models: A Survey | Sep 5, 2024 | In-Context Learning Prompt Learning | Unverified
When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective | Sep 3, 2024 | Transfer Learning Visual Prompting | Code Available
Rethinking Sparse Lexical Representations for Image Retrieval in the Age of Rising Multi-Modal Large Language Models | Aug 29, 2024 | Data Augmentation Image Retrieval | Unverified
Targeted Visual Prompting for Medical Visual Question Answering | Aug 6, 2024 | Medical Visual Question Answering Question Answering | Code Available
Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model | Aug 1, 2024 | EgoSchema Language Modeling | Unverified
Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM | Jul 31, 2024 | In-Context Learning Layout Design | Unverified
Affordance-Guided Reinforcement Learning via Visual Prompting | Jul 14, 2024 | reinforcement-learning Reinforcement Learning | Unverified
UICrit: Enhancing Automated Design Evaluation with a UICritique Dataset | Jul 11, 2024 | Visual Prompting | Code Available
DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement | Jul 11, 2024 | Object Rearrangement Visual Prompting | Unverified
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | Jul 5, 2024 | Instance Segmentation Optical Character Recognition (OCR) | Unverified
Robust Adaptation of Foundation Models with Black-Box Visual Prompting | Jul 4, 2024 | Transfer Learning Visual Prompting | Unverified
Towards Open-World Grasping with Large Vision-Language Models | Jun 26, 2024 | Robotic Grasping Visual Grounding | Unverified
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics | Jun 15, 2024 | Language Modeling Language Modelling | Unverified
Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following | Jun 6, 2024 | In-Context Learning Visual Prompting | Unverified
Learning Visual Prompts for Guiding the Attention of Vision Transformers | Jun 5, 2024 | Visual Prompting | Unverified