Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding Jun 9, 2023 Few-Shot Learning image-classification
Code Available 05 Exploring the Benefits of Visual Prompting in Differential Privacy Mar 22, 2023 image-classification Image Classification
Code Available 05 UICrit: Enhancing Automated Design Evaluation with a UICritique Dataset Jul 11, 2024 Visual Prompting
Code Available 05 Stepwise Decomposition and Dual-stream Focus: A Novel Approach for Training-free Camouflaged Object Segmentation Jun 7, 2025 Camouflaged Object Segmentation Feature Correlation
Code Available 05 When Does Visual Prompting Outperform Linear Probing for Vision-Language Models? A Likelihood Perspective Sep 3, 2024 Transfer Learning Visual Prompting
Code Available 05 Unleashing the Power of Visual Prompting At the Pixel Level Dec 20, 2022 Diversity Visual Prompting
Code Available 05 Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting Jun 1, 2023 Transfer Learning Visual Prompting
Code Available 05 ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts Dec 1, 2023 Visual Commonsense Reasoning Visual Prompting
Code Available 05 Uncovering the Hidden Cost of Model Compression Aug 29, 2023 model Model Compression
Code Available 05 Targeted Visual Prompting for Medical Visual Question Answering Aug 6, 2024 Medical Visual Question Answering Question Answering
Code Available 05 Towards Online Multi-Modal Social Interaction Understanding Mar 25, 2025 Visual Prompting
Code Available 05 Benchmarking Human and Automated Prompting in the Segment Anything Model Oct 29, 2024 Benchmarking Image Segmentation
Code Available 05 VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis Mar 20, 2025 parameter-efficient fine-tuning Visual Prompting
— Unverified 00 WeatherGFM: Learning A Weather Generalist Foundation Model via In-context Learning Nov 8, 2024 In-Context Learning Question Answering
— Unverified 00 Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model Aug 1, 2024 EgoSchema Language Modeling
— Unverified 00 Zoomer: Adaptive Image Focus Optimization for Black-box MLLM Apr 30, 2025 Image Captioning Object Recognition
— Unverified 00 3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V Dec 15, 2023 3D Object Detection object-detection
— Unverified 00 3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o Mar 17, 2025 Logical Reasoning Prompt Engineering
— Unverified 00 A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis May 29, 2025 Diagnostic Visual Prompting
— Unverified 00 Affordance-Guided Reinforcement Learning via Visual Prompting Jul 14, 2024 reinforcement-learning Reinforcement Learning
— Unverified 00 Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model May 16, 2024 Image Inpainting In-Context Learning
— Unverified 00 Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling Feb 4, 2025 Object Visual Prompting
— Unverified 00 Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models Apr 30, 2025 Hallucination Object
— Unverified 00 BLINK: Multimodal Large Language Models Can See but Not Perceive Apr 18, 2024 Depth Estimation Multiple-choice
— Unverified 00 Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM Jul 31, 2024 In-Context Learning Layout Design
— Unverified 00 Cycle-Consistency Uncertainty Estimation for Visual Prompting based One-Shot Defect Segmentation Sep 21, 2024 Defect Detection Visual Prompting
— Unverified 00 DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement Jul 11, 2024 Object Rearrangement Visual Prompting
— Unverified 00 DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models May 29, 2025 Visual Prompting
— Unverified 00 EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery Apr 17, 2025 Large Language Model Multi-Task Learning
— Unverified 00 End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting Sep 19, 2024 Decoder Object
— Unverified 00 Explore until Confident: Efficient Exploration for Embodied Question Answering Mar 23, 2024 Conformal Prediction Efficient Exploration
— Unverified 00 Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following Jun 6, 2024 In-Context Learning Visual Prompting
— Unverified 00 Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms Oct 24, 2024 Diversity Language Modeling
— Unverified 00 From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework Feb 12, 2025 Code Generation RAG
— Unverified 00 FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training Oct 10, 2022 Few-Shot Object Detection object-detection
— Unverified 00 FVP: Fourier Visual Prompting for Source-Free Unsupervised Domain Adaptation of Medical Image Segmentation Apr 26, 2023 Domain Adaptation Image Segmentation
— Unverified 00 Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering May 30, 2025 Language Modeling Language Modelling
— Unverified 00 GSON: A Group-based Social Navigation Framework with Large Multimodal Model Sep 26, 2024 Autonomous Vehicles Motion Planning
— Unverified 00 ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation Aug 2, 2023 Image Manipulation Pose Transfer
— Unverified 00 Is Temporal Prompting All We Need For Limited Labeled Action Recognition? Apr 2, 2025 Action Recognition All
— Unverified 00 KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation Mar 13, 2025 Object Visual Prompting
— Unverified 00 LaViP: Language-Grounded Visual Prompts Dec 18, 2023 Few-Shot Learning Transfer Learning
— Unverified 00 Learning Expressive Prompting With Residuals for Vision Transformers Mar 27, 2023 Few-Shot Learning image-classification
— Unverified 00 Learning Visual Prompts for Guiding the Attention of Vision Transformers Jun 5, 2024 Visual Prompting
— Unverified 00 MedFocusCLIP: Improving few shot classification in medical datasets using pixel wise attention Jan 7, 2025 Classification Fine-Grained Image Classification
— Unverified 00 Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation Apr 1, 2024 Image Segmentation Medical Image Segmentation
— Unverified 00 MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models Nov 27, 2024 Person Search Visual Prompting
— Unverified 00 MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting Mar 5, 2024 In-Context Learning Object Rearrangement
— Unverified 00 MoVL: Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks May 13, 2024 image-classification Image Classification
— Unverified 00 NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation Apr 20, 2025 3D Instance Segmentation 3D Open-Vocabulary Instance Segmentation
— Unverified 00