Towards Open-World Grasping with Large Vision-Language Models Jun 26, 2024 Robotic Grasping Visual Grounding
Towards Robust and Accurate Visual Prompting Nov 18, 2023 Adversarial Robustness Transfer Learning
T-Rex: Counting by Visual Prompting Nov 22, 2023 Object Object Counting
Tumor segmentation on whole slide images: training or prompting? Feb 21, 2024 Computational Efficiency Segmentation
Zoomer: Adaptive Image Focus Optimization for Black-box MLLM Apr 30, 2025 Image Captioning Object Recognition
3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V Dec 15, 2023 3D Object Detection object-detection
3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o Mar 17, 2025 Logical Reasoning Prompt Engineering
A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis May 29, 2025 Diagnostic Visual Prompting
Affordance-Guided Reinforcement Learning via Visual Prompting Jul 14, 2024 reinforcement-learning Reinforcement Learning
Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model May 16, 2024 Image Inpainting In-Context Learning
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling Feb 4, 2025 Object Visual Prompting
Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models Apr 30, 2025 Hallucination Object
BLINK: Multimodal Large Language Models Can See but Not Perceive Apr 18, 2024 Depth Estimation Multiple-choice
Chat2Layout: Interactive 3D Furniture Layout with a Multimodal LLM Jul 31, 2024 In-Context Learning Layout Design
Cycle-Consistency Uncertainty Estimation for Visual Prompting based One-Shot Defect Segmentation Sep 21, 2024 Defect Detection Visual Prompting
DegustaBot: Zero-Shot Visual Preference Estimation for Personalized Multi-Object Rearrangement Jul 11, 2024 Object Rearrangement Visual Prompting
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models May 29, 2025 Visual Prompting
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery Apr 17, 2025 Large Language Model Multi-Task Learning
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting Sep 19, 2024 Decoder Object
Explore until Confident: Efficient Exploration for Embodied Question Answering Mar 23, 2024 Conformal Prediction Efficient Exploration
Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following Jun 6, 2024 In-Context Learning Visual Prompting
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms Oct 24, 2024 Diversity Language Modeling
From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework Feb 12, 2025 Code Generation RAG
FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training Oct 10, 2022 Few-Shot Object Detection object-detection
FVP: Fourier Visual Prompting for Source-Free Unsupervised Domain Adaptation of Medical Image Segmentation Apr 26, 2023 Domain Adaptation Image Segmentation
Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering May 30, 2025 Language Modeling Language Modelling
GSON: A Group-based Social Navigation Framework with Large Multimodal Model Sep 26, 2024 Autonomous Vehicles Motion Planning
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation Aug 2, 2023 Image Manipulation Pose Transfer
Is Temporal Prompting All We Need For Limited Labeled Action Recognition? Apr 2, 2025 Action Recognition All
KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation Mar 13, 2025 Object Visual Prompting
LaViP:Language-Grounded Visual Prompts Dec 18, 2023 Few-Shot Learning Transfer Learning
Learning Expressive Prompting With Residuals for Vision Transformers Mar 27, 2023 Few-Shot Learning image-classification
Learning Visual Prompts for Guiding the Attention of Vision Transformers Jun 5, 2024 Visual Prompting
MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention Jan 7, 2025 Classification Fine-Grained Image Classification
Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation Apr 1, 2024 Image Segmentation Medical Image Segmentation
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models Nov 27, 2024 Person Search Visual Prompting
MOKA: Open-World Robotic Manipulation through Mark-Based Visual Prompting Mar 5, 2024 In-Context Learning Object Rearrangement
MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks May 13, 2024 image-classification Image Classification
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation Apr 20, 2025 3D Instance Segmentation 3D Open-Vocabulary Instance Segmentation
On the low-shot transferability of [V]-Mamba Mar 15, 2024 Few-Shot Learning Mamba
Open-Set Video-based Facial Expression Recognition with Human Expression-sensitive Prompting Apr 26, 2024 Facial Expression Recognition Multi-Task Learning
Personalization Toolkit: Training Free Personalization of Large Vision Language Models Feb 4, 2025 RAG Retrieval
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs Feb 12, 2024 Instruction Following Logical Reasoning
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval Apr 2, 2025 Image Retrieval Retrieval
Prompting the Unseen: Detecting Hidden Backdoors in Black-Box Models Nov 14, 2024 Visual Prompting
Query Efficient Black-Box Visual Prompting with Subspace Learning Jan 1, 2025 Prompt Learning Visual Prompting
RadSAM: Segmenting 3D radiological images with a 2D promptable model Apr 29, 2025 Image Segmentation Medical Image Segmentation
Rethinking Sparse Lexical Representations for Image Retrieval in the Age of Rising Multi-Modal Large Language Models Aug 29, 2024 Data Augmentation Image Retrieval
Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge Jul 5, 2024 Instance Segmentation Optical Character Recognition (OCR)
RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics Jun 15, 2024 Language Modeling Language Modelling