Stepwise Decomposition and Dual-stream Focus: A Novel Approach for Training-free Camouflaged Object Segmentation | Jun 7, 2025 | Camouflaged Object Segmentation, Feature Correlation | Code Available | 0
RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought | Jun 4, 2025 | Multimodal Reasoning, Reasoning Segmentation | Unverified | 0
Grid-LOGAT: Grid Based Local and Global Area Transcription for Video Question Answering | May 30, 2025 | Language Modeling, Language Modelling | Unverified | 0
DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models | May 29, 2025 | Visual Prompting | Unverified | 0
A Comprehensive Evaluation of Multi-Modal Large Language Models for Endoscopy Analysis | May 29, 2025 | Diagnostic, Visual Prompting | Unverified | 0
VP Lab: a PEFT-Enabled Visual Prompting Laboratory for Semantic Segmentation | May 21, 2025 | parameter-efficient fine-tuning, Semantic Segmentation | Unverified | 0
Vision Graph Prompting via Semantic Low-Rank Decomposition | May 7, 2025 | parameter-efficient fine-tuning, Visual Prompting | Code Available | 1
Token Coordinated Prompt Attention is Needed for Visual Prompting | May 5, 2025 | Diversity, Visual Prompting | Code Available | 1
Zoomer: Adaptive Image Focus Optimization for Black-box MLLM | Apr 30, 2025 | Image Captioning, Object Recognition | Unverified | 0
Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models | Apr 30, 2025 | Hallucination, Object | Unverified | 0
RadSAM: Segmenting 3D radiological images with a 2D promptable model | Apr 29, 2025 | Image Segmentation, Medical Image Segmentation | Unverified | 0
Visual and textual prompts for enhancing emotion recognition in video | Apr 24, 2025 | Emotion Recognition, Video Emotion Recognition | Unverified | 0
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation | Apr 20, 2025 | 3D Instance Segmentation, 3D Open-Vocabulary Instance Segmentation | Unverified | 0
Visual Prompting for One-shot Controllable Video Editing without Inversion | Apr 19, 2025 | Video Editing, Visual Prompting | Unverified | 0
EarthGPT-X: Enabling MLLMs to Flexibly and Comprehensively Understand Multi-Source Remote Sensing Imagery | Apr 17, 2025 | Large Language Model, Multi-Task Learning | Unverified | 0
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval | Apr 2, 2025 | Image Retrieval, Retrieval | Unverified | 0
Is Temporal Prompting All We Need For Limited Labeled Action Recognition? | Apr 2, 2025 | Action Recognition, All | Unverified | 0
Towards Online Multi-Modal Social Interaction Understanding | Mar 25, 2025 | Visual Prompting | Code Available | 0
VP-NTK: Exploring the Benefits of Visual Prompting in Differentially Private Data Synthesis | Mar 20, 2025 | parameter-efficient fine-tuning, Visual Prompting | Unverified | 0
3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o | Mar 17, 2025 | Logical Reasoning, Prompt Engineering | Unverified | 0
KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation | Mar 13, 2025 | Object, Visual Prompting | Unverified | 0
Chameleon: Fast-slow Neuro-symbolic Lane Topology Extraction | Mar 10, 2025 | Autonomous Driving, Scene Understanding | Code Available | 2
Towards Universal Text-driven CT Image Segmentation | Mar 8, 2025 | Computed Tomography (CT), Contrastive Learning | Code Available | 0
Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity | Mar 8, 2025 | Depth Estimation, Scene Understanding | Code Available | 0
The Role of Background Information in Reducing Object Hallucination in Vision-Language Models: Insights from Cutoff API Prompting | Feb 21, 2025 | Hallucination, Object | Unverified | 0
From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework | Feb 12, 2025 | Code Generation, RAG | Unverified | 0
Personalization Toolkit: Training Free Personalization of Large Vision Language Models | Feb 4, 2025 | RAG, Retrieval | Unverified | 0
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling | Feb 4, 2025 | Object, Visual Prompting | Unverified | 0
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation | Feb 2, 2025 | Inductive Bias, Visual Prompting | Code Available | 1
IP-Prompter: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting | Jan 26, 2025 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available | 0
MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention | Jan 7, 2025 | Classification, Fine-Grained Image Classification | Unverified | 0
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models | Jan 2, 2025 | Scene Understanding, text annotation | Code Available | 4
Query Efficient Black-Box Visual Prompting with Subspace Learning | Jan 1, 2025 | Prompt Learning, Visual Prompting | Unverified | 0
Visual Prompting with Iterative Refinement for Design Critique Generation | Dec 22, 2024 | Attribute, Visual Prompting | Unverified | 0
Selective Visual Prompting in Vision Mamba | Dec 12, 2024 | Mamba, State Space Models | Code Available | 1
Test-time Correction with Human Feedback: An Online 3D Detection System via Visual Prompting | Dec 10, 2024 | Autonomous Driving, Visual Prompting | Unverified | 0
Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning | Dec 4, 2024 | Multimodal Large Language Model, Video Understanding | Code Available | 1
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models | Nov 27, 2024 | Person Search, Visual Prompting | Unverified | 0
Improved GUI Grounding via Iterative Narrowing | Nov 18, 2024 | Language Modeling, Language Modelling | Code Available | 1
Prompting the Unseen: Detecting Hidden Backdoors in Black-Box Models | Nov 14, 2024 | Visual Prompting | Unverified | 0
WeatherGFM: Learning A Weather Generalist Foundation Model via In-context Learning | Nov 8, 2024 | In-Context Learning, Question Answering | Unverified | 0
Benchmarking Human and Automated Prompting in the Segment Anything Model | Oct 29, 2024 | Benchmarking, Image Segmentation | Code Available | 0
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | Oct 24, 2024 | Diversity, Language Modeling | Unverified | 0
Visual Prompting in LLMs for Enhancing Emotion Recognition | Oct 3, 2024 | Emotion Recognition, Visual Prompting | Unverified | 0
Improving Visual Object Tracking through Visual Prompting | Sep 27, 2024 | Object | Code Available | 1
GSON: A Group-based Social Navigation Framework with Large Multimodal Model | Sep 26, 2024 | Autonomous Vehicles, Motion Planning | Unverified | 0
Attention Prompting on Image for Large Vision-Language Models | Sep 25, 2024 | MM-Vet, Visual Prompting | Code Available | 2
Cycle-Consistency Uncertainty Estimation for Visual Prompting based One-Shot Defect Segmentation | Sep 21, 2024 | Defect Detection, Visual Prompting | Unverified | 0
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting | Sep 19, 2024 | Decoder, Object | Unverified | 0
Visual Prompting in Multimodal Large Language Models: A Survey | Sep 5, 2024 | In-Context Learning, Prompt Learning | Unverified | 0