| VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | May 17, 2025 | 2D Object Detection, Object Counting | Code Available | 4 |
| Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Apr 18, 2025 | Anomaly Detection, Autonomous Driving | Code Available | 0 |
| Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | Apr 4, 2025 | Data Augmentation, Human Detection | Unverified | 0 |
| The Power of One: A Single Example is All it Takes for Segmentation in VLMs | Mar 13, 2025 | All, object-detection | Unverified | 0 |
| LangGas: Introducing Language in Selective Zero-Shot Background Subtraction for Semi-Transparent Gas Leak Detection with a New Dataset | Mar 4, 2025 | Classification, object-detection | Code Available | 1 |
| UniFa: A unified feature hallucination framework for any-shot object detection | Mar 1, 2025 | Generalized Zero-Shot Object Detection, Hallucination | Unverified | 0 |
| CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection | Dec 13, 2024 | object-detection, Object Detection | Unverified | 0 |
| No Annotations for Object Detection in Art through Stable Diffusion | Dec 9, 2024 | Object, object-detection | Code Available | 0 |
| Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects | Dec 3, 2024 | Autonomous Driving, object-detection | Code Available | 0 |
| DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Nov 21, 2024 | Long-tailed Object Detection, Object | Code Available | 5 |
| OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Jul 10, 2024 | Object Detection, Zero-Shot Object Detection | Code Available | 3 |
| Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | Jun 27, 2024 | Image Segmentation, Medical Image Segmentation | Unverified | 0 |
| Eating Smart: Advancing Health Informatics with the Grounding DINO based Dietary Assistant App | Jun 2, 2024 | Management, Nutrition | Unverified | 0 |
| OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | May 28, 2024 | Contrastive Learning, Denoising | Code Available | 1 |
| Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | May 16, 2024 | Edge-computing, Few-Shot Object Detection | Code Available | 7 |
| T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Mar 21, 2024 | Contrastive Learning, Descriptive | Code Available | 7 |
| DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Mar 19, 2024 | Object, object-detection | Code Available | 1 |
| Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Mar 11, 2024 | Object Detection, Open-vocabulary object detection | Code Available | 5 |
| Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | Feb 14, 2024 | Attribute, Generalized Zero-Shot Object Detection | Code Available | 0 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 9 |
| Multimodal Data Curation via Object Detection and Filter Ensembles | Jan 5, 2024 | Object, object-detection | Unverified | 0 |
| SeeDS: Semantic Separable Diffusion Synthesizer for Zero-shot Food Detection | Oct 7, 2023 | Denoising, Food recommendation | Code Available | 1 |
| Zero-Shot Visual Classification with Guided Cropping | Sep 12, 2023 | Classification, Object | Unverified | 0 |
| ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data | Aug 22, 2023 | Attribute, object-detection | Code Available | 1 |
| Meta-ZSDETR: Zero-shot DETR with Meta-learning | Aug 18, 2023 | Contrastive Learning, Meta-Learning | Unverified | 0 |