| VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | May 17, 2025 | 2D Object Detection, Object Counting | Code Available | 4 |
| Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Apr 18, 2025 | Anomaly Detection, Autonomous Driving | Code Available | 0 |
| Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | Apr 4, 2025 | Data Augmentation, Human Detection | Unverified | 0 |
| The Power of One: A Single Example is All it Takes for Segmentation in VLMs | Mar 13, 2025 | All, object-detection | Unverified | 0 |
| LangGas: Introducing Language in Selective Zero-Shot Background Subtraction for Semi-Transparent Gas Leak Detection with a New Dataset | Mar 4, 2025 | Classification, object-detection | Code Available | 1 |
| UniFa: A unified feature hallucination framework for any-shot object detection | Mar 1, 2025 | Generalized Zero-Shot Object Detection, Hallucination | Unverified | 0 |
| CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection | Dec 13, 2024 | object-detection, Object Detection | Unverified | 0 |
| No Annotations for Object Detection in Art through Stable Diffusion | Dec 9, 2024 | Object, object-detection | Code Available | 0 |
| Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects | Dec 3, 2024 | Autonomous Driving, object-detection | Code Available | 0 |
| DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Nov 21, 2024 | Long-tailed Object Detection, Object | Code Available | 5 |
| OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Jul 10, 2024 | Object Detection, Zero-Shot Object Detection | Code Available | 3 |
| Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | Jun 27, 2024 | Image Segmentation, Medical Image Segmentation | Unverified | 0 |
| Eating Smart: Advancing Health Informatics with the Grounding DINO based Dietary Assistant App | Jun 2, 2024 | Management, Nutrition | Unverified | 0 |
| OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | May 28, 2024 | Contrastive Learning, Denoising | Code Available | 1 |
| Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | May 16, 2024 | Edge-computing, Few-Shot Object Detection | Code Available | 7 |
| T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Mar 21, 2024 | Contrastive Learning, Descriptive | Code Available | 7 |
| DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Mar 19, 2024 | Object, object-detection | Code Available | 1 |
| Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Mar 11, 2024 | Object Detection, Open-vocabulary object detection | Code Available | 5 |
| Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | Feb 14, 2024 | Attribute, Generalized Zero-Shot Object Detection | Code Available | 0 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 9 |
| Multimodal Data Curation via Object Detection and Filter Ensembles | Jan 5, 2024 | Object, object-detection | Unverified | 0 |
| SeeDS: Semantic Separable Diffusion Synthesizer for Zero-shot Food Detection | Oct 7, 2023 | Denoising, Food recommendation | Code Available | 1 |
| Zero-Shot Visual Classification with Guided Cropping | Sep 12, 2023 | Classification, Object | Unverified | 0 |
| ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data | Aug 22, 2023 | Attribute, object-detection | Code Available | 1 |
| Meta-ZSDETR: Zero-shot DETR with Meta-learning | Aug 18, 2023 | Contrastive Learning, Meta-Learning | Unverified | 0 |
| Scaling Open-Vocabulary Object Detection | Jun 16, 2023 | image-classification, Image Classification | Code Available | 0 |
| Multi-modal Queried Object Detection in the Wild | May 30, 2023 | Few-Shot Object Detection, Object | Code Available | 2 |
| DoUnseen: Tuning-Free Class-Adaptive Object Detection of Unseen Objects for Robotic Grasping | Apr 6, 2023 | Object, object-detection | Code Available | 1 |
| ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection | Mar 26, 2023 | Foreground Segmentation, Object | Code Available | 1 |
| Efficient Feature Distillation for Zero-shot Annotation Object Detection | Mar 21, 2023 | Object, object-detection | Code Available | 0 |
| Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Mar 9, 2023 | Decoder, Object Detection | Code Available | 5 |
| Frustratingly Simple but Effective Zero-shot Detection and Segmentation: Analysis and a Strong Baseline | Feb 14, 2023 | Object, object-detection | Unverified | 0 |
| Resolving Semantic Confusions for Improved Zero-Shot Detection | Dec 12, 2022 | Generalized Zero-Shot Object Detection, Object Detection | Code Available | 1 |
| Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | Jul 7, 2022 | Object, Open Vocabulary Attribute Detection | Code Available | 2 |
| ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models | Apr 19, 2022 | Fairness, Few-Shot Image Classification | Code Available | 4 |
| Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting | Apr 16, 2022 | Few-Shot Learning, Few-Shot Object Detection | Unverified | 0 |
| On Hyperbolic Embeddings in 2D Object Detection | Mar 15, 2022 | 2D Object Detection, Classification | Unverified | 0 |
| From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Feb 15, 2022 | Generalized Zero-Shot Object Detection, Scene Understanding | Code Available | 0 |
| Robust Region Feature Synthesizer for Zero-Shot Object Detection | Jan 1, 2022 | Generalized Zero-Shot Object Detection, Object | Code Available | 1 |
| Grounded Language-Image Pre-training | Dec 7, 2021 | 2D Object Detection, Described Object Detection | Code Available | 2 |
| A Survey of Deep Learning for Low-Shot Object Detection | Dec 6, 2021 | Deep Learning, Few-Shot Learning | Unverified | 0 |
| Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture | Nov 22, 2021 | Handwritten Text Recognition, object-detection | Code Available | 1 |
| Zero-shot detection of daily objects in YCB video dataset | Sep 29, 2021 | Object, object-detection | Unverified | 0 |
| Zero-shot Object Detection Through Vision-Language Embedding Alignment | Sep 24, 2021 | Object, object-detection | Code Available | 1 |
| Semantics-Guided Contrastive Network for Zero-Shot Object detection | Sep 4, 2021 | Contrastive Learning, Generalized Zero-Shot Object Detection | Unverified | 0 |
| Learning Open-World Object Proposals without Learning to Classify | Aug 15, 2021 | Object, object-detection | Code Available | 1 |
| Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | Apr 28, 2021 | image-classification, Image Classification | Code Available | 1 |
| Zero-Shot Instance Segmentation | Apr 14, 2021 | Instance Segmentation, object-detection | Code Available | 1 |
| Synthesizing the Unseen for Zero-shot Object Detection | Oct 19, 2020 | Diversity, Generalized Zero-Shot Object Detection | Code Available | 1 |
| Background Learnable Cascade for Zero-Shot Object Detection | Oct 9, 2020 | Generalized Zero-Shot Object Detection, Object | Code Available | 1 |