SOTAVerified

Zero-Shot Object Detection

Zero-shot object detection (ZSD) is the task of object detection where no visual training data is available for some of the target object classes.
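
As a rough illustration of the setting, the sketch below labels a detected region by comparing its embedding with semantic embeddings of class names, so that a class with no training boxes can still be predicted. This is only one common family of approaches, not a definition of the task; the function names, the seen/unseen split, and the 3-d vectors are all hypothetical toy values chosen for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_region(region_embedding, class_embeddings):
    """Return (label, score) for the class whose embedding is most similar."""
    return max(
        ((name, cosine(region_embedding, vec)) for name, vec in class_embeddings.items()),
        key=lambda pair: pair[1],
    )


# Toy, hand-made class-name embeddings (assumed values, for illustration only).
SEEN = {"cat": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0]}  # boxes available in training
UNSEEN = {"zebra": [0.0, 0.0, 1.0]}  # no visual training data for this class

# At test time the label set includes the unseen classes as well.
ALL_CLASSES = {**SEEN, **UNSEEN}

# Hypothetical region feature, already projected into the semantic space.
region = [0.1, 0.0, 0.9]
label, score = classify_region(region, ALL_CLASSES)
print(label, round(score, 3))
```

In a real detector the region embedding would come from a trained visual backbone and the class embeddings from a word or text encoder; both are replaced by fixed vectors here to keep the sketch self-contained.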

(Image credit: Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts)

Papers

Showing 1-50 of 57 papers

Title | Status | Hype
YOLO-World: Real-Time Open-Vocabulary Object Detection | Code | 9
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | Code | 7
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Code | 7
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Code | 5
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Code | 5
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Code | 5
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models | Code | 4
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | Code | 4
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Code | 3
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | Code | 2
Multi-modal Queried Object Detection in the Wild | Code | 2
Grounded Language-Image Pre-training | Code | 2
Synthesizing the Unseen for Zero-shot Object Detection | Code | 1
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Code | 1
ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data | Code | 1
ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection | Code | 1
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | Code | 1
Zero-Shot Instance Segmentation | Code | 1
Zero-Shot Object Detection by Hybrid Region Embedding | Code | 1
Zero-shot Object Detection Through Vision-Language Embedding Alignment | Code | 1
LangGas: Introducing Language in Selective Zero-Shot Background Subtraction for Semi-Transparent Gas Leak Detection with a New Dataset | Code | 1
Learning Open-World Object Proposals without Learning to Classify | Code | 1
Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture | Code | 1
DoUnseen: Tuning-Free Class-Adaptive Object Detection of Unseen Objects for Robotic Grasping | Code | 1
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | Code | 1
Polarity Loss for Zero-shot Object Detection | Code | 1
Resolving Semantic Confusions for Improved Zero-Shot Detection | Code | 1
Robust Region Feature Synthesizer for Zero-Shot Object Detection | Code | 1
SeeDS: Semantic Separable Diffusion Synthesizer for Zero-shot Food Detection | Code | 1
Background Learnable Cascade for Zero-Shot Object Detection | Code | 1
Scaling Open-Vocabulary Object Detection | Code | 0
Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Code | 0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Code | 0
Efficient Feature Distillation for Zero-shot Annotation Object Detection | Code | 0
GTNet: Generative Transfer Network for Zero-Shot Object Detection | Code | 0
No Annotations for Object Detection in Art through Stable Diffusion | Code | 0
Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts | Code | 0
Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects | Code | 0
Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | Code | 0
Zero-shot detection of daily objects in YCB video dataset | - | 0
CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection | - | 0
Eating Smart: Advancing Health Informatics with the Grounding DINO based Dietary Assistant App | - | 0
Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | - | 0
Frustratingly Simple but Effective Zero-shot Detection and Segmentation: Analysis and a Strong Baseline | - | 0
Image Captioning with Unseen Objects | - | 0
Meta-ZSDETR: Zero-shot DETR with Meta-learning | - | 0
Multimodal Data Curation via Object Detection and Filter Ensembles | - | 0
Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting | - | 0
On Hyperbolic Embeddings in 2D Object Detection | - | 0
Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | - | 0

No leaderboard results yet.