SOTAVerified

Zero-Shot Object Detection

Zero-shot object detection (ZSD) is the task of object detection where no visual training data is available for some of the target object classes.

(Image credit: Zero-Shot Object Detection: Learning to Simultaneously Recognize and Localize Novel Concepts)

Papers

Showing 1-50 of 57 papers

Title | Status | Hype
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | Code | 4
Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Code | 0
Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | - | 0
The Power of One: A Single Example is All it Takes for Segmentation in VLMs | - | 0
LangGas: Introducing Language in Selective Zero-Shot Background Subtraction for Semi-Transparent Gas Leak Detection with a New Dataset | Code | 1
UniFa: A unified feature hallucination framework for any-shot object detection | - | 0
CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection | - | 0
No Annotations for Object Detection in Art through Stable Diffusion | Code | 0
Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects | Code | 0
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Code | 5
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Code | 3
Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | - | 0
Eating Smart: Advancing Health Informatics with the Grounding DINO based Dietary Assistant App | - | 0
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | Code | 1
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | Code | 7
T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Code | 7
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Code | 1
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Code | 5
Synthesizing Knowledge-enhanced Features for Real-world Zero-shot Food Detection | Code | 0
YOLO-World: Real-Time Open-Vocabulary Object Detection | Code | 9
Multimodal Data Curation via Object Detection and Filter Ensembles | - | 0
SeeDS: Semantic Separable Diffusion Synthesizer for Zero-shot Food Detection | Code | 1
Zero-Shot Visual Classification with Guided Cropping | - | 0
ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data | Code | 1
Meta-ZSDETR: Zero-shot DETR with Meta-learning | - | 0
Scaling Open-Vocabulary Object Detection | Code | 0
Multi-modal Queried Object Detection in the Wild | Code | 2
DoUnseen: Tuning-Free Class-Adaptive Object Detection of Unseen Objects for Robotic Grasping | Code | 1
ZBS: Zero-shot Background Subtraction via Instance-level Background Modeling and Foreground Selection | Code | 1
Efficient Feature Distillation for Zero-shot Annotation Object Detection | Code | 0
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection | Code | 5
Frustratingly Simple but Effective Zero-shot Detection and Segmentation: Analysis and a Strong Baseline | - | 0
Resolving Semantic Confusions for Improved Zero-Shot Detection | Code | 1
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | Code | 2
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models | Code | 4
Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting | - | 0
On Hyperbolic Embeddings in 2D Object Detection | - | 0
From Node to Graph: Joint Reasoning on Visual-Semantic Relational Graph for Zero-Shot Detection | Code | 0
Robust Region Feature Synthesizer for Zero-Shot Object Detection | Code | 1
Grounded Language-Image Pre-training | Code | 2
A Survey of Deep Learning for Low-Shot Object Detection | - | 0
Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture | Code | 1
Zero-shot detection of daily objects in YCB video dataset | - | 0
Zero-shot Object Detection Through Vision-Language Embedding Alignment | Code | 1
Semantics-Guided Contrastive Network for Zero-Shot Object detection | - | 0
Learning Open-World Object Proposals without Learning to Classify | Code | 1
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | Code | 1
Zero-Shot Instance Segmentation | Code | 1
Synthesizing the Unseen for Zero-shot Object Detection | Code | 1
Background Learnable Cascade for Zero-Shot Object Detection | Code | 1

No leaderboard results yet.