VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model (Apr 10, 2025). Language Modeling Language Modelling. Code available (9).
YOLO-World: Real-Time Open-Vocabulary Object Detection (Jan 30, 2024). Instance Segmentation Language Modeling. Code available (9).
Visual-RFT: Visual Reinforcement Fine-Tuning (Mar 3, 2025). Few-Shot Object Detection Fine-Grained Image Classification. Code available (7).
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head (Mar 11, 2024). Object Detection Open-vocabulary object detection. Code available (5).
FG-CLIP: Fine-Grained Visual and Textual Alignment (May 8, 2025). Image-text Retrieval object-detection. Code available (4).
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement (Mar 9, 2025). Domain Generalization Object Detection. Code available (4).
GLIPv2: Unifying Localization and Vision-Language Understanding (Jun 12, 2022). 2D Object Detection Contrastive Learning. Code available (4).
OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network (Sep 10, 2022). Continual Learning Object. Code available (3).
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer (Jul 15, 2024). Language Modeling Language Modelling. Code available (3).
Detecting Twenty-thousand Classes using Image-level Supervision (Jan 7, 2022). Cross-Domain Few-Shot Object Detection image-classification. Code available (3).
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community (Aug 17, 2024). Novel Concepts Object. Code available (3).
Open Vocabulary Monocular 3D Object Detection (Nov 25, 2024). 3D Object Detection Monocular 3D Object Detection. Code available (2).
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation (Sep 1, 2023). 3D Open-Vocabulary Instance Segmentation 3D Open-Vocabulary Object Detection. Code available (2).
Open-Vocabulary DETR with Conditional Matching (Mar 22, 2022). Language Modelling object-detection. Code available (2).
Is CLIP the main roadblock for fine-grained open-world perception? (Apr 4, 2024). Autonomous Driving Novel Concepts. Code available (2).
Detect Everything with Few Examples (Sep 22, 2023). Binary Classification Cross-Domain Few-Shot Object Detection. Code available (2).
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection (Sep 13, 2024). Mamba Open Vocabulary Object Detection. Code available (2).
Generative Region-Language Pretraining for Open-Ended Object Detection (Mar 15, 2024). Language Modeling Language Modelling. Code available (2).
YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection (Feb 14, 2024). Fracture detection medical image detection. Code available (2).
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction (Oct 2, 2023). image-classification Image Classification. Code available (2).
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection (May 16, 2024). object-detection Object Detection. Code available (2).
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection (Jul 7, 2022). Object Open Vocabulary Attribute Detection. Code available (2).
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning (Nov 21, 2022). 3D Classification 3D Object Detection. Code available (2).
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction (Jul 16, 2024). Language Modeling Language Modelling. Code available (2).
Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector (Feb 5, 2024). Cross-Domain Few-Shot Cross-Domain Few-Shot Object Detection. Code available (2).
Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization (Jun 22, 2022). Causal Inference object-detection. Code available (1).
DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection (Oct 2, 2023). Novel Object Detection Object. Code available (1).
Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation (Mar 20, 2022). Knowledge Distillation Language Modelling. Code available (1).
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching (Mar 23, 2023). Described Object Detection object-detection. Code available (1).
Open-Vocabulary Object Detection Using Captions (Nov 20, 2020). Object object-detection. Code available (1).
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection (Dec 23, 2024). object-detection Object Detection. Code available (1).
Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection (Jan 1, 2023). Knowledge Distillation Language Modeling. Code available (1).
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection (Mar 13, 2025). object-detection Object Detection. Code available (1).
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (Dec 22, 2023). Attribute object-detection. Code available (1).
Open-vocabulary Attribute Detection (Nov 23, 2022). Attribute Language Modeling. Code available (1).
OvarNet: Towards Open-vocabulary Object Attribute Recognition (Jan 23, 2023). Attribute Knowledge Distillation. Code available (1).
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection (Sep 26, 2023). Instance Segmentation Mixture-of-Experts. Code available (1).
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model (Nov 7, 2023). Few-Shot Learning image-classification. Code available (1).
Multi-Modal Classifiers for Open-Vocabulary Object Detection (Jun 8, 2023). Language Modelling Large Language Model. Code available (1).
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects (Nov 27, 2024). Autonomous Driving Object. Code available (1).
CLIM: Contrastive Language-Image Mosaic for Region Representation (Dec 18, 2023). Object object-detection. Code available (1).
LP-OVOD: Open-Vocabulary Object Detection by Linear Probing (Oct 26, 2023). Object object-detection. Code available (1).
Taming Self-Training for Open-Vocabulary Object Detection (Aug 11, 2023). Object object-detection. Code available (1).
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection (Jul 31, 2024). Language Modelling Object. Code available (1).
Described Object Detection: Liberating Object Detection with Flexible Expressions (Jul 24, 2023). Binary Classification Described Object Detection. Code available (1).
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection (Mar 10, 2023). Object Open-vocabulary object detection. Code available (1).
Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning (Nov 20, 2023). Object object-detection. Code available (1).
Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model (Mar 28, 2022). image-classification Image Classification. Code available (1).
CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection (Oct 25, 2023). Object object-detection. Code available (1).
Learning Object-Language Alignments for Open-Vocabulary Object Detection (Nov 27, 2022). Object object-detection. Code available (1).