VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Apr 10, 2025 | Language Modeling, Language Modelling | Code Available 9
YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available 9
Visual-RFT: Visual Reinforcement Fine-Tuning | Mar 3, 2025 | Few-Shot Object Detection, Fine-Grained Image Classification | Code Available 7
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Mar 11, 2024 | Object Detection, Open-vocabulary object detection | Code Available 5
FG-CLIP: Fine-Grained Visual and Textual Alignment | May 8, 2025 | Image-text Retrieval, object-detection | Code Available 4
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Mar 9, 2025 | Domain Generalization, Object Detection | Code Available 4
GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object Detection, Contrastive Learning | Code Available 4
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel Concepts, Object | Code Available 3
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Jul 15, 2024 | Language Modeling, Language Modelling | Code Available 3
OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network | Sep 10, 2022 | Continual Learning, Object | Code Available 3
Detecting Twenty-thousand Classes using Image-level Supervision | Jan 7, 2022 | Cross-Domain Few-Shot Object Detection, image-classification | Code Available 3
Open Vocabulary Monocular 3D Object Detection | Nov 25, 2024 | 3D Object Detection, Monocular 3D Object Detection | Code Available 2
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection | Sep 13, 2024 | Mamba, Open Vocabulary Object Detection | Code Available 2
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Jul 16, 2024 | Language Modeling, Language Modelling | Code Available 2
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection | May 16, 2024 | object-detection, Object Detection | Code Available 2
Is CLIP the main roadblock for fine-grained open-world perception? | Apr 4, 2024 | Autonomous Driving, Novel Concepts | Code Available 2
Generative Region-Language Pretraining for Open-Ended Object Detection | Mar 15, 2024 | Language Modeling, Language Modelling | Code Available 2
YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection | Feb 14, 2024 | Fracture detection, medical image detection | Code Available 2
Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector | Feb 5, 2024 | Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection | Code Available 2
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | Oct 2, 2023 | image-classification, Image Classification | Code Available 2
Detect Everything with Few Examples | Sep 22, 2023 | Binary Classification, Cross-Domain Few-Shot Object Detection | Code Available 2
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Sep 1, 2023 | 3D Open-Vocabulary Instance Segmentation, 3D Open-Vocabulary Object Detection | Code Available 2
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning | Nov 21, 2022 | 3D Classification, 3D Object Detection | Code Available 2
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | Jul 7, 2022 | Object, Open Vocabulary Attribute Detection | Code Available 2
Open-Vocabulary DETR with Conditional Matching | Mar 22, 2022 | Language Modelling, object-detection | Code Available 2
Superpowering Open-Vocabulary Object Detectors for X-ray Vision | Mar 21, 2025 | object-detection, Object Detection | Code Available 1
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection | Mar 13, 2025 | object-detection, Object Detection | Code Available 1
OW-OVD: Unified Open World and Open Vocabulary Object Detection | Jan 1, 2025 | Attribute, Incremental Learning | Code Available 1
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Dec 23, 2024 | object-detection, Object Detection | Code Available 1
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects | Nov 27, 2024 | Autonomous Driving, Object | Code Available 1
OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking | Oct 23, 2024 | Multi-Object Tracking, Object | Code Available 1
SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection | Oct 8, 2024 | object-detection, Object Detection | Code Available 1
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian | Aug 7, 2024 | Autonomous Driving, object-detection | Code Available 1
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection | Jul 31, 2024 | Language Modelling, Object | Code Available 1
DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training | Jul 12, 2024 | Image Generation, Object | Code Available 1
OVMR: Open-Vocabulary Recognition with Multi-Modal References | Jun 7, 2024 | Open Vocabulary Object Detection | Code Available 1
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection | May 30, 2024 | Image Captioning, Image Inpainting | Code Available 1
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | May 28, 2024 | Contrastive Learning, Denoising | Code Available 1
The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models | Apr 18, 2024 | Instance Segmentation, Object | Code Available 1
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation | Apr 12, 2024 | Object, object-detection | Code Available 1
Retrieval-Augmented Open-Vocabulary Object Detection | Apr 8, 2024 | Language Modeling, Language Modelling | Code Available 1
VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation | Mar 19, 2024 | Anomaly Detection, object-detection | Code Available 1
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection | Dec 22, 2023 | Attribute, object-detection | Code Available 1
CLIM: Contrastive Language-Image Mosaic for Region Representation | Dec 18, 2023 | Object, object-detection | Code Available 1
Simple Image-level Classification Improves Open-vocabulary Object Detection | Dec 16, 2023 | Knowledge Distillation, Object | Code Available 1
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection | Dec 12, 2023 | object-detection, Object Detection | Code Available 1
The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding | Nov 29, 2023 | Object, object-detection | Code Available 1
Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning | Nov 20, 2023 | Object, object-detection | Code Available 1
Enhancing Novel Object Detection via Cooperative Foundational Models | Nov 19, 2023 | Novel Class Discovery, Novel Object Detection | Code Available 1
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention | Nov 18, 2023 | Concept Alignment, Graph Generation | Code Available 1