| GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object Detection, Contrastive Learning | Code Available | 4 |
| Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Jun 6, 2022 | Image Segmentation, Instance Segmentation | Code Available | 4 |
| Vision GNN: An Image is Worth Graph of Nodes | Jun 1, 2022 | Image Classification, Object Detection | Code Available | 4 |
| GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector | May 30, 2022 | Co-Salient Object Detection, Object | Code Available | 4 |
| EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | May 29, 2022 | Autonomous Driving, CPU | Code Available | 4 |
| Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | May 27, 2022 | Image Classification, Instance Segmentation | Code Available | 4 |
| BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation | May 26, 2022 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 4 |
| ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models | Apr 19, 2022 | Fairness, Few-Shot Image Classification | Code Available | 4 |
| PP-YOLOE: An evolved version of YOLO | Mar 30, 2022 | 2D Object Detection, Dense Object Detection | Code Available | 4 |
| DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection | Mar 7, 2022 | Object Detection, Real-Time Object Detection | Code Available | 4 |
| DN-DETR: Accelerate DETR Training by Introducing Query DeNoising | Mar 2, 2022 | Decoder, Object Detection | Code Available | 4 |
| Visual Attention Network | Feb 20, 2022 | image-classification, Image Classification | Code Available | 4 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous Vehicles, Data Visualization | Code Available | 4 |
| Deep Residual Learning for Image Recognition | Dec 10, 2015 | Classification | Code Available | 4 |
| Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Jun 10, 2025 | 3D Lane Detection, 3D Object Detection | Code Available | 3 |
| OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics | May 23, 2025 | Chart Understanding, object-detection | Code Available | 3 |
| Detect Anything 3D in the Wild | Apr 10, 2025 | 3D Object Detection, Autonomous Driving | Code Available | 3 |
| Playing Non-Embedded Card-Based Games with Reinforcement Learning | Apr 7, 2025 | Board Games, Decision Making | Code Available | 3 |
| Frequency Dynamic Convolution for Dense Image Prediction | Mar 24, 2025 | object-detection, Object Detection | Code Available | 3 |
| Falcon: A Remote Sensing Vision-Language Foundation Model | Mar 14, 2025 | Image Captioning, image-classification | Code Available | 3 |
| Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Feb 14, 2025 | 3D Object Detection, 3D visual grounding | Code Available | 3 |
| SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Dec 30, 2024 | object-detection, Object Detection | Code Available | 3 |
| Cubify Anything: Scaling Indoor 3D Object Detection | Dec 5, 2024 | 3D Object Detection, Object | Code Available | 3 |
| Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | Nov 20, 2024 | GPU, MME | Code Available | 3 |
| Data Generation for Hardware-Friendly Post-Training Quantization | Oct 29, 2024 | Data Augmentation, GPU | Code Available | 3 |