| EMOv2: Pushing 5M Vision Model Frontier | Dec 9, 2024 | Image Generationmodel | CodeCode Available | 2 |
| Dilated Neighborhood Attention Transformer | Sep 29, 2022 | Image ClassificationInstance Segmentation | CodeCode Available | 2 |
| DiffusionTrack: Diffusion Model For Multi-Object Tracking | Aug 19, 2023 | Denoisingmodel | CodeCode Available | 2 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | DecoderInstance Segmentation | CodeCode Available | 2 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Jul 18, 2023 | Knowledge Distillationobject-detection | CodeCode Available | 2 |
| GroupMamba: Efficient Group-Based Visual State Space Model | Jul 18, 2024 | image-classificationImage Classification | CodeCode Available | 2 |
| DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Mar 15, 2023 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | Sep 18, 2023 | 3D geometryDecoder | CodeCode Available | 2 |
| HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras | Apr 3, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection | Dec 16, 2024 | 3D Object Detection3D Object Detection on View-of-Delft (val) | CodeCode Available | 2 |
| DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection | Jul 21, 2022 | 3D Object Detection3D Object Detection From Monocular Images | CodeCode Available | 2 |
| HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions | Jul 28, 2022 | Image ClassificationObject Detection | CodeCode Available | 2 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking3D Object Detection | CodeCode Available | 2 |
| HybridNets: End-to-End Perception Network | Mar 17, 2022 | Autonomous DrivingDrivable Area Detection | CodeCode Available | 2 |
| ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection | Aug 15, 2023 | Multispectral Object Detectionobject-detection | CodeCode Available | 2 |
| ICDAR 2021 Competition on Scientific Literature Parsing | Jun 8, 2021 | document understandingobject-detection | CodeCode Available | 2 |
| DEYO: DETR with YOLO for End-to-End Object Detection | Feb 26, 2024 | DecoderGPU | CodeCode Available | 2 |
| DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection | Dec 6, 2024 | Objectobject-detection | CodeCode Available | 2 |
| A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection | Mar 9, 2022 | Co-Salient Object Detectionobject-detection | CodeCode Available | 2 |
| DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | Jun 3, 2020 | Instance SegmentationObject | CodeCode Available | 2 |
| Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis | Jan 22, 2024 | Document Layout AnalysisDocument Summarization | CodeCode Available | 2 |
| Is CLIP the main roadblock for fine-grained open-world perception? | Apr 4, 2024 | Autonomous DrivingNovel Concepts | CodeCode Available | 2 |
| Joint Perception and Prediction for Autonomous Driving: A Survey | Dec 18, 2024 | Autonomous Drivingmotion prediction | CodeCode Available | 2 |
| K-LITE: Learning Transferable Visual Models with External Knowledge | Apr 20, 2022 | BenchmarkingDescriptive | CodeCode Available | 2 |
| Detecting Everything in the Open World: Towards Universal Object Detection | Mar 21, 2023 | object-detectionObject Detection | CodeCode Available | 2 |