| DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection | Apr 3, 2024 | Autonomous Vehicles, object-detection | Code Available | 2 | 5 |
| GOReloc: Graph-based Object-Level Relocalization for Visual SLAM | Aug 15, 2024 | Object, object-detection | Code Available | 2 | 5 |
| GRiT: A Generative Region-to-text Transformer for Object Understanding | Dec 1, 2022 | Decoder, Dense Captioning | Code Available | 2 | 5 |
| GrootVL: Tree Topology is All You Need in State Space Model | Jun 4, 2024 | All, image-classification | Code Available | 2 | 5 |
| DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | May 16, 2024 | Data Augmentation, Diversity | Code Available | 2 | 5 |
| Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression | Nov 19, 2019 | object-detection, Object Detection | Code Available | 2 | 5 |
| BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving | May 19, 2022 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| GroupMamba: Efficient Group-Based Visual State Space Model | Jul 18, 2024 | image-classification, Image Classification | Code Available | 2 | 5 |
| BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| HCF-Net: Hierarchical Context Fusion Network for Infrared Small Object Detection | Mar 16, 2024 | channel selection, object-detection | Code Available | 2 | 5 |
| HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection | Dec 16, 2024 | 3D Object Detection, 3D Object Detection on View-of-Delft (val) | Code Available | 2 | 5 |
| Hierarchical Open-vocabulary Universal Image Segmentation | Jul 3, 2023 | Image Comprehension, Image Segmentation | Code Available | 2 | 5 |
| DQ-DETR: DETR with Dynamic Query for Tiny Object Detection | Apr 4, 2024 | Object, object-detection | Code Available | 2 | 5 |
| EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications | Jun 21, 2022 | Image Classification, Object Detection | Code Available | 2 | 5 |
| BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection | Jun 13, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| Hulk: A Universal Knowledge Translator for Human-Centric Tasks | Dec 4, 2023 | 3D Human Pose Estimation, Action Recognition | Code Available | 2 | 5 |
| Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks | May 5, 2021 | image-classification, Image Classification | Code Available | 2 | 5 |
| BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training | Mar 24, 2022 | Object, object-detection | Code Available | 2 | 5 |
| Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss | Apr 2, 2024 | image-classification, Image Classification | Code Available | 2 | 5 |
| You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction | May 30, 2022 | Exposure Correction, Image Enhancement | Code Available | 2 | 5 |
| EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection | Feb 23, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| DiffusionTrack: Diffusion Model For Multi-Object Tracking | Aug 19, 2023 | Denoising, model | Code Available | 2 | 5 |
| InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists | Sep 30, 2023 | Depth Estimation, Image Generation | Code Available | 2 | 5 |
| Dilated Neighborhood Attention Transformer | Sep 29, 2022 | Image Classification, Instance Segmentation | Code Available | 2 | 5 |
| DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | Sep 18, 2023 | 3D geometry, Decoder | Code Available | 2 | 5 |
| Is CLIP the main roadblock for fine-grained open-world perception? | Apr 4, 2024 | Autonomous Driving, Novel Concepts | Code Available | 2 | 5 |
| Joint Perception and Prediction for Autonomous Driving: A Survey | Dec 18, 2024 | Autonomous Driving, motion prediction | Code Available | 2 | 5 |
| Binary Neural Networks: A Survey | Mar 31, 2020 | Binarization, image-classification | Code Available | 2 | 5 |
| MobileOne: An Improved One millisecond Mobile Backbone | Jun 8, 2022 | Efficient Neural Network, Gaze Estimation | Code Available | 2 | 5 |
| K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions | Jun 16, 2022 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| LambdaNetworks: Modeling Long-Range Interactions Without Attention | Feb 17, 2021 | image-classification, Image Classification | Code Available | 2 | 5 |
| LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Jul 16, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection | Dec 6, 2024 | Object, object-detection | Code Available | 2 | 5 |
| LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection | Jan 29, 2024 | 3D Object Detection, Autonomous Vehicles | Code Available | 2 | 5 |
| LinK3D: Linear Keypoints Representation for 3D LiDAR Point Cloud | Jun 13, 2022 | 3D Object Detection, object-detection | Code Available | 2 | 5 |
| DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | Decoder, Instance Segmentation | Code Available | 2 | 5 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 | 5 |
| DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection | Jul 21, 2022 | 3D Object Detection, 3D Object Detection From Monocular Images | Code Available | 2 | 5 |
| DETR Does Not Need Multi-Scale or Locality Design | Jan 1, 2023 | Decoder, Object Detection | Code Available | 2 | 5 |
| Agent Attention: On the Integration of Softmax and Linear Attention | Dec 14, 2023 | Computational Efficiency, image-classification | Code Available | 2 | 5 |
| MambaFusion: Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection | Jul 6, 2025 | 3D Object Detection, Attribute | Code Available | 2 | 5 |
| DEYO: DETR with YOLO for End-to-End Object Detection | Feb 26, 2024 | Decoder, GPU | Code Available | 2 | 5 |
| Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis | Jan 22, 2024 | Document Layout Analysis, Document Summarization | Code Available | 2 | 5 |
| Accelerating DETR Convergence via Semantic-Aligned Matching | Mar 14, 2022 | Object, object-detection | Code Available | 2 | 5 |
| A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation | Sep 27, 2024 | Exemplar-Free Counting, Few-shot Object Counting and Detection | Code Available | 2 | 5 |
| A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection | Mar 9, 2022 | Co-Salient Object Detection, object-detection | Code Available | 2 | 5 |
| DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | Jun 3, 2020 | Instance Segmentation, Object | Code Available | 2 | 5 |
| Detecting Everything in the Open World: Towards Universal Object Detection | Mar 21, 2023 | object-detection, Object Detection | Code Available | 2 | 5 |
| Detection in Crowded Scenes: One Proposal, Multiple Predictions | Mar 20, 2020 | Object Detection, Pedestrian Detection | Code Available | 2 | 5 |