| Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks | May 5, 2021 | image-classificationImage Classification | CodeCode Available | 2 | 5 |
| UniFormer: Unifying Convolution and Self-attention for Visual Recognition | Jan 24, 2022 | Image Classificationobject-detection | CodeCode Available | 2 | 5 |
| Efficient Multi-Scale Attention Module with Cross-Spatial Learning | May 23, 2023 | Dimensionality Reductionimage-classification | CodeCode Available | 2 | 5 |
| Universal Guidance for Diffusion Models | Feb 14, 2023 | Face Recognitionobject-detection | CodeCode Available | 2 | 5 |
| BiFormer: Vision Transformer with Bi-Level Routing Attention | Mar 15, 2023 | Computational EfficiencyGPU | CodeCode Available | 2 | 5 |
| Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection | Apr 6, 2022 | Instance SegmentationObject | CodeCode Available | 2 | 5 |
| 3D Object Detection for Autonomous Driving: A Comprehensive Survey | Jun 19, 2022 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 | 5 |
| V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion | Nov 13, 2024 | 3D Object DetectionDenoising | CodeCode Available | 2 | 5 |
| VEViD: Vision Enhancement via Virtual diffraction and coherent Detection | Aug 25, 2022 | 4kImage Enhancement | CodeCode Available | 2 | 5 |
| UNetFormer: A UNet-like Transformer for Efficient Semantic Segmentation of Remote Sensing Urban Scene Imagery | Sep 18, 2021 | Change DetectionDecoder | CodeCode Available | 2 | 5 |
| ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks | Oct 8, 2019 | Dimensionality Reductionimage-classification | CodeCode Available | 2 | 5 |
| Dynamic Graph Induced Contour-aware Heat Conduction Network for Event-based Object Detection | May 19, 2025 | Event-based visionObject | CodeCode Available | 2 | 5 |
| ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer | Mar 8, 2022 | Image Classificationobject-detection | CodeCode Available | 2 | 5 |
| DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection | Apr 3, 2024 | Autonomous Vehiclesobject-detection | CodeCode Available | 2 | 5 |
| DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | May 16, 2024 | Data AugmentationDiversity | CodeCode Available | 2 | 5 |
| Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression | Nov 19, 2019 | object-detectionObject Detection | CodeCode Available | 2 | 5 |
| DQ-DETR: DETR with Dynamic Query for Tiny Object Detection | Apr 4, 2024 | Objectobject-detection | CodeCode Available | 2 | 5 |
| EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications | Jun 21, 2022 | Image ClassificationObject Detection | CodeCode Available | 2 | 5 |
| Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details | Feb 1, 2021 | Benchmarkingobject-detection | CodeCode Available | 2 | 5 |
| DiffusionTrack: Diffusion Model For Multi-Object Tracking | Aug 19, 2023 | Denoisingmodel | CodeCode Available | 2 | 5 |
| DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Mar 15, 2023 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 | 5 |
| Dilated Neighborhood Attention Transformer | Sep 29, 2022 | Image ClassificationInstance Segmentation | CodeCode Available | 2 | 5 |
| DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection | Jul 21, 2022 | 3D Object Detection3D Object Detection From Monocular Images | CodeCode Available | 2 | 5 |
| DetGPT: Detect What You Need via Reasoning | May 23, 2023 | Autonomous DrivingObject | CodeCode Available | 2 | 5 |
| DEYO: DETR with YOLO for End-to-End Object Detection | Feb 26, 2024 | DecoderGPU | CodeCode Available | 2 | 5 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | DecoderInstance Segmentation | CodeCode Available | 2 | 5 |
| Aligning and Prompting Everything All at Once for Universal Visual Perception | Dec 4, 2023 | AllObject | CodeCode Available | 2 | 5 |
| Detection in Crowded Scenes: One Proposal, Multiple Predictions | Mar 20, 2020 | Object DetectionPedestrian Detection | CodeCode Available | 2 | 5 |
| Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis | Jan 22, 2024 | Document Layout AnalysisDocument Summarization | CodeCode Available | 2 | 5 |
| Detect Everything with Few Examples | Sep 22, 2023 | Binary ClassificationCross-Domain Few-Shot Object Detection | CodeCode Available | 2 | 5 |
| Detecting Everything in the Open World: Towards Universal Object Detection | Mar 21, 2023 | object-detectionObject Detection | CodeCode Available | 2 | 5 |
| DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | Jun 3, 2020 | Instance SegmentationObject | CodeCode Available | 2 | 5 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Mar 28, 2024 | Fine-Grained Image ClassificationImage Classification | CodeCode Available | 2 | 5 |
| Dense Distinct Query for End-to-End Object Detection | Mar 22, 2023 | Objectobject-detection | CodeCode Available | 2 | 5 |
| DeepInteraction: 3D Object Detection via Modality Interaction | Aug 23, 2022 | 3D Object DetectionDecoder | CodeCode Available | 2 | 5 |
| DETR Does Not Need Multi-Scale or Locality Design | Jan 1, 2023 | DecoderObject Detection | CodeCode Available | 2 | 5 |
| Decoupled Knowledge Distillation | Mar 16, 2022 | image-classificationImage Classification | CodeCode Available | 2 | 5 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking3D Object Detection | CodeCode Available | 2 | 5 |
| DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection | Dec 6, 2024 | Objectobject-detection | CodeCode Available | 2 | 5 |
| DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | Sep 18, 2023 | 3D geometryDecoder | CodeCode Available | 2 | 5 |
| Agent Attention: On the Integration of Softmax and Linear Attention | Dec 14, 2023 | Computational Efficiencyimage-classification | CodeCode Available | 2 | 5 |
| ALBench: A Framework for Evaluating Active Learning in Object Detection | Jul 27, 2022 | Active Learningimage-classification | CodeCode Available | 2 | 5 |
| DaViT: Dual Attention Vision Transformers | Apr 7, 2022 | Computational EfficiencyImage Classification | CodeCode Available | 2 | 5 |
| DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | May 7, 2025 | object-detectionObject Detection | CodeCode Available | 2 | 5 |
| Deep Incubation: Training Large Models by Divide-and-Conquering | Dec 8, 2022 | Image Segmentationobject-detection | CodeCode Available | 2 | 5 |
| Bottleneck Transformers for Visual Recognition | Jan 27, 2021 | image-classificationImage Classification | CodeCode Available | 2 | 5 |
| Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond | May 23, 2024 | 3D Object Detectionobject-detection | CodeCode Available | 2 | 5 |
| DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets | Jan 15, 2023 | 3D Object Detectionobject-detection | CodeCode Available | 2 | 5 |
| EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection | Mar 31, 2023 | 3D Object DetectionDepth Estimation | CodeCode Available | 2 | 5 |
| Dataset Quantization | Aug 21, 2023 | Dataset Distillationobject-detection | CodeCode Available | 2 | 5 |