| Global Context Vision Transformers | Jun 20, 2022 | Image Classification | Code Available | 2 | 5 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Apr 10, 2025 | Contrastive Learning, Language Modeling | Code Available | 2 | 5 |
| Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization | May 16, 2022 | graph partitioning, Segmentation | Code Available | 2 | 5 |
| GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs | May 10, 2024 | graph construction, image-classification | Code Available | 2 | 5 |
| Deep Snake for Real-Time Instance Segmentation | Jan 6, 2020 | GPU, Instance Segmentation | Code Available | 2 | 5 |
| GroupViT: Semantic Segmentation Emerges from Text Supervision | Feb 22, 2022 | Object Detection, Scene Understanding | Code Available | 2 | 5 |
| Deep Video Prior for Video Consistency and Propagation | Jan 27, 2022 | Optical Flow Estimation, Semantic Segmentation | Code Available | 2 | 5 |
| Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation | Jan 15, 2025 | Image Segmentation, Referring Expression Segmentation | Code Available | 2 | 5 |
| Hier-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting | Sep 19, 2024 | Scene Understanding, Semantic Segmentation | Code Available | 2 | 5 |
| HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation | Aug 21, 2024 | Image Segmentation, Mamba | Code Available | 2 | 5 |
| Deep Hierarchical Semantic Segmentation | Mar 27, 2022 | Multi-Label Classification | Code Available | 2 | 5 |
| BlenderProc | Oct 25, 2019 | 3D Object Recognition, Depth Image Estimation | Code Available | 2 | 5 |
| HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation | Apr 27, 2022 | Domain Adaptation, GPU | Code Available | 2 | 5 |
| Hulk: A Universal Knowledge Translator for Human-Centric Tasks | Dec 4, 2023 | 3D Human Pose Estimation, Action Recognition | Code Available | 2 | 5 |
| DeepGCNs: Making GCNs Go as Deep as CNNs | Oct 15, 2019 | 3D Point Cloud Classification, 3D Semantic Segmentation | Code Available | 2 | 5 |
| Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding | Nov 4, 2020 | Multi-Task Learning, Scene Understanding | Code Available | 2 | 5 |
| Understanding the Tricks of Deep Learning in Medical Image Segmentation: Challenges and Future Directions | Sep 21, 2022 | Data Augmentation, Domain Adaptation | Code Available | 2 | 5 |
| iFormer: Integrating ConvNet and Transformer for Mobile Application | Jan 26, 2025 | Instance Segmentation, object-detection | Code Available | 2 | 5 |
| Decoupling Features in Hierarchical Propagation for Video Object Segmentation | Oct 18, 2022 | Object, Semantic Segmentation | Code Available | 2 | 5 |
| Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery | Mar 18, 2024 | Instance Segmentation, NeRF | Code Available | 2 | 5 |
| Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data | Mar 30, 2022 | 3D Object Detection, 3D Semantic Segmentation | Code Available | 2 | 5 |
| Improving Nighttime Driving-Scene Segmentation via Dual Image-adaptive Learnable Filters | Jul 4, 2022 | Autonomous Driving, Scene Segmentation | Code Available | 2 | 5 |
| Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation | Jan 9, 2024 | Image Segmentation, Segmentation | Code Available | 2 | 5 |
| Deep Incubation: Training Large Models by Divide-and-Conquering | Dec 8, 2022 | Image Segmentation, object-detection | Code Available | 2 | 5 |
| DaViT: Dual Attention Vision Transformers | Apr 7, 2022 | Computational Efficiency, Image Classification | Code Available | 2 | 5 |
| DAT++: Spatially Dynamic Vision Transformer with Deformable Attention | Sep 4, 2023 | Image Classification, Instance Segmentation | Code Available | 2 | 5 |
| Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model | Aug 8, 2022 | Aerial Scene Classification, Few-Shot Learning | Code Available | 2 | 5 |
| DDP: Diffusion Model for Dense Visual Prediction | Mar 30, 2023 | Denoising, Depth Estimation | Code Available | 2 | 5 |
| DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models | Aug 11, 2023 | Dataset Generation, Decoder | Code Available | 2 | 5 |
| InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding | Mar 15, 2022 | Boundary Detection, Human Parsing | Code Available | 2 | 5 |
| Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts | Jul 2, 2024 | Few-Shot Semantic Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| Label Efficient Visual Abstractions for Autonomous Driving | May 20, 2020 | Autonomous Driving, Segmentation | Code Available | 2 | 5 |
| Dataset Quantization | Aug 21, 2023 | Dataset Distillation, object-detection | Code Available | 2 | 5 |
| An Empirical Study of Remote Sensing Pretraining | Apr 6, 2022 | Aerial Scene Classification, Building change detection for remote sensing images | Code Available | 2 | 5 |
| Language-driven Semantic Segmentation | Jan 10, 2022 | Descriptive, Few-Shot Semantic Segmentation | Code Available | 2 | 5 |
| An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models | Nov 25, 2024 | Denoising, Scene Understanding | Code Available | 2 | 5 |
| LKM-UNet: Large Kernel Vision Mamba UNet for Medical Image Segmentation | Mar 12, 2024 | Image Segmentation, Long-range modeling | Code Available | 2 | 5 |
| LaSagnA: Language-based Segmentation Assistant for Complex Queries | Apr 12, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Aug 2, 2024 | cross-modal alignment, Multiple Object Tracking | Code Available | 2 | 5 |
| MobileOne: An Improved One millisecond Mobile Backbone | Jun 8, 2022 | Efficient Neural Network, Gaze Estimation | Code Available | 2 | 5 |
| An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | Oct 22, 2020 | image-classification, Semantic Segmentation | Code Available | 2 | 5 |
| Learning What Not to Segment: A New Perspective on Few-Shot Segmentation | Mar 15, 2022 | Few-Shot Semantic Segmentation, Meta-Learning | Code Available | 2 | 5 |
| DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | May 7, 2025 | Object Detection | Code Available | 2 | 5 |
| LHU-Net: A Light Hybrid U-Net for Cost-Efficient, High-Performance Volumetric Medical Image Segmentation | Apr 7, 2024 | Computational Efficiency, Image Segmentation | Code Available | 2 | 5 |
| DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion | Mar 9, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Jan 16, 2024 | Domain Generalization, Image Generation | Code Available | 2 | 5 |
| Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Jul 16, 2024 | Human Instance Segmentation, Instance Segmentation | Code Available | 2 | 5 |
| Cross Language Image Matching for Weakly Supervised Semantic Segmentation | Mar 5, 2022 | Object, Semantic Segmentation | Code Available | 2 | 5 |
| LuSNAR:A Lunar Segmentation, Navigation and Reconstruction Dataset based on Muti-sensor for Autonomous Exploration | Jul 9, 2024 | 3D Reconstruction, Autonomous Navigation | Code Available | 2 | 5 |
| Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images | Mar 21, 2025 | Image Segmentation, Mamba | Code Available | 2 | 5 |