| Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks | May 5, 2021 | image-classificationImage Classification | CodeCode Available | 2 |
| Beyond Self-Attention: Deformable Large Kernel Attention for Medical Image Segmentation | Aug 31, 2023 | Image SegmentationMedical Image Segmentation | CodeCode Available | 2 |
| Neural 3D Scene Reconstruction with the Manhattan-world Assumption | May 5, 2022 | 2D Semantic Segmentation3D Reconstruction | CodeCode Available | 2 |
| SlicerNNInteractive: A 3D Slicer extension for nnInteractive | Apr 7, 2025 | Image SegmentationSemantic Segmentation | CodeCode Available | 2 |
| Alleviating Textual Reliance in Medical Language-guided Segmentation via Prototype-driven Semantic Approximation | Jul 15, 2025 | Image SegmentationSegmentation | CodeCode Available | 2 |
| Deep Video Prior for Video Consistency and Propagation | Jan 27, 2022 | Optical Flow EstimationSemantic Segmentation | CodeCode Available | 2 |
| Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization | May 16, 2022 | graph partitioningSegmentation | CodeCode Available | 2 |
| Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery | Apr 3, 2025 | Field Boundary DelineationInstance Segmentation | CodeCode Available | 2 |
| An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Apr 18, 2024 | Contrastive LearningCPU | CodeCode Available | 2 |
| OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction | Apr 11, 2023 | 3D Semantic Occupancy Prediction3D Semantic Scene Completion | CodeCode Available | 2 |
| Adapter is All You Need for Tuning Visual Tasks | Nov 25, 2023 | Allimage-classification | CodeCode Available | 2 |
| BiFormer: Vision Transformer with Bi-Level Routing Attention | Mar 15, 2023 | Computational EfficiencyGPU | CodeCode Available | 2 |
| Deep Incubation: Training Large Models by Divide-and-Conquering | Dec 8, 2022 | Image Segmentationobject-detection | CodeCode Available | 2 |
| Deep Snake for Real-Time Instance Segmentation | Jan 6, 2020 | GPUInstance Segmentation | CodeCode Available | 2 |
| Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration | May 26, 2025 | Domain GeneralizationHallucination | CodeCode Available | 2 |
| Omnivore: A Single Model for Many Visual Modalities | Jan 20, 2022 | Action ClassificationAction Recognition | CodeCode Available | 2 |
| Delivering Arbitrary-Modal Semantic Segmentation | Mar 2, 2023 | SegmentationSemantic Segmentation | CodeCode Available | 2 |
| Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation | May 28, 2024 | Instance SegmentationObject Proposal Generation | CodeCode Available | 2 |
| One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Sep 29, 2024 | AllImage Segmentation | CodeCode Available | 2 |
| OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies | May 8, 2024 | Domain AdaptationScene Understanding | CodeCode Available | 2 |
| Open-Set Domain Adaptation for Semantic Segmentation | May 30, 2024 | Domain AdaptationSemantic Segmentation | CodeCode Available | 2 |
| Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models | Mar 21, 2024 | Image GenerationSemantic Segmentation | CodeCode Available | 2 |
| Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models | Mar 8, 2023 | Open Vocabulary Panoptic SegmentationOpen Vocabulary Semantic Segmentation | CodeCode Available | 2 |
| BlenderProc | Oct 25, 2019 | 3D Object RecognitionDepth Image Estimation | CodeCode Available | 2 |
| Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation | Jan 9, 2024 | Image SegmentationSegmentation | CodeCode Available | 2 |