| InvPT: Inverted Pyramid Multi-task Transformer for Dense Scene Understanding | Mar 15, 2022 | Boundary Detection, Human Parsing | Code Available | 2 |
| Learning What Not to Segment: A New Perspective on Few-Shot Segmentation | Mar 15, 2022 | Few-Shot Semantic Segmentation, Meta-Learning | Code Available | 2 |
| Embedding Earth: Self-supervised contrastive pre-training for dense land cover classification | Mar 11, 2022 | Earth Observation, Land Cover Classification | Code Available | 2 |
| A Unified Transformer Framework for Group-based Segmentation: Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection | Mar 9, 2022 | Co-Salient Object Detection, object-detection | Code Available | 2 |
| UNeXt: MLP-based Rapid Medical Image Segmentation Network | Mar 9, 2022 | Decoder, Image Segmentation | Code Available | 2 |
| CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers | Mar 9, 2022 | 3D Object Detection, Autonomous Vehicles | Code Available | 2 |
| E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation | Mar 8, 2022 | GPU, Instance Segmentation | Code Available | 2 |
| ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer | Mar 8, 2022 | Image Classification, object-detection | Code Available | 2 |
| Cross Language Image Matching for Weakly Supervised Semantic Segmentation | Mar 5, 2022 | Object, Semantic Segmentation | Code Available | 2 |
| Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers | Mar 5, 2022 | Semantic Segmentation, Weakly supervised Semantic Segmentation | Code Available | 2 |
| SoftGroup for 3D Instance Segmentation on Point Clouds | Mar 3, 2022 | 3D Instance Segmentation, 3D Object Detection | Code Available | 2 |
| A Data-scalable Transformer for Medical Image Segmentation: Architecture, Model Efficiency, and Benchmark | Feb 28, 2022 | Image Segmentation, Inductive Bias | Code Available | 2 |
| FreeSOLO: Learning to Segment Objects without Annotations | Feb 24, 2022 | Instance Segmentation, object-detection | Code Available | 2 |
| GroupViT: Semantic Segmentation Emerges from Text Supervision | Feb 22, 2022 | Object Detection, Scene Understanding | Code Available | 2 |
| Context Autoencoder for Self-Supervised Representation Learning | Feb 7, 2022 | Decoder, Instance Segmentation | Code Available | 2 |
| TransBTSV2: Towards Better and More Efficient Volumetric Segmentation of Medical Images | Jan 30, 2022 | Brain Tumor Segmentation, Image Segmentation | Code Available | 2 |
| Deep Video Prior for Video Consistency and Propagation | Jan 27, 2022 | Optical Flow Estimation, Semantic Segmentation | Code Available | 2 |
| When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism | Jan 26, 2022 | Image Classification, Object Detection | Code Available | 2 |
| UniFormer: Unifying Convolution and Self-attention for Visual Recognition | Jan 24, 2022 | Image Classification, object-detection | Code Available | 2 |
| AiTLAS: Artificial Intelligence Toolbox for Earth Observation | Jan 21, 2022 | Benchmarking, Earth Observation | Code Available | 2 |
| Omnivore: A Single Model for Many Visual Modalities | Jan 20, 2022 | Action Classification, Action Recognition | Code Available | 2 |
| Language-driven Semantic Segmentation | Jan 10, 2022 | Descriptive, Few-Shot Semantic Segmentation | Code Available | 2 |
| QuadTree Attention for Vision Transformers | Jan 8, 2022 | object-detection, Object Detection | Code Available | 2 |
| Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images | Jan 4, 2022 | 3D Semantic Segmentation, Brain Tumor Segmentation | Code Available | 2 |
| Vision Transformer with Deformable Attention | Jan 3, 2022 | image-classification, Image Classification | Code Available | 2 |
| Language as Queries for Referring Video Object Segmentation | Jan 3, 2022 | Object, Object Tracking | Code Available | 2 |
| C2AM: Contrastive Learning of Class-Agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation | Jan 1, 2022 | Contrastive Learning, image-classification | Code Available | 2 |
| Mask2Former for Video Instance Segmentation | Dec 20, 2021 | Image Segmentation, Instance Segmentation | Code Available | 2 |
| Improving Image Restoration by Revisiting Global Information Aggregation | Dec 8, 2021 | Color Image Denoising, Deblurring | Code Available | 2 |
| Masked-attention Mask Transformer for Universal Image Segmentation | Dec 2, 2021 | 2D Semantic Segmentation, Image Segmentation | Code Available | 2 |
| MetaFormer Is Actually What You Need for Vision | Nov 22, 2021 | Image Classification, Object Detection | Code Available | 2 |
| Attention Mechanisms in Computer Vision: A Survey | Nov 15, 2021 | image-classification, Image Classification | Code Available | 2 |
| UNetFormer: A UNet-like Transformer for Efficient Semantic Segmentation of Remote Sensing Urban Scene Imagery | Sep 18, 2021 | Change Detection, Decoder | Code Available | 2 |
| Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking | Sep 8, 2021 | Benchmarking, Diversity | Code Available | 2 |
| Open-World Entity Segmentation | Jul 29, 2021 | Image Manipulation, Image Segmentation | Code Available | 2 |
| Per-Pixel Classification is Not All You Need for Semantic Segmentation | Jul 13, 2021 | All, Classification | Code Available | 2 |
| Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling | Jul 6, 2021 | Segmentation, Semantic Segmentation | Code Available | 2 |
| Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images | Jun 23, 2021 | Autonomous Driving, Decision Making | Code Available | 2 |
| BEiT: BERT Pre-Training of Image Transformers | Jun 15, 2021 | Document Image Classification, Document Layout Analysis | Code Available | 2 |
| Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations | Jun 10, 2021 | Instance Segmentation, object-detection | Code Available | 2 |
| Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks | May 5, 2021 | image-classification, Image Classification | Code Available | 2 |
| A Novel Transformer Based Semantic Segmentation Scheme for Fine-Resolution Remote Sensing Images | Apr 25, 2021 | Decoder, Segmentation | Code Available | 2 |
| Multi-Modal Fusion Transformer for End-to-End Autonomous Driving | Apr 19, 2021 | Autonomous Driving | Code Available | 2 |
| Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | Mar 25, 2021 | image-classification, Image Classification | Code Available | 2 |
| Full Page Handwriting Recognition via Image to Sequence Extraction | Mar 11, 2021 | Handwriting Recognition, Handwritten Text Recognition | Code Available | 2 |
| Coordinate Attention for Efficient Mobile Network Design | Mar 4, 2021 | object-detection, Object Detection | Code Available | 2 |
| LambdaNetworks: Modeling Long-Range Interactions Without Attention | Feb 17, 2021 | image-classification, Image Classification | Code Available | 2 |
| TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation | Feb 8, 2021 | Cardiac Segmentation, Decoder | Code Available | 2 |
| Simplifying Object Segmentation with PixelLib Library | Jan 20, 2021 | Image Classification, Instance Segmentation | Code Available | 2 |
| Boundary-Aware Segmentation Network for Mobile and Web Applications | Jan 12, 2021 | Camouflaged Object Segmentation, Decoder | Code Available | 2 |