| DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 | 5 |
| DiffAtlas: GenAI-fying Atlas Segmentation via Image-Mask Diffusion | Mar 9, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields | Dec 6, 2023 | 3DGS, 3D scene Editing | Code Available | 2 | 5 |
| Feature Pyramid Networks for Object Detection | Dec 9, 2016 | GPU, Object | Code Available | 2 | 5 |
| DiffRect: Latent Diffusion Label Rectification for Semi-supervised Medical Image Segmentation | Jul 13, 2024 | Denoising, Image Segmentation | Code Available | 2 | 5 |
| Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation | Mar 5, 2025 | Object, Referring Video Object Segmentation | Code Available | 2 | 5 |
| FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models | Feb 7, 2024 | Instance Segmentation, Object | Code Available | 2 | 5 |
| FocalClick: Towards Practical Interactive Image Segmentation | Apr 6, 2022 | Image Segmentation, Interactive Segmentation | Code Available | 2 | 5 |
| DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | Sep 18, 2023 | 3D geometry, Decoder | Code Available | 2 | 5 |
| Asymmetric Non-local Neural Networks for Semantic Segmentation | Aug 21, 2019 | GPU, Segmentation | Code Available | 2 | 5 |
| Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Mar 8, 2024 | object-detection, Object Detection | Code Available | 2 | 5 |
| A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence | May 24, 2023 | Dense Pixel Correspondence Estimation, Representation Learning | Code Available | 2 | 5 |
| RevSAM2: Prompt SAM2 for Medical Image Segmentation via Reverse-Propagation without Fine-tuning | Sep 6, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 | 5 |
| Full Page Handwriting Recognition via Image to Sequence Extraction | Mar 11, 2021 | Handwriting Recognition, Handwritten Text Recognition | Code Available | 2 | 5 |
| ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding | Oct 17, 2024 | 3D Semantic Segmentation, Image Generation | Code Available | 2 | 5 |
| GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | Jun 3, 2024 | 3D Object Detection, Image-to-Image Translation | Code Available | 2 | 5 |
| Generative Active Learning for Long-tailed Instance Segmentation | Jun 4, 2024 | Active Learning, Instance Segmentation | Code Available | 2 | 5 |
| Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes | Aug 30, 2024 | Deep Learning, Image Segmentation | Code Available | 2 | 5 |
| DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution | Jun 3, 2020 | Instance Segmentation, Object | Code Available | 2 | 5 |
| GLaMM: Pixel Grounding Large Multimodal Model | Nov 6, 2023 | Conversational Question Answering, Image Captioning | Code Available | 2 | 5 |
| Global Context Vision Transformers | Jun 20, 2022 | image-classification, Image Classification | Code Available | 2 | 5 |
| GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Apr 10, 2025 | Contrastive Learning, Language Modeling | Code Available | 2 | 5 |
| Golden Cudgel Network for Real-Time Semantic Segmentation | Mar 5, 2025 | Real-Time Semantic Segmentation, Semantic Segmentation | Code Available | 2 | 5 |
| GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs | May 10, 2024 | graph construction, image-classification | Code Available | 2 | 5 |
| Digital Twin Generation from Visual Data: A Survey | Apr 17, 2025 | Semantic Segmentation, Survey | Code Available | 2 | 5 |