| Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions | Jan 7, 2024 | BenchmarkingImage Segmentation | CodeCode Available | 5 |
| YOLOR-Based Multi-Task Learning | Sep 29, 2023 | Image CaptioningInstance Segmentation | CodeCode Available | 5 |
| The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Apr 7, 2025 | Inference OptimizationReferring Video Object Segmentation | CodeCode Available | 5 |
| SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More | Aug 8, 2024 | Image SegmentationMedical Image Segmentation | CodeCode Available | 5 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2kLanguage Modeling | CodeCode Available | 5 |
| RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Dec 14, 2022 | GPUInstance Segmentation | CodeCode Available | 4 |
| Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | May 27, 2022 | Image ClassificationInstance Segmentation | CodeCode Available | 4 |
| GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object DetectionContrastive Learning | CodeCode Available | 4 |
| SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Oct 21, 2024 | Heuristic SearchObject | CodeCode Available | 4 |
| LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Dec 28, 2023 | Instance SegmentationLanguage Modeling | CodeCode Available | 4 |
| OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Feb 27, 2025 | Image ClassificationInstance Segmentation | CodeCode Available | 4 |
| Highly Accurate Dichotomous Image Segmentation | Mar 6, 2022 | 2k3D Reconstruction | CodeCode Available | 4 |
| Panoptic Feature Pyramid Networks | Jan 8, 2019 | Instance SegmentationPanoptic Segmentation | CodeCode Available | 4 |
| PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Jun 24, 2024 | SegmentationSemantic Segmentation | CodeCode Available | 4 |
| Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering | Jan 12, 2024 | 3D Panoptic Segmentation3D Semantic Segmentation | CodeCode Available | 4 |
| Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications | Jan 11, 2024 | image-classificationImage Classification | CodeCode Available | 4 |
| Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation | Feb 7, 2024 | Cardiac SegmentationComputational Efficiency | CodeCode Available | 4 |
| EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything | Dec 1, 2023 | Decoderimage-classification | CodeCode Available | 4 |
| Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Jun 6, 2022 | Image SegmentationInstance Segmentation | CodeCode Available | 4 |
| Attention on the Sphere | May 16, 2025 | Depth EstimationImage Segmentation | CodeCode Available | 4 |
| LSKNet: A Foundation Lightweight Backbone for Remote Sensing | Mar 18, 2024 | Change Detectionobject-detection | CodeCode Available | 4 |
| Deep Residual Learning for Image Recognition | Dec 10, 2015 | Classification | CodeCode Available | 4 |
| EmbodiedSAM: Online Segment Any 3D Thing in Real Time | Aug 21, 2024 | 3D Instance SegmentationGPU | CodeCode Available | 4 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous VehiclesData Visualization | CodeCode Available | 4 |
| InstanceDiffusion: Instance-level Control for Image Generation | Feb 5, 2024 | Conditional Text-to-Image SynthesisImage Generation | CodeCode Available | 4 |