YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 95
MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Jul 10, 2024 | Image Classification, Instance Segmentation | Code Available | 75
MambaOut: Do We Really Need Mamba for Vision? | May 13, 2024 | image-classification, Image Classification | Code Available | 75
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications | Jun 25, 2023 | CPU, Decoder | Code Available | 55
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities | Jun 13, 2024 | Instance Segmentation, multimodal generation | Code Available | 55
YOLOR-Based Multi-Task Learning | Sep 29, 2023 | Image Captioning, Instance Segmentation | Code Available | 55
Panoptic Feature Pyramid Networks | Jan 8, 2019 | Instance Segmentation, Panoptic Segmentation | Code Available | 45
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything | Dec 1, 2023 | Decoder, image-classification | Code Available | 45
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Feb 27, 2025 | Image Classification, Instance Segmentation | Code Available | 45
Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous Vehicles, Data Visualization | Code Available | 45
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Jun 6, 2022 | Image Segmentation, Instance Segmentation | Code Available | 45
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | May 29, 2022 | Autonomous Driving, CPU | Code Available | 45
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | Nov 10, 2022 | 2D Object Detection, Classification | Code Available | 45
Visual Attention Network | Feb 20, 2022 | image-classification, Image Classification | Code Available | 45
RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Dec 14, 2022 | GPU, Instance Segmentation | Code Available | 45
EmbodiedSAM: Online Segment Any 3D Thing in Real Time | Aug 21, 2024 | 3D Instance Segmentation, GPU | Code Available | 45
InstanceDiffusion: Instance-level Control for Image Generation | Feb 5, 2024 | Conditional Text-to-Image Synthesis, Image Generation | Code Available | 45
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Dec 28, 2023 | Instance Segmentation, Language Modeling | Code Available | 45
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | May 27, 2022 | Image Classification, Instance Segmentation | Code Available | 45
GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object Detection, Contrastive Learning | Code Available | 45
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation | Aug 28, 2023 | Instance Segmentation, Optical Flow Estimation | Code Available | 35
A Simple Framework for Open-Vocabulary Segmentation and Detection | Mar 14, 2023 | Instance Segmentation, Panoptic Segmentation | Code Available | 35
Vision Transformers: From Semantic Segmentation to Dense Prediction | Jul 19, 2022 | image-classification, Image Classification | Code Available | 35
XCiT: Cross-Covariance Image Transformers | Jun 17, 2021 | image-classification, Image Classification | Code Available | 35
Vision Transformer Adapter for Dense Predictions | May 17, 2022 | Instance Segmentation, Object Detection | Code Available | 35
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Mar 13, 2024 | Instance Segmentation, Object Detection | Code Available | 35
DETRs with Collaborative Hybrid Assignments Training | Nov 22, 2022 | Decoder, Instance Segmentation | Code Available | 35
Universal Instance Perception as Object Discovery and Retrieval | Mar 12, 2023 | Described Object Detection, Generalized Referring Expression Comprehension | Code Available | 35
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Mar 26, 2024 | Image Classification, Instance Segmentation | Code Available | 35
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Jun 4, 2024 | 2D Object Detection, 3D Instance Segmentation | Code Available | 35
ResNeSt: Split-Attention Networks | Apr 19, 2020 | image-classification, Image Classification | Code Available | 35
Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Jan 26, 2023 | Instance Segmentation, object-detection | Code Available | 35
No time to train! Training-Free Reference-Based Instance Segmentation | Jul 3, 2025 | Cross-Domain Few-Shot Object Detection, Few-Shot Object Detection | Code Available | 35
Nuclei instance segmentation and classification in histopathology images with StarDist | Mar 3, 2022 | Classification, Instance Segmentation | Code Available | 35
5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Aug 15, 2024 | image-classification, Image Classification | Code Available | 35
InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Aug 28, 2024 | Cell Segmentation, GPU | Code Available | 35
Generalized Decoding for Pixel, Image, and Language | Dec 21, 2022 | Decoder, Image Segmentation | Code Available | 35
Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment | Dec 1, 2023 | Contrastive Learning, Few-Shot Learning | Code Available | 35
A Survey of Camouflaged Object Detection and Beyond | Aug 26, 2024 | Instance Segmentation, Object | Code Available | 35
General Object Foundation Model for Images and Videos at Scale | Dec 14, 2023 | Instance Segmentation, Long-tail Video Object Segmentation | Code Available | 35
UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface | Mar 3, 2025 | Instance Segmentation, Reasoning Segmentation | Code Available | 35
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene Classification, Building change detection for remote sensing images | Code Available | 35
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Jan 9, 2023 | 2D Object Detection, Contrastive Learning | Code Available | 35
OneFormer: One Transformer to Rule Universal Image Segmentation | Nov 10, 2022 | Instance Segmentation, Panoptic Segmentation | Code Available | 35
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning | Mar 29, 2024 | Continual Learning, Continual Panoptic Segmentation | Code Available | 25
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks | Oct 8, 2019 | Dimensionality Reduction, image-classification | Code Available | 25
MogaNet: Multi-order Gated Aggregation Network | Nov 7, 2022 | 3D Human Pose Estimation, Image Classification | Code Available | 25
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Jan 18, 2024 | Instance Segmentation, Interactive Segmentation | Code Available | 25
E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation | Mar 8, 2022 | GPU, Instance Segmentation | Code Available | 25
Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset | Jun 10, 2024 | Instance Segmentation, Salient Object Detection | Code Available | 25