| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2kLanguage Modeling | CodeCode Available | 5 | 5 |
| PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery | Jun 16, 2024 | DecoderEarth Observation | CodeCode Available | 5 | 5 |
| OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding | Jun 27, 2024 | DecoderSegmentation | CodeCode Available | 5 | 5 |
| The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation | Apr 7, 2025 | Inference OptimizationReferring Video Object Segmentation | CodeCode Available | 5 | 5 |
| SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More | Aug 8, 2024 | Image SegmentationMedical Image Segmentation | CodeCode Available | 5 | 5 |
| FeatUp: A Model-Agnostic Framework for Features at Any Resolution | Mar 15, 2024 | Depth EstimationDepth Prediction | CodeCode Available | 5 | 5 |
| Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model | Jun 27, 2024 | MambaSegmentation | CodeCode Available | 5 | 5 |
| Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively | Jan 5, 2024 | image-classificationImage Classification | CodeCode Available | 5 | 5 |
| Segment Anything | Apr 5, 2023 | Event-based Object SegmentationImage Segmentation | CodeCode Available | 5 | 5 |
| Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Aug 23, 2024 | Image SegmentationSegmentation | CodeCode Available | 5 | 5 |