| GLaMM: Pixel Grounding Large Multimodal Model | Nov 6, 2023 | Conversational Question Answering, Image Captioning | Code Available | 2 |
| Global Context Vision Transformers | Jun 20, 2022 | image-classification, Image Classification | Code Available | 2 |
| Fast Online Object Tracking and Segmentation: A Unifying Approach | Dec 12, 2018 | Object, Object Tracking | Code Available | 2 |
| GraCo: Granularity-Controllable Interactive Segmentation | May 1, 2024 | Interactive Segmentation, Segmentation | Code Available | 2 |
| Generative Active Learning for Long-tailed Instance Segmentation | Jun 4, 2024 | Active Learning, Instance Segmentation | Code Available | 2 |
| DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks | Sep 10, 2024 | Contrastive Learning, Image Reconstruction | Code Available | 2 |
| Hierarchical Open-vocabulary Universal Image Segmentation | Jul 3, 2023 | Image Comprehension, Image Segmentation | Code Available | 2 |
| Delivering Arbitrary-Modal Semantic Segmentation | Mar 2, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 |
| Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery | Mar 18, 2024 | Instance Segmentation, NeRF | Code Available | 2 |
| Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery | Apr 3, 2025 | Field Boundary Delineation, Instance Segmentation | Code Available | 2 |