| SAM 2: Segment Anything in Images and Videos | Aug 1, 2024 | Image Segmentation, Robot Manipulation Generalization | Code Available | 12 |
| ViLLa: Video Reasoning Segmentation with Large Language Model | Jul 18, 2024 | Image Segmentation, Language Modeling | Code Available | 1 |
| FoodMem: Near Real-time and Precise Food Video Segmentation | Jul 16, 2024 | Segmentation, Semantic Segmentation | Unverified | 0 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 |
| General and Task-Oriented Video Segmentation | Jul 9, 2024 | Disentanglement, Diversity | Code Available | 1 |
| Uni-DVPS: Unified Model for Depth-Aware Video Panoptic Segmentation | Jul 1, 2024 | Autonomous Driving, Decoder | Code Available | 1 |
| DaBiT: Depth and Blur informed Transformer for Joint Refocusing and Super-Resolution | Jul 1, 2024 | Deblurring, Super-Resolution | Code Available | 0 |
| Deep Unfolding-Aided Parameter Tuning for Plug-and-Play-Based Video Snapshot Compressive Imaging | Jun 28, 2024 | Denoising, Video Segmentation | Unverified | 0 |
| MissionGNN: Hierarchical Multimodal GNN-based Weakly Supervised Video Anomaly Recognition with Mission-Specific Knowledge Graph Generation | Jun 27, 2024 | Anomaly Detection, Graph Generation | Unverified | 0 |
| PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Jun 24, 2024 | Segmentation, Semantic Segmentation | Code Available | 4 |