| VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning | Jan 12, 2025 | Dense Video Captioning, Video Captioning | Code Available | 1 |
| Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation | Jan 12, 2025 | Image Retrieval, Image Segmentation | —Unverified | 0 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2k, Language Modeling | Code Available | 5 |
| Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy | Jan 6, 2025 | Video Segmentation, Video Semantic Segmentation | Code Available | 0 |
| EntitySAM: Segment Everything in Video | Jan 1, 2025 | Decoder, Object | —Unverified | 0 |
| Decoupled Motion Expression Video Segmentation | Jan 1, 2025 | Instance Segmentation, Referring Video Object Segmentation | —Unverified | 0 |
| VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos | Jan 1, 2025 | Large Language Model, Video Segmentation | —Unverified | 0 |
| HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver | Jan 1, 2025 | Reasoning Segmentation, Segmentation | Code Available | 2 |
| Is Segment Anything Model 2 All You Need for Surgery Video Segmentation? A Systematic Evaluation | Dec 31, 2024 | All, Segmentation | —Unverified | 0 |
| Generative Video Propagation | Dec 27, 2024 | Image to Video Generation, Video Generation | —Unverified | 0 |