| ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement | Dec 11, 2024 | Decoder, Image Segmentation | Code Available | 2 |
| SegFace: Face Segmentation of Long-Tail Classes | Dec 11, 2024 | Face Parsing, Face Swapping | Code Available | 2 |
| DreamColour: Controllable Video Colour Editing without Training | Dec 6, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 2 |
| Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation | Dec 5, 2024 | Semantic Segmentation, Time Series | Code Available | 2 |
| Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation | Dec 5, 2024 | Image Segmentation, Open Vocabulary Semantic Segmentation | Code Available | 2 |
| SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Dec 5, 2024 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| FLAIR: VLM with Fine-grained Language-informed Image Representations | Dec 4, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| 2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification | Dec 1, 2024 | Computational Efficiency, image-classification | Code Available | 2 |
| TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba | Nov 26, 2024 | image-classification, Image Classification | Code Available | 2 |
| vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation | Nov 26, 2024 | Image Segmentation, Medical Image Analysis | Code Available | 2 |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Nov 26, 2024 | Language Modeling, Large Language Model | Code Available | 2 |
| An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models | Nov 25, 2024 | Denoising, Scene Understanding | Code Available | 2 |
| Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training | Nov 25, 2024 | object-detection, Object Detection | Code Available | 2 |
| Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation | Nov 24, 2024 | Semantic Segmentation | Code Available | 2 |
| ResCLIP: Residual Attention for Training-free Dense Vision-language Inference | Nov 24, 2024 | Attribute, Semantic Segmentation | Code Available | 2 |
| IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos | Nov 18, 2024 | Pose Estimation, Semantic Segmentation | Code Available | 2 |
| CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation | Nov 15, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 2 |
| Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation | Nov 14, 2024 | Segmentation, Semantic Segmentation | Code Available | 2 |
| CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation | Oct 30, 2024 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| Multimodality Helps Few-Shot 3D Point Cloud Semantic Segmentation | Oct 29, 2024 | Few-shot 3D Point Cloud Semantic Segmentation, Point Cloud Segmentation | Code Available | 2 |
| Domain Adaptation with a Single Vision-Language Embedding | Oct 28, 2024 | Domain Adaptation, One-shot Unsupervised Domain Adaptation | Code Available | 2 |
| Moving Object Segmentation in Point Cloud Data using Hidden Markov Models | Oct 24, 2024 | Semantic Segmentation | Code Available | 2 |
| CARLA2Real: a tool for reducing the sim2real gap in CARLA simulator | Oct 23, 2024 | Autonomous Driving, Self-Driving Cars | Code Available | 2 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | Decoder, Instance Segmentation | Code Available | 2 |
| TIPS: Text-Image Pretraining with Spatial Awareness | Oct 21, 2024 | Depth Estimation, Image Captioning | Code Available | 2 |