| MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training | Nov 23, 2024 | Computed Tomography (CT), Image Segmentation | Code Available | 1 |
| Revisiting the Integration of Convolution and Attention for Vision Backbone | Nov 21, 2024 | Semantic Segmentation, Weakly supervised Semantic Segmentation | Code Available | 1 |
| CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation | Nov 21, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 1 |
| XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation | Nov 20, 2024 | 3D geometry, 3D Semantic Segmentation | Code Available | 1 |
| ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements | Nov 18, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Code Available | 1 |
| RETR: Multi-View Radar Detection Transformer for Indoor Perception | Nov 15, 2024 | Instance Segmentation, object-detection | Code Available | 1 |
| OneNet: A Channel-Wise 1D Convolutional U-Net | Nov 14, 2024 | Decoder, Image Segmentation | Code Available | 1 |
| Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction | Nov 11, 2024 | Autonomous Vehicles, Instance Segmentation | Code Available | 1 |
| Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification | Nov 11, 2024 | Benchmarking, Image Segmentation | Code Available | 1 |
| ZAHA: Introducing the Level of Facade Generalization and the Large-Scale Point Cloud Facade Semantic Segmentation Benchmark Dataset | Nov 7, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts | Nov 6, 2024 | Domain Generalization, Out of Distribution (OOD) Detection | Code Available | 1 |
| LiVOS: Light Video Object Segmentation with Gated Linear Matching | Nov 5, 2024 | GPU, Semantic Segmentation | Code Available | 1 |
| Rethinking Decoders for Transformer-based Semantic Segmentation: A Compression Perspective | Nov 5, 2024 | Decoder, Segmentation | Code Available | 1 |
| Automated Classification of Cell Shapes: A Comparative Evaluation of Shape Descriptors | Nov 1, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 1 |
| MLLA-UNet: Mamba-like Linear Attention in an Efficient U-Shape Model for Medical Image Segmentation | Oct 31, 2024 | Image Segmentation, Mamba | Code Available | 1 |
| Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model | Oct 31, 2024 | Semantic Segmentation, Specificity | Code Available | 1 |
| COSNet: A Novel Semantic Segmentation Network using Enhanced Boundaries in Cluttered Scenes | Oct 31, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation | Oct 29, 2024 | Domain Adaptation, Pseudo Label | Code Available | 1 |
| Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation | Oct 29, 2024 | Cross-Domain Few-Shot, Few-Shot Semantic Segmentation | Code Available | 1 |
| IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks | Oct 28, 2024 | Domain Adaptation, object-detection | Code Available | 1 |
| Unlocking Comics: The AI4VA Dataset for Visual Understanding | Oct 27, 2024 | Depth Estimation, Saliency Detection | Code Available | 1 |
| Fusion-then-Distillation: Toward Cross-modal Positive Distillation for Domain Adaptive 3D Semantic Segmentation | Oct 25, 2024 | 3D Semantic Segmentation, Domain Adaptation | Code Available | 1 |
| Context-Based Visual-Language Place Recognition | Oct 25, 2024 | Semantic Segmentation, Visual Place Recognition | Code Available | 1 |
| Gaze-Assisted Medical Image Segmentation | Oct 23, 2024 | Diagnostic, Image Segmentation | Code Available | 1 |
| ROCKET-1: Mastering Open-World Interaction with Visual-Temporal Context Prompting | Oct 23, 2024 | Decision Making, Minecraft | Code Available | 1 |