| WRT-SAM: Foundation Model-Driven Segmentation for Generalized Weld Radiographic Testing | Feb 17, 2025 | Anomaly Detection, Image Segmentation | Unverified | 0 |
| Salience-Invariant Consistent Policy Learning for Generalization in Visual Reinforcement Learning | Feb 12, 2025 | Zero-shot Generalization | Unverified | 0 |
| Mechanistic Understandings of Representation Vulnerabilities and Engineering Robust Vision Transformers | Feb 7, 2025 | Zero-shot Generalization | Unverified | 0 |
| LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models | Feb 6, 2025 | zero-shot-classification, Zero-shot Generalization | Code Available | 1 |
| SimSort: A Data-Driven Framework for Spike Sorting by Large-Scale Electrophysiology Simulation | Feb 5, 2025 | Spike Sorting, Zero-shot Generalization | Unverified | 0 |
| Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning | Feb 3, 2025 | Meta Reinforcement Learning, reinforcement-learning | Unverified | 0 |
| FlexiCrackNet: A Flexible Pipeline for Enhanced Crack Segmentation with General Features Transfered from SAM | Jan 31, 2025 | Computational Efficiency, Crack Segmentation | Unverified | 0 |
| Test-time Loss Landscape Adaptation for Zero-Shot Generalization in Vision-Language Models | Jan 31, 2025 | Domain Generalization, Test-time Adaptation | Unverified | 0 |
| A Zero-Shot Generalization Framework for LLM-Driven Cross-Domain Sequential Recommendation | Jan 31, 2025 | Sequential Recommendation, Transfer Learning | Unverified | 0 |
| DynaPrompt: Dynamic Test-Time Prompt Tuning | Jan 27, 2025 | Zero-shot Generalization | Unverified | 0 |
| Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks | Jan 23, 2025 | Trajectory Planning, Zero-shot Generalization | Unverified | 0 |
| State Combinatorial Generalization In Decision Making With Conditional Diffusion Models | Jan 22, 2025 | Decision Making, Reinforcement Learning (RL) | Unverified | 0 |
| Survey on Monocular Metric Depth Estimation | Jan 21, 2025 | 3D Reconstruction, Autonomous Driving | Unverified | 0 |
| MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching | Jan 20, 2025 | Keypoint Detection, Zero-shot Generalization | Unverified | 0 |
| Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective | Jan 19, 2025 | Automated Theorem Proving, Math | Unverified | 0 |
| Zero-Shot Monocular Scene Flow Estimation in the Wild | Jan 17, 2025 | Depth Estimation, Prediction | Unverified | 0 |
| FoundationStereo: Zero-Shot Stereo Matching | Jan 17, 2025 | Depth Estimation, Diversity | Code Available | 7 |
| DEFOM-Stereo: Depth Foundation Model Based Stereo Matching | Jan 16, 2025 | Depth Estimation, Disparity Estimation | Code Available | 3 |
| MonSter: Marry Monodepth to Stereo Unleashes Power | Jan 15, 2025 | Depth Estimation, Monocular Depth Estimation | Code Available | 4 |
| StereoGen: High-quality Stereo Image Generation from a Single Image | Jan 15, 2025 | Depth Estimation, Image Generation | Unverified | 0 |
| Capability-Aware Shared Hypernetworks for Flexible Heterogeneous Multi-Robot Coordination | Jan 10, 2025 | Diversity, Imitation Learning | Code Available | 0 |
| Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence | Jan 9, 2025 | Change Detection, Zero-shot Generalization | Code Available | 1 |
| Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation | Jan 8, 2025 | Code Generation, Language Modeling | Unverified | 0 |
| MADation: Face Morphing Attack Detection with Foundation Models | Jan 7, 2025 | Face Morphing Attack Detection, Face Recognition | Code Available | 0 |
| Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera | Jan 5, 2025 | Data Augmentation, Depth Estimation | Code Available | 3 |
| Spot Risks Before Speaking! Unraveling Safety Attention Heads in Large Vision-Language Models | Jan 3, 2025 | Zero-shot Generalization | Code Available | 0 |
| On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach | Jan 1, 2025 | Adversarial Robustness, Zero-shot Generalization | Unverified | 0 |
| OW-OVD: Unified Open World and Open Vocabulary Object Detection | Jan 1, 2025 | Attribute, Incremental Learning | Code Available | 1 |
| On the Out-Of-Distribution Generalization of Large Multimodal Models | Jan 1, 2025 | In-Context Learning, Out-of-Distribution Generalization | Unverified | 0 |
| FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images | Jan 1, 2025 | 3D Canonicalization, Zero-shot Generalization | Code Available | 1 |
| From Pixels to Predicates: Learning Symbolic World Models via Pretrained Vision-Language Models | Dec 31, 2024 | Decision Making, Zero-shot Generalization | Unverified | 0 |
| EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation | Dec 25, 2024 | Object, Zero-shot Generalization | Unverified | 0 |
| Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization | Dec 24, 2024 | Image Segmentation, Semantic Segmentation | Code Available | 1 |
| Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio | Dec 23, 2024 | Contrastive Learning, Prompt Learning | Unverified | 0 |
| Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees | Dec 21, 2024 | Graph Neural Network, In-Context Learning | Code Available | 0 |
| CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up | Dec 20, 2024 | 8k, GPU | Code Available | 3 |
| Zero-Shot Generalization for Blockage Localization in mmWave Communication | Dec 18, 2024 | Self-Supervised Learning, Zero-shot Generalization | Unverified | 0 |
| Memorizing SAM: 3D Medical Segment Anything Model with Memorizing Transformer | Dec 18, 2024 | Image Segmentation, Medical Image Analysis | Code Available | 0 |
| Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion | Dec 18, 2024 | Denoising, Depth Completion | Unverified | 0 |
| Efficient Fine-Tuning of Single-Cell Foundation Models Enables Zero-Shot Molecular Perturbation Prediction | Dec 18, 2024 | Drug Discovery, Zero-shot Generalization | Unverified | 0 |
| Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning | Dec 17, 2024 | Denoising | Code Available | 2 |
| WiFo: Wireless Foundation Model for Channel Prediction | Dec 12, 2024 | model, Multi-Task Learning | Unverified | 0 |
| Towards Open-Vocabulary Video Semantic Segmentation | Dec 12, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | Dec 12, 2024 | Image Comprehension, Image Generation | Unverified | 0 |
| Disentanglement and Compositionality of Letter Identity and Letter Position in Variational Auto-Encoder Vision Models | Dec 11, 2024 | Disentanglement, Position | Unverified | 0 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation | Dec 11, 2024 | Mamba, Segmentation | Code Available | 1 |
| Lightweight Method for Interactive 3D Medical Image Segmentation with Multi-Round Result Fusion | Dec 11, 2024 | GPU, Image Segmentation | Code Available | 0 |
| ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning | Dec 10, 2024 | Evolutionary Algorithms, Lifelong learning | Code Available | 0 |
| Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation | Dec 9, 2024 | Domain Adaptation, Image Segmentation | Code Available | 1 |