| A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs | May 19, 2025 | Machine Translation, named-entity-recognition | Code Available | 0 |
| AoP-SAM: Automation of Prompts for Efficient Segmentation | May 17, 2025 | Image Segmentation, Prompt Engineering | Unverified | 0 |
| RVTBench: A Benchmark for Visual Reasoning Tasks | May 17, 2025 | Reasoning Segmentation, Visual Question Answering (VQA) | Code Available | 0 |
| GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction | May 16, 2025 | General Knowledge, Zero-shot Generalization | Code Available | 0 |
| NVSPolicy: Adaptive Novel-View Synthesis for Generalizable Language-Conditioned Policy Learning | May 15, 2025 | Novel View Synthesis, Robot Manipulation | Unverified | 0 |
| Depth Anything with Any Prior | May 15, 2025 | Depth Completion, Depth Estimation | Unverified | 0 |
| Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing | May 14, 2025 | cross-modal alignment, Denoising | Unverified | 0 |
| Visual Image Reconstruction from Brain Activity via Latent Representation | May 13, 2025 | Early Classification, Image Reconstruction | Unverified | 0 |
| Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence | May 11, 2025 | Computational Efficiency, Federated Learning | Unverified | 0 |
| Learning Graph Representation of Agent Diffusers | May 10, 2025 | Graph Neural Network, Image Generation | Code Available | 0 |
| Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization | May 8, 2025 | Object Localization, Weakly-Supervised Object Localization | Unverified | 0 |
| TeDA: Boosting Vision-Lanuage Models for Zero-Shot 3D Object Retrieval via Testing-time Distribution Alignment | May 5, 2025 | 3D Object Retrieval, Language Modeling | Code Available | 0 |
| A Review of 3D Object Detection with Vision-Language Models | Apr 25, 2025 | 3D Object Detection, Object | Unverified | 0 |
| Text-to-Decision Agent: Learning Generalist Policies from Natural Language Supervision | Apr 21, 2025 | MuJoCo, Zero-shot Generalization | Unverified | 0 |
| Dysarthria Normalization via Local Lie Group Transformations for Robust ASR | Apr 16, 2025 | Robust Speech Recognition, speech-recognition | Code Available | 0 |
| Evolutionary Prompt Optimization Discovers Emergent Multimodal Reasoning Strategies in Vision-Language Models | Mar 30, 2025 | Image Segmentation, Language Modeling | Unverified | 0 |
| Zero-shot Domain Generalization of Foundational Models for 3D Medical Image Segmentation: An Experimental Study | Mar 28, 2025 | Domain Generalization, Image Segmentation | Unverified | 0 |
| Unpaired Object-Level SAR-to-Optical Image Translation for Aircraft with Keypoints-Guided Diffusion Models | Mar 25, 2025 | Translation, Zero-shot Generalization | Unverified | 0 |
| Thinking agents for zero-shot generalization to qualitatively novel tasks | Mar 25, 2025 | Zero-shot Generalization | Unverified | 0 |
| Aether: Geometric-Aware Unified World Modeling | Mar 24, 2025 | Dynamic Reconstruction, Prediction | Unverified | 0 |
| Enhancing Zero-Shot Image Recognition in Vision-Language Models through Human-like Concept Guidance | Mar 20, 2025 | Prompt Engineering, Zero-shot Generalization | Unverified | 0 |
| Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation | Mar 20, 2025 | Depth Estimation, Image Reconstruction | Unverified | 0 |
| GenM^3: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation | Mar 19, 2025 | Large Language Model, Motion Generation | Unverified | 0 |
| Good Actions Succeed, Bad Actions Generalize: A Case Study on Why RL Generalizes Better | Mar 19, 2025 | Attribute, Reinforcement Learning (RL) | Unverified | 0 |
| Learning with Expert Abstractions for Efficient Multi-Task Continuous Control | Mar 19, 2025 | continuous-control, Continuous Control | Code Available | 0 |
| Foundation Feature-Driven Online End-Effector Pose Estimation: A Marker-Free and Learning-Free Approach | Mar 18, 2025 | 6D Pose Estimation, Pose Estimation | Unverified | 0 |
| Compound Expression Recognition via Large Vision-Language Models | Mar 14, 2025 | Emotion Recognition, Zero-shot Generalization | Unverified | 0 |
| Prompt-OT: An Optimal Transport Regularization Paradigm for Knowledge Preservation in Vision-Language Model Adaptation | Mar 11, 2025 | Domain Generalization, Language Modeling | Code Available | 0 |
| A Recipe for Improving Remote Sensing VLM Zero Shot Generalization | Mar 10, 2025 | Cross-Modal Retrieval, Zero-Shot Cross-Modal Retrieval | Unverified | 0 |
| PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM | Mar 10, 2025 | Decoder, Pose Estimation | Unverified | 0 |
| OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction | Mar 5, 2025 | Vision-Language-Action, Zero-shot Generalization | Unverified | 0 |
| RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks | Mar 4, 2025 | Multi-Agent Path Finding, Zero-shot Generalization | Unverified | 0 |
| Re-Imagining Multimodal Instruction Tuning: A Representation View | Mar 2, 2025 | Instruction Following, MME | Code Available | 0 |
| Contrastive Learning of English Language and Crystal Graphs for Multimodal Representation of Materials Knowledge | Feb 23, 2025 | Contrastive Learning, Zero-shot Generalization | Unverified | 0 |
| Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models | Feb 20, 2025 | Reinforcement Learning (RL), Zero-shot Generalization | Unverified | 0 |
| GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization | Feb 19, 2025 | Zero-shot Generalization | Code Available | 0 |
| WRT-SAM: Foundation Model-Driven Segmentation for Generalized Weld Radiographic Testing | Feb 17, 2025 | Anomaly Detection, Image Segmentation | Unverified | 0 |
| Salience-Invariant Consistent Policy Learning for Generalization in Visual Reinforcement Learning | Feb 12, 2025 | Zero-shot Generalization | Unverified | 0 |
| Mechanistic Understandings of Representation Vulnerabilities and Engineering Robust Vision Transformers | Feb 7, 2025 | Zero-shot Generalization | Unverified | 0 |
| SimSort: A Data-Driven Framework for Spike Sorting by Large-Scale Electrophysiology Simulation | Feb 5, 2025 | Spike Sorting, Zero-shot Generalization | Unverified | 0 |
| Toward Task Generalization via Memory Augmentation in Meta-Reinforcement Learning | Feb 3, 2025 | Meta Reinforcement Learning, reinforcement-learning | Unverified | 0 |
| FlexiCrackNet: A Flexible Pipeline for Enhanced Crack Segmentation with General Features Transfered from SAM | Jan 31, 2025 | Computational Efficiency, Crack Segmentation | Unverified | 0 |
| Test-time Loss Landscape Adaptation for Zero-Shot Generalization in Vision-Language Models | Jan 31, 2025 | Domain Generalization, Test-time Adaptation | Unverified | 0 |
| A Zero-Shot Generalization Framework for LLM-Driven Cross-Domain Sequential Recommendation | Jan 31, 2025 | Sequential Recommendation, Transfer Learning | Unverified | 0 |
| DynaPrompt: Dynamic Test-Time Prompt Tuning | Jan 27, 2025 | Zero-shot Generalization | Unverified | 0 |
| Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks | Jan 23, 2025 | Trajectory Planning, Zero-shot Generalization | Unverified | 0 |
| State Combinatorial Generalization In Decision Making With Conditional Diffusion Models | Jan 22, 2025 | Decision Making, Reinforcement Learning (RL) | Unverified | 0 |
| Survey on Monocular Metric Depth Estimation | Jan 21, 2025 | 3D Reconstruction, Autonomous Driving | Unverified | 0 |
| MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching | Jan 20, 2025 | Keypoint Detection, Zero-shot Generalization | Unverified | 0 |
| Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective | Jan 19, 2025 | Automated Theorem Proving, Math | Unverified | 0 |