| Learning to navigate by distilling visual information and natural language instructions | Jan 1, 2018 | Navigate, Zero-shot Generalization | —Unverified | 0 |
| Learning to Represent State with Perceptual Schemata | Jun 13, 2021 | Zero-shot Generalization | —Unverified | 0 |
| Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains | Feb 24, 2023 | reinforcement-learning, Reinforcement Learning | —Unverified | 0 |
| LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction | Jun 16, 2025 | Instruction Following, Vision-Language-Action | —Unverified | 0 |
| Light Field Diffusion for Single-View Novel View Synthesis | Sep 20, 2023 | Denoising, Novel View Synthesis | —Unverified | 0 |
| LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias | Oct 22, 2024 | 3DGS, Decoder | —Unverified | 0 |
| Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion | Dec 18, 2024 | Denoising, Depth Completion | —Unverified | 0 |
| MASP: Scalable GNN-based Planning for Multi-Agent Navigation | Dec 5, 2023 | Reinforcement Learning (RL), Zero-shot Generalization | —Unverified | 0 |
| Matching options to tasks using Option-Indexed Hierarchical Reinforcement Learning | Jun 12, 2022 | Continual Learning, Hierarchical Reinforcement Learning | —Unverified | 0 |
| F^2Depth: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis | Mar 27, 2024 | Depth Estimation, Indoor Monocular Depth Estimation | —Unverified | 0 |
| SAM^Med: A medical image annotation framework based on large vision model | Jul 11, 2023 | Image Segmentation, Liver Segmentation | —Unverified | 0 |
| Mechanistic Understandings of Representation Vulnerabilities and Engineering Robust Vision Transformers | Feb 7, 2025 | Zero-shot Generalization | —Unverified | 0 |
| MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching | Jan 20, 2025 | Keypoint Detection, Zero-shot Generalization | —Unverified | 0 |
| Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning | Dec 19, 2023 | Diversity, Instruction Following | —Unverified | 0 |
| MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning | Dec 14, 2023 | Decoder, Language Modelling | —Unverified | 0 |
| Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching | Nov 14, 2024 | Depth Estimation, Knowledge Distillation | —Unverified | 0 |
| Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio | Dec 23, 2024 | Contrastive Learning, Prompt Learning | —Unverified | 0 |
| Multi-View Unsupervised Image Generation with Cross Attention Guidance | Dec 7, 2023 | Hard Attention, Image Generation | —Unverified | 0 |
| Neural Attention Memory | Feb 18, 2023 | Few-Shot Learning, Zero-shot Generalization | —Unverified | 0 |
| Neural Field Dynamics Model for Granular Object Piles Manipulation | Nov 1, 2023 | Object, Zero-shot Generalization | —Unverified | 0 |
| NeuralSCF: Neural network self-consistent fields for density functional theory | Jun 22, 2024 | Zero-shot Generalization | —Unverified | 0 |
| NVSPolicy: Adaptive Novel-View Synthesis for Generalizable Language-Conditioned Policy Learning | May 15, 2025 | Novel View Synthesis, Robot Manipulation | —Unverified | 0 |
| On the Evaluation of Generative Robotic Simulations | Oct 10, 2024 | Diversity, text similarity | —Unverified | 0 |
| On the Out-Of-Distribution Generalization of Multimodal Large Language Models | Feb 9, 2024 | In-Context Learning, Out-of-Distribution Generalization | —Unverified | 0 |
| On the Out-Of-Distribution Generalization of Large Multimodal Models | Jan 1, 2025 | In-Context Learning, Out-of-Distribution Generalization | —Unverified | 0 |
| On the Performance of Multimodal Language Models | Oct 4, 2023 | Benchmarking, Binary Classification | —Unverified | 0 |
| On the Use of Linguistic Features for the Evaluation of Generative Dialogue Systems | Apr 13, 2021 | Task-Oriented Dialogue Systems, Zero-shot Generalization | —Unverified | 0 |
| On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach | Jan 1, 2025 | Adversarial Robustness, Zero-shot Generalization | —Unverified | 0 |
| On the Zero-Shot Generalization of Machine-Generated Text Detectors | Oct 8, 2023 | Zero-shot Generalization | —Unverified | 0 |
| OpenSU3D: Open World 3D Scene Understanding using Foundation Models | Jul 19, 2024 | Scene Understanding, Spatial Reasoning | —Unverified | 0 |
| ORQA: A Benchmark and Foundation Model for Holistic Operating Room Modeling | May 19, 2025 | Graph Generation, Knowledge Distillation | —Unverified | 0 |
| OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction | Mar 5, 2025 | Vision-Language-Action, Zero-shot Generalization | —Unverified | 0 |
| Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation | Aug 7, 2024 | Adversarial Robustness, Image Segmentation | —Unverified | 0 |
| PhD Thesis: Exploring the role of (self-)attention in cognitive and computer vision architecture | Jun 26, 2023 | Visual Reasoning, Zero-shot Generalization | —Unverified | 0 |
| PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment | Dec 18, 2022 | Data Augmentation, Dialogue Evaluation | —Unverified | 0 |
| PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM | Mar 10, 2025 | Decoder, Pose Estimation | —Unverified | 0 |
| From Pixels to Predicates: Learning Symbolic World Models via Pretrained Vision-Language Models | Dec 31, 2024 | Decision Making, Zero-shot Generalization | —Unverified | 0 |
| Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization | May 8, 2025 | Object Localization, Weakly-Supervised Object Localization | —Unverified | 0 |
| Program Guided Agent | May 1, 2020 | Minecraft, Zero-shot Generalization | —Unverified | 0 |
| Prompt-based Visual Alignment for Zero-shot Policy Transfer | Jun 5, 2024 | Autonomous Driving, Language Modelling | —Unverified | 0 |
| PromptSync: Bridging Domain Gaps in Vision-Language Models through Class-Aware Prototype Alignment and Discrimination | Apr 11, 2024 | Contrastive Learning, Domain Generalization | —Unverified | 0 |
| RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning | Jun 2, 2022 | Deep Reinforcement Learning, Reinforcement Learning (RL) | —Unverified | 0 |
| RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks | Mar 4, 2025 | Multi-Agent Path Finding, Zero-shot Generalization | —Unverified | 0 |
| RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering | Aug 1, 2020 | Style Transfer, Zero-shot Generalization | —Unverified | 0 |
| Real-Time Anomaly Detection and Reactive Planning with Large Language Models | Jul 11, 2024 | Anomaly Detection, Autonomous Vehicles | —Unverified | 0 |
| Reinforcement Learning of Implicit and Explicit Control Flow in Instructions | Feb 25, 2021 | Minecraft, reinforcement-learning | —Unverified | 0 |
| Revisiting the Robust Generalization of Adversarial Prompt Tuning | May 18, 2024 | Adversarial Robustness, Prompt Learning | —Unverified | 0 |
| RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering | Nov 16, 2021 | Entity Linking, Knowledge Base Question Answering | —Unverified | 0 |
| Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation | Jan 8, 2025 | Code Generation, Language Modeling | —Unverified | 0 |
| Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models | Oct 23, 2023 | Skill Generalization, Zero-shot Generalization | —Unverified | 0 |