| A Survey on GUI Agents with Foundation Models Enhanced by Reinforcement Learning | Apr 29, 2025 | Action Generation, Prompt Engineering | —Unverified | 0 |
| AssistGUI: Task-Oriented PC Graphical User Interface Automation | Jan 1, 2024 | Action Generation, Language Modeling | —Unverified | 0 |
| VDT-Auto: End-to-end Autonomous Driving with VLM-Guided Diffusion Transformers | Feb 27, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 |
| Marginal Utility for Planning in Continuous or Large Discrete Action Spaces | Jun 10, 2020 | Action Generation | —Unverified | 0 |
| Masked Path Modeling for Vision-and-Language Navigation | May 23, 2023 | Action Generation, Navigate | —Unverified | 0 |
| MemoNav: Selecting Informative Memories for Visual Navigation | Aug 20, 2022 | Action Generation, Graph Attention | —Unverified | 0 |
| Mind with Eyes: from Language Reasoning to Multimodal Reasoning | Mar 23, 2025 | Action Generation, Multimodal Reasoning | —Unverified | 0 |
| Modality Selection and Skill Segmentation via Cross-Modality Attention | Apr 20, 2025 | Action Generation, Contact-rich Manipulation | —Unverified | 0 |
| VITA: Vision-to-Action Flow Matching Policy | Jul 17, 2025 | Action Generation | —Unverified | 0 |
| ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation | Mar 25, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 |