| DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge | Jul 6, 2025 | Image Generation, Multimodal Reasoning | Code Available | 3 |
| Geometry-aware 4D Video Generation for Robot Manipulation | Jul 1, 2025 | Robot Manipulation, Video Generation | Unverified | 0 |
| CapsDT: Diffusion-Transformer for Capsule Robot Manipulation | Jun 19, 2025 | Diagnostic, Robot Manipulation | Unverified | 0 |
| Robust Instant Policy: Leveraging Student's t-Regression Model for Robust In-context Imitation Learning of Robot Manipulation | Jun 18, 2025 | Hallucination, Imitation Learning | Unverified | 0 |
| SENIOR: Efficient Query Selection and Preference-Guided Exploration in Preference-based Reinforcement Learning | Jun 17, 2025 | Density Estimation, Robot Manipulation | Unverified | 0 |
| What Matters in Learning from Large-Scale Datasets for Robot Manipulation | Jun 16, 2025 | Diversity, Imitation Learning | Unverified | 0 |
| Demonstrating Multi-Suction Item Picking at Scale via Multi-Modal Learning of Pick Success | Jun 12, 2025 | Robot Manipulation, Semantic Segmentation | Unverified | 0 |
| BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models | Jun 9, 2025 | Robot Manipulation, Vision-Language-Action | Unverified | 0 |
| 3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model | Jun 6, 2025 | Optical Flow Estimation, Robot Manipulation | Code Available | 1 |
| OG-VLA: 3D-Aware Vision Language Action Model via Orthographic Image Generation | Jun 1, 2025 | Image Generation, Large Language Model | Unverified | 0 |