| Generative Image as Action Models | Jul 10, 2024 | Image Generation, Robot Manipulation | Code Available | 2 | 5 |
| Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation | Dec 20, 2023 | Robot Manipulation, Zero-shot Generalization | Code Available | 2 | 5 |
| VIMA: General Robot Manipulation with Multimodal Prompts | Oct 6, 2022 | Imitation Learning, Language Modelling | Code Available | 2 | 5 |
| VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models | Jul 12, 2023 | Form, Language Modelling | Code Available | 2 | 5 |
| Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning | Dec 17, 2024 | Denoising | Code Available | 2 | 5 |
| Streaming Diffusion Policy: Fast Policy Synthesis with Variable Noise Diffusion Models | Jun 7, 2024 | Denoising, Image Generation | Code Available | 2 | 5 |
| DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Nov 4, 2024 | GPU, Robot Manipulation | Code Available | 2 | 5 |
| RVT: Robotic View Transformer for 3D Object Manipulation | Jun 26, 2023 | Object, Robot Manipulation | Code Available | 2 | 5 |
| RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control | Jul 28, 2023 | Object, Question Answering | Code Available | 2 | 5 |
| Autoregressive Action Sequence Learning for Robotic Manipulation | Oct 4, 2024 | Chunking, Language Modeling | Code Available | 2 | 5 |
| Robot Trajectron: Trajectory Prediction-based Shared Control for Robot Manipulation | Feb 4, 2024 | Position, Robot Manipulation | Code Available | 2 | 5 |
| Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy | Oct 2, 2024 | Motion Planning, Robot Manipulation | Code Available | 2 | 5 |
| AutoEval: Autonomous Evaluation of Generalist Robot Manipulation Policies in the Real World | Mar 31, 2025 | Robot Manipulation, Scheduling | Code Available | 2 | 5 |
| Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation | Sep 12, 2022 | Robot Manipulation, Robot Manipulation Generalization | Code Available | 2 | 5 |
| Act3D: 3D Feature Field Transformers for Multi-Task Robotic Manipulation | Jun 30, 2023 | Action Detection, Pose Prediction | Code Available | 2 | 5 |
| RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulation | Jun 27, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| R3M: A Universal Visual Representation for Robot Manipulation | Mar 23, 2022 | Contrastive Learning, Robot Manipulation | Code Available | 2 | 5 |
| Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos | Dec 5, 2024 | Robot Manipulation | Code Available | 2 | 5 |
| FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation | May 22, 2023 | Imitation Learning, Motion Planning | Code Available | 2 | 5 |
| SE(3)-DiffusionFields: Learning smooth cost functions for joint grasp and motion optimization through diffusion | Sep 8, 2022 | Motion Planning, Robot Manipulation | Code Available | 2 | 5 |
| An Embodied Generalist Agent in 3D World | Nov 18, 2023 | 3D dense captioning, 3D Question Answering (3D-QA) | Code Available | 2 | 5 |
| ABNet: Attention BarrierNet for Safe and Scalable Robot Learning | Jun 18, 2024 | Autonomous Driving, Robot Manipulation | Code Available | 1 | 5 |
| DrS: Learning Reusable Dense Rewards for Multi-Stage Tasks | Apr 25, 2024 | Robot Manipulation | Code Available | 1 | 5 |
| GUARD: A Safe Reinforcement Learning Benchmark | May 23, 2023 | Autonomous Driving, Diversity | Code Available | 1 | 5 |
| BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models | Aug 1, 2021 | 3D Object Tracking, 6D Pose Estimation | Code Available | 1 | 5 |