| ARC-Calib: Autonomous Markerless Camera-to-Robot Calibration via Exploratory Robot Motions | Mar 18, 2025 | ARC, Robot Manipulation | Unverified | 0 |
| ReBot: Scaling Robot Learning with Real-to-Sim-to-Real Robotic Video Synthesis | Mar 15, 2025 | Domain Generalization, Robot Manipulation | Unverified | 0 |
| Is Your Imitation Learning Policy Better than Mine? Policy Comparison with Near-Optimal Stopping | Mar 14, 2025 | Imitation Learning, Robot Manipulation | Unverified | 0 |
| Deformable Linear Object Surface Placement Using Elastica Planning and Local Shape Control | Mar 11, 2025 | Robot Manipulation | Unverified | 0 |
| Combined Physics and Event Camera Simulator for Slip Detection | Mar 5, 2025 | Robot Manipulation | Code Available | 0 |
| A Taxonomy for Evaluating Generalist Robot Policies | Mar 3, 2025 | Robot Manipulation, Vision-Language-Action | Unverified | 0 |
| Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning | Mar 2, 2025 | Large Language Model, Multi-Instance Retrieval | Code Available | 1 |
| Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids | Feb 27, 2025 | Contact-rich Manipulation, reinforcement-learning | Unverified | 0 |
| Enhancing Reusability of Learned Skills for Robot Manipulation via Gaze and Bottleneck | Feb 25, 2025 | Imitation Learning, Object | Unverified | 0 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| Magma: A Foundation Model for Multimodal AI Agents | Feb 18, 2025 | Autonomous Web Navigation, Image to text | Code Available | 5 |
| SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation | Feb 18, 2025 | Object Rearrangement, Robot Manipulation | Code Available | 3 |
| Robot Deformable Object Manipulation via NMPC-generated Demonstrations in Deep Reinforcement Learning | Feb 17, 2025 | Deep Reinforcement Learning, Deformable Object Manipulation | Unverified | 0 |
| S^2-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation | Feb 13, 2025 | Depth Estimation, Robot Manipulation | Unverified | 0 |
| COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping | Feb 12, 2025 | Reinforcement Learning (RL), Robot Manipulation | Unverified | 0 |
| Select before Act: Spatially Decoupled Action Repetition for Continuous Control | Feb 10, 2025 | continuous-control, Continuous Control | Unverified | 0 |
| HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation | Feb 8, 2025 | Robot Manipulation, Vision-Language-Action | Unverified | 0 |
| AnyPlace: Learning Generalized Object Placement for Robot Manipulation | Feb 6, 2025 | Object, Pose Prediction | Unverified | 0 |
| Rethinking Latent Redundancy in Behavior Cloning: An Information Bottleneck Approach for Robot Manipulation | Feb 5, 2025 | Imitation Learning, Robot Manipulation | Unverified | 0 |
| UP-VLA: A Unified Understanding and Prediction Model for Embodied Agent | Jan 31, 2025 | Robot Manipulation, Vision-Language-Action | Unverified | 0 |
| Strong and Controllable 3D Motion Generation | Jan 30, 2025 | Motion Generation, Robot Manipulation | Unverified | 0 |
| SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Model | Jan 27, 2025 | Robot Manipulation | Unverified | 0 |
| Beyond Sight: Finetuning Generalist Robot Policies with Heterogeneous Sensors via Language Grounding | Jan 8, 2025 | Robot Manipulation, Text Generation | Unverified | 0 |
| Learning to Transfer Human Hand Skills for Robot Manipulations | Jan 7, 2025 | Object, Robot Manipulation | Unverified | 0 |
| 3D-MVP: 3D Multiview Pretraining for Manipulation | Jan 1, 2025 | Decoder, Robot Manipulation | Unverified | 0 |
| Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding | Dec 31, 2024 | Robot Manipulation, Scene Understanding | Unverified | 0 |
| Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations | Dec 19, 2024 | Contrastive Learning, Image Reconstruction | Code Available | 3 |
| Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models | Dec 18, 2024 | Representation Learning, Robot Manipulation | Code Available | 3 |
| RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation | Dec 18, 2024 | Diversity, Imitation Learning | Unverified | 0 |
| Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning | Dec 17, 2024 | Denoising | Code Available | 2 |
| Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning | Dec 16, 2024 | Hallucination, Robot Manipulation | Code Available | 2 |
| ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation? | Dec 13, 2024 | Robot Manipulation | Unverified | 0 |
| TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies | Dec 13, 2024 | Robot Manipulation, Vision-Language-Action | Unverified | 0 |
| Grasp Diffusion Network: Learning Grasp Generators from Partial Point Clouds with Diffusion Models in SO(3)xR3 | Dec 11, 2024 | Collision Avoidance, Robot Manipulation | Unverified | 0 |
| P3-PO: Prescriptive Point Priors for Visuo-Spatial Generalization of Robot Policies | Dec 9, 2024 | Out-of-Distribution Generalization, Robot Manipulation | Unverified | 0 |
| Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos | Dec 5, 2024 | Robot Manipulation | Code Available | 2 |
| Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning | Dec 4, 2024 | Data Augmentation, Imitation Learning | Unverified | 0 |
| From Mystery to Mastery: Failure Diagnosis for Improving Manipulation Policies | Dec 3, 2024 | Deep Reinforcement Learning, Robot Manipulation | Unverified | 0 |
| Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control | Dec 2, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| Prediction with Action: Visual Policy Learning via Joint Denoising Process | Nov 27, 2024 | Denoising, Image Generation | Unverified | 0 |
| RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics | Nov 25, 2024 | Robot Manipulation, Scene Understanding | Unverified | 0 |
| Rethinking the Intermediate Features in Adversarial Attacks: Misleading Robotic Models via Adversarial Distillation | Nov 21, 2024 | Robot Manipulation | Unverified | 0 |
| Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation | Nov 18, 2024 | Knowledge Graphs, Robot Manipulation | Code Available | 0 |
| VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation | Nov 14, 2024 | Denoising, Robot Manipulation | Unverified | 0 |
| ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Nov 6, 2024 | Imitation Learning, Robot Manipulation | Unverified | 0 |
| RT-Affordance: Affordances are Versatile Intermediate Representations for Robot Manipulation | Nov 5, 2024 | Robot Manipulation | Unverified | 0 |
| Improving Trust Estimation in Human-Robot Collaboration Using Beta Reputation at Fine-grained Timescales | Nov 4, 2024 | Bayesian Inference, Behavioural cloning | Code Available | 0 |
| DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Nov 4, 2024 | GPU, Robot Manipulation | Code Available | 2 |
| Non-rigid Relative Placement through 3D Dense Diffusion | Oct 25, 2024 | Object, Robot Manipulation | Unverified | 0 |
| Learning to Look: Seeking Information for Decision Making via Policy Factorization | Oct 24, 2024 | Decision Making, Robot Manipulation | Unverified | 0 |