| Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Apr 3, 2025 | 3D Object Detection, cross-modal alignment | Code Available | 1 |
| Mean Shift Mask Transformer for Unseen Object Instance Segmentation | Nov 21, 2022 | Clustering, Image Segmentation | Code Available | 1 |
| ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills | Feb 9, 2023 | GPU, Imitation Learning | Code Available | 1 |
| Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning | Mar 2, 2025 | Large Language Model, Multi-Instance Retrieval | Code Available | 1 |
| CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks | Dec 6, 2021 | Continuous Control, Imitation Learning | Code Available | 1 |
| CACTI: A Framework for Scalable Multi-Task Multi-Scene Visual Imitation Learning | Dec 12, 2022 | Data Augmentation, Image Generation | Code Available | 1 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| One-Shot Object Affordance Detection in the Wild | Aug 8, 2021 | Action Recognition, Affordance Detection | Code Available | 1 |
| Motion Policy Networks | Oct 21, 2022 | Motion Generation, Motion Planning | Code Available | 1 |
| BusyBot: Learning to Interact, Reason, and Plan in a BusyBoard Environment | Jul 17, 2022 | Causal Discovery, Robot Manipulation | Code Available | 1 |
| Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation | Jun 23, 2021 | Continuous Control, Q-Learning | Code Available | 1 |
| Coarse-to-fine Q-attention with Tree Expansion | Apr 26, 2022 | Robot Manipulation | Code Available | 1 |
| BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models | Aug 1, 2021 | 3D Object Tracking, 6D Pose Estimation | Code Available | 1 |
| Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation | Jun 15, 2024 | Imitation Learning, Inductive Bias | Code Available | 1 |
| LTLDoG: Satisfying Temporally-Extended Symbolic Constraints for Safe Diffusion-based Planning | May 7, 2024 | Offline RL, Robot Manipulation | Code Available | 1 |
| PixL2R: Guiding Reinforcement Learning Using Natural Language by Mapping Pixels to Rewards | Jul 30, 2020 | reinforcement-learning, Reinforcement Learning (RL) | Code Available | 1 |
| Language-Conditioned Imitation Learning for Robot Manipulation Tasks | Oct 22, 2020 | Imitation Learning, Robot Manipulation | Code Available | 1 |
| Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning | Feb 8, 2022 | continuous-control, Continuous Control | Code Available | 1 |
| Language Reward Modulation for Pretraining Reinforcement Learning | Aug 23, 2023 | reinforcement-learning, Reinforcement Learning | Code Available | 1 |
| HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction | Jun 10, 2024 | 3D Reconstruction, hand-object pose | Code Available | 1 |
| GUARD: A Safe Reinforcement Learning Benchmark | May 23, 2023 | Autonomous Driving, Diversity | Code Available | 1 |
| Instruction-driven history-aware policies for robotic manipulations | Sep 11, 2022 | Robot Manipulation, Robot Manipulation Generalization | Code Available | 1 |
| 3DFlowAction: Learning Cross-Embodiment Manipulation from 3D Flow World Model | Jun 6, 2025 | Optical Flow Estimation, Robot Manipulation | Code Available | 1 |
| Goal-Conditioned Imitation Learning using Score-based Diffusion Policies | Apr 5, 2023 | Denoising, Imitation Learning | Code Available | 1 |
| Generating Annotated Training Data for 6D Object Pose Estimation in Operational Environments with Minimal User Interaction | Mar 17, 2021 | 6D Pose Estimation using RGB, Pose Estimation | Code Available | 1 |