SOTAVerified

MuJoCo

Papers

Showing 251-300 of 677 papers

Title | Status | Hype
Continuous Control With Ensemble Deep Deterministic Policy Gradients | Code | 0
Offline Reinforcement Learning via Inverse Optimization | Code | 0
Imitation Learning from Purified Demonstrations | Code | 0
Imitation Learning from Observations under Transition Model Disparity | Code | 0
Asynchronous Methods for Model-Based Reinforcement Learning | Code | 0
MuJoCo: A physics engine for model-based control | Code | 0
Off-Policy Average Reward Actor-Critic with Deterministic Policy Search | Code | 0
ORRB -- OpenAI Remote Rendering Backend | Code | 0
Context-Based Soft Actor Critic for Environments with Non-stationary Dynamics | Code | 0
Weak Human Preference Supervision For Deep Reinforcement Learning | Code | 0
Imitating from auxiliary imperfect demonstrations via Adversarial Density Weighted Regression | Code | 0
Constrained Intrinsic Motivation for Reinforcement Learning | Code | 0
Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments | Code | 0
MDP Playground: An Analysis and Debug Testbed for Reinforcement Learning | Code | 0
Mildly Constrained Evaluation Policy for Offline Reinforcement Learning | Code | 0
Lyapunov-based Safe Policy Optimization for Continuous Control | Code | 0
Online Reinforcement Learning in Non-Stationary Context-Driven Environments | Code | 0
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards | Code | 0
Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization | Code | 0
LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios | Code | 0
Live in the Moment: Learning Dynamics Model Adapted to Evolving Policy | Code | 0
LLMs for sensory-motor control: Combining in-context and iterative learning | Code | 0
Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning | Code | 0
Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation | Code | 0
Learning to Play Cup-and-Ball with Noisy Camera Observations | Code | 0
Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards | Code | 0
Handling Delay in Real-Time Reinforcement Learning | Code | 0
Learning Task Belief Similarity with Latent Dynamics for Meta-Reinforcement Learning | Code | 0
Learning What To Do by Simulating the Past | Code | 0
Learning Powerful Policies by Using Consistent Dynamics Model | Code | 0
Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning | Code | 0
Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach | Code | 0
Learning non-Markovian Decision-Making from State-only Sequences | Code | 0
Learning Calibratable Policies using Programmatic Style-Consistency | Code | 0
Action Robust Reinforcement Learning and Applications in Continuous Control | Code | 0
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies | Code | 0
GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation | Code | 0
Language as an Abstraction for Hierarchical Deep Reinforcement Learning | Code | 0
Learning Generalizable Skills from Offline Multi-Task Data for Multi-Agent Cooperation | Code | 0
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | Code | 0
Generalized Off-Policy Actor-Critic | Code | 0
Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning | Code | 0
Leveraging exploration in off-policy algorithms via normalizing flows | Code | 0
Robust Reinforcement Learning via Adversarial training with Langevin Dynamics | Code | 0
Generalized Maximum Entropy Reinforcement Learning via Reward Shaping | | 0
Generalized Hidden Parameter MDPs Transferable Model-based RL in a Handful of Trials | | 0
Coagent Networks: Generalized and Scaled | | 0
Gaussian Process Policy Optimization | | 0
From proprioception to long-horizon planning in novel environments: A hierarchical RL model | | 0
FP3O: Enabling Proximal Policy Optimization in Multi-Agent Cooperation with Parameter-Sharing Versatility | | 0

No leaderboard results yet.