SOTAVerified

MuJoCo

Papers

Showing 501-550 of 677 papers

| Title | Status | Hype |
| --- | --- | --- |
| Fine-Tuning Offline Reinforcement Learning with Model-Based Policy Optimization | | 0 |
| Practical Marginalized Importance Sampling with the Successor Representation | | 0 |
| CAT-SAC: Soft Actor-Critic with Curiosity-Aware Entropy Temperature | | 0 |
| Adaptive N-step Bootstrapping with Off-policy Data | | 0 |
| Intrinsically Guided Exploration in Meta Reinforcement Learning | | 0 |
| Invariant Representations for Reinforcement Learning without Reconstruction | | 0 |
| TEAC: Intergrating Trust Region and Max Entropy Actor Critic for Continuous Control | Code | 0 |
| PGPS : Coupling Policy Gradient with Population-based Search | | 0 |
| Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards | Code | 0 |
| OPAC: Opportunistic Actor-Critic | | 0 |
| Offline Imitation Learning with a Misspecified Simulator | | 0 |
| Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp | Code | 0 |
| Weighted Entropy Modification for Soft Actor-Critic | | 0 |
| Proximal Policy Optimization via Enhanced Exploration Efficiency | | 0 |
| Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots | | 0 |
| Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping | | 0 |
| Cooperative Heterogeneous Deep Reinforcement Learning | | 0 |
| Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines? | | 0 |
| Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification | | 0 |
| Human-guided Robot Behavior Learning: A GAN-assisted Preference-based Reinforcement Learning Approach | Code | 0 |
| Self-Imitation Learning for Robot Tasks with Sparse and Delayed Rewards | Code | 0 |
| Balancing Constraints and Rewards with Meta-Gradient D4PG | | 0 |
| Hindsight Experience Replay with Kronecker Product Approximate Curvature | | 0 |
| Learning Intrinsic Symbolic Rewards in Reinforcement Learning | | 0 |
| What About Taking Policy as Input of Value Function: Policy-extended Value Function Approximator | | 0 |
| Population-Guided Imitation Learning | | 0 |
| Soft policy optimization using dual-track advantage estimator | | 0 |
| Constrained Markov Decision Processes via Backward Value Functions | | 0 |
| Adversarial Imitation Learning via Random Search | | 0 |
| Forward and inverse reinforcement learning sharing network weights and hyperparameters | | 0 |
| Overcoming Model Bias for Robust Offline Deep Reinforcement Learning | | 0 |
| Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals | | 0 |
| Weak Human Preference Supervision For Deep Reinforcement Learning | Code | 0 |
| Learning to Play Cup-and-Ball with Noisy Camera Observations | Code | 0 |
| CoNES: Convex Natural Evolutionary Strategies | | 0 |
| Inverse Reinforcement Learning from a Gradient-based Learner | | 0 |
| Regularly Updated Deterministic Policy Gradient Algorithm | | 0 |
| DDPG++: Striving for Simplicity in Continuous-control Off-Policy Reinforcement Learning | | 0 |
| SOAC: The Soft Option Actor-Critic Architecture | | 0 |
| ELSIM: End-to-end learning of reusable skills through intrinsic motivation | | 0 |
| dm_control: Software and Tasks for Continuous Control | | 0 |
| Non-local Policy Optimization via Diversity-regularized Collaborative Exploration | | 0 |
| Continuous Control for Searching and Planning with a Learned Model | | 0 |
| Decorrelated Double Q-learning | | 0 |
| From proprioception to long-horizon planning in novel environments: A hierarchical RL model | | 0 |
| Primal Wasserstein Imitation Learning | Code | 0 |
| Cross-Domain Imitation Learning with a Dual Structure | | 0 |
| Gradient Monitored Reinforcement Learning | | 0 |
| Novel Policy Seeking with Constrained Optimization | Code | 0 |
| Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning | | 0 |
Page 11 of 14

No leaderboard results yet.