SOTAVerified

MuJoCo

Papers

Showing 151-175 of 677 papers

| Title | Status | Hype |
| --- | --- | --- |
| Entropy Augmented Reinforcement Learning | | 0 |
| CasIL: Cognizing and Imitating Skills via a Dual Cognition-Action Architecture | | 0 |
| Careful at Estimation and Bold at Exploration | | 0 |
| Can Reinforcement Learning for Continuous Control Generalize Across Physics Engines? | | 0 |
| CAMEL: Continuous Action Masking Enabled by Large Language Models for Reinforcement Learning | | 0 |
| A Computational Theory of Learning Flexible Reward-Seeking Behavior with Place Cells | | 0 |
| A Tractable Inference Perspective of Offline RL | | 0 |
| CAE: Repurposing the Critic as an Explorer in Deep Reinforcement Learning | | 0 |
| Multiagent Model-based Credit Assignment for Continuous Control | | 0 |
| Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains | | 0 |
| Bridging Physics-Informed Neural Networks with Reinforcement Learning: Hamilton-Jacobi-Bellman Proximal Policy Optimization (HJBPPO) | | 0 |
| Adaptive Q-Network: On-the-fly Target Selection for Deep Reinforcement Learning | | 0 |
| ELSIM: End-to-end learning of reusable skills through intrinsic motivation | | 0 |
| An Intelligent Social Learning-based Optimization Strategy for Black-box Robotic Control with Reinforcement Learning | | 0 |
| Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration | | 0 |
| A Computational Model of Learning Flexible Navigation in a Maze by Layout-Conforming Replay of Place Cells | | 0 |
| Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling | | 0 |
| Boosting Exploration in Actor-Critic Algorithms by Incentivizing Plausible Novel States | | 0 |
| BlockPuzzle - A Challenge in Physical Reasoning and Generalization for Robot Learning | | 0 |
| Sim2Sim Evaluation of a Novel Data-Efficient Differentiable Physics Engine for Tensegrity Robots | | 0 |
| Adaptive N-step Bootstrapping with Off-policy Data | | 0 |
| Biased Estimates of Advantages over Path Ensembles | | 0 |
| Accelerating Self-Imitation Learning from Demonstrations via Policy Constraints and Q-Ensemble | | 0 |
| Effects of sparse rewards of different magnitudes in the speed of learning of model-based actor critic methods | | 0 |
| An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients | | 0 |

No leaderboard results yet.