SOTAVerified

MuJoCo

Papers

Showing 151-200 of 677 papers

| Title | Status | Hype |
| --- | --- | --- |
| Phasic Diversity Optimization for Population-Based Reinforcement Learning | | 0 |
| Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning | | 0 |
| DeepSafeMPC: Deep Learning-Based Model Predictive Control for Safe Multi-Agent Reinforcement Learning | | 0 |
| Conservative DDPG -- Pessimistic RL without Ensemble | | 0 |
| Iterated Q-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning | | 0 |
| Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL) | | 0 |
| Snapshot Reinforcement Learning: Leveraging Prior Trajectories for Efficiency | Code | 0 |
| C-GAIL: Stabilizing Generative Adversarial Imitation Learning with Control Theory | | 0 |
| Beyond Worst-case Attacks: Robust RL with Adaptive Defense via Non-dominated Policies | Code | 0 |
| Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics | Code | 0 |
| Learn to Teach: Sample-Efficient Privileged Learning for Humanoid Locomotion over Diverse Terrains | | 0 |
| ALOHA 2: An Enhanced Low-Cost Hardware for Bimanual Teleoperation | | 0 |
| Compressing Deep Reinforcement Learning Networks with a Dynamic Structured Pruning Method for Autonomous Driving | | 0 |
| Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference | Code | 1 |
| Accelerating Inverse Reinforcement Learning with Expert Bootstrapping | | 0 |
| SQT -- std Q-target | | 0 |
| MinMaxMin Q-learning | | 0 |
| Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning | Code | 0 |
| A Reinforcement Learning Based Controller to Minimize Forces on the Crutches of a Lower-Limb Exoskeleton | | 0 |
| Extrinsicaly Rewarded Soft Q Imitation Learning with Discriminator | | 0 |
| Simple Policy Optimization | Code | 2 |
| Episodic Reinforcement Learning with Expanded State-reward Space | | 0 |
| AgentMixer: Multi-Agent Correlated Policy Factorization | | 0 |
| Neural Population Learning beyond Symmetric Zero-sum Games | | 0 |
| An Invariant Information Geometric Method for High-Dimensional Online Optimization | Code | 0 |
| Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction | | 0 |
| Adaptive trajectory-constrained exploration strategy for deep reinforcement learning | Code | 0 |
| Efficient Reinforcement Learning via Decoupling Exploration and Utilization | Code | 1 |
| XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library | Code | 3 |
| DexDLO: Learning Goal-Conditioned Dexterous Policy for Dynamic Manipulation of Deformable Linear Objects | | 0 |
| OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments | | 0 |
| GO-DICE: Goal-Conditioned Option-Aware Offline Imitation Learning via Stationary Distribution Correction Estimation | Code | 0 |
| Small Dataset, Big Gains: Enhancing Reinforcement Learning by Offline Pre-Training with Model Based Augmentation | | 0 |
| World Models via Policy-Guided Trajectory Diffusion | Code | 1 |
| A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning | Code | 0 |
| A dynamical clipping approach with task feedback for Proximal Policy Optimization | Code | 0 |
| Similarity-based Knowledge Transfer for Cross-Domain Reinforcement Learning | | 0 |
| Supported Trust Region Optimization for Offline Reinforcement Learning | | 0 |
| On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling | | 0 |
| An Intelligent Social Learning-based Optimization Strategy for Black-box Robotic Control with Reinforcement Learning | | 0 |
| Optimistic Multi-Agent Policy Gradient | Code | 1 |
| Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula | | 0 |
| A Tractable Inference Perspective of Offline RL | | 0 |
| Good Better Best: Self-Motivated Imitation Learning for noisy Demonstrations | | 0 |
| Mind the Model, Not the Agent: The Primacy Bias in Model-based RL | | 0 |
| Policy Gradient with Kernel Quadrature | | 0 |
| One is More: Diverse Perspectives within a Single Network for Efficient DRL | | 0 |
| Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning | Code | 1 |
| Benchmarking the Sim-to-Real Gap in Cloth Manipulation | | 0 |
| LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios | Code | 0 |

No leaderboard results yet.