SOTAVerified

Continuous Control

Continuous control in the context of playing games, especially within artificial intelligence (AI) and machine learning (ML), refers to the ability to make a series of smooth, ongoing adjustments or actions to control a game or a simulation. This is in contrast to discrete control, where the actions are limited to a set of specific, distinct choices. Continuous control is crucial in environments where precision, timing, and the magnitude of actions matter, such as driving a car in a racing game, controlling a character in a simulation, or managing the flight of an aircraft in a flight simulator.

Papers

Showing 151200 of 1161 papers

TitleStatusHype
Mollification Effects of Policy Gradient Methods0
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous controlCode2
Diffusion-based Reinforcement Learning via Q-weighted Variational Policy OptimizationCode2
OLLIE: Imitation Learning from Offline Pretraining to Online FinetuningCode1
How to Leverage Diverse Demonstrations in Offline Imitation LearningCode1
Offline Reinforcement Learning from Datasets with Structured Non-StationarityCode0
Investigating the Impact of Choice on Deep Reinforcement Learning for Space Controls0
Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning0
The Curse of Diversity in Ensemble-Based ExplorationCode0
CTD4 -- A Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple CriticsCode0
Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning0
REBEL: Reinforcement Learning via Regressing Relative RewardsCode2
AFU: Actor-Free critic Updates in off-policy RL for continuous controlCode0
Explicit Lipschitz Value Estimation Enhances Policy Robustness Against Perturbation0
On the stability of Lipschitz continuous control problems and its application to reinforcement learning0
Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman EquationCode0
LTL-Constrained Policy Optimization with Cycle Experience Replay0
Continuous Control Reinforcement Learning: Distributed Distributional DrQ Algorithms0
NoiseNCA: Noisy Seed Improves Spatio-Temporal Continuity of Neural Cellular Automata0
Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution0
Decision Transformer as a Foundation Model for Partially Observable Continuous Control0
Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration0
Reinforcement Learning from Delayed Observations via World ModelsCode0
Demystifying the Physics of Deep Reinforcement Learning-Based Autonomous Vehicle Decision-Making0
Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features CriticsCode1
Online Policy Learning from Offline Preferences0
Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning0
Sample-Optimal Zero-Violation Safety For Continuous Control0
Noisy Spiking Actor Network for Exploration0
SplAgger: Split Aggregation for Meta-Reinforcement LearningCode1
Iterated Q-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning0
EfficientZero V2: Mastering Discrete and Continuous Control with Limited DataCode2
A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert ObservationsCode0
DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning0
ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization0
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in ControlCode1
Dataset Clustering for Improved Offline Policy LearningCode0
Exploiting Estimation Bias in Clipped Double Q-Learning for Continous Control Reinforcement Learning Tasks0
Hybrid Inverse Reinforcement LearningCode1
Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive LossCode1
FedAA: A Reinforcement Learning Perspective on Adaptive Aggregation for Fair and Robust Federated LearningCode1
Offline Actor-Critic Reinforcement Learning Scales to Large Models0
Differentially Private Deep Model-Based Reinforcement Learning0
FlowPG: Action-constrained Policy Gradient with Normalizing FlowsCode0
Learning Diverse Policies with Soft Self-Generated Guidance0
Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning AgentsCode0
Frugal Actor-Critic: Sample Efficient Off-Policy Deep Reinforcement Learning Using Unique Experiences0
Deep Exploration with PAC-Bayes0
Understanding What Affects the Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence0
A Strategy for Preparing Quantum Squeezed States Using Reinforcement Learning0
Show:102550
← PrevPage 4 of 24Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn3,459Unverified
2TD3 gSDEReturn3,267Unverified
3TD3Return2,865Unverified
4SACReturn2,859Unverified
5PPO gSDEReturn2,587Unverified
6A2C gSDEReturn2,560Unverified
7PPOReturn2,160Unverified
8A2CReturn1,967Unverified
#ModelMetricClaimedVerifiedStatus
1SACReturn2,883Unverified
2SAC gSDEReturn2,850Unverified
3PPO + gSDEReturn2,760Unverified
4TD3Return2,687Unverified
5TD3 gSDEReturn2,578Unverified
6PPOReturn2,254Unverified
7A2C + gSDEReturn2,028Unverified
8A2CReturn1,652Unverified
#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn2,646Unverified
2PPO gSDEReturn2,508Unverified
3SACReturn2,477Unverified
4TD3Return2,470Unverified
5TD3 gSDEReturn2,353Unverified
6PPOReturn1,622Unverified
7A2CReturn1,559Unverified
8A2C gSDEReturn1,448Unverified
#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn2,341Unverified
2SACReturn2,215Unverified
3TD3Return2,106Unverified
4TD3 gSDEReturn1,989Unverified
5PPO gSDEReturn1,776Unverified
6PPOReturn1,238Unverified
7A2C gSDEReturn694Unverified
8A2CReturn443Unverified
#ModelMetricClaimedVerifiedStatus
1DreamerV1Return800Unverified
2SLACReturn700Unverified
3DrQReturn660Unverified
4PlaNetReturn650Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn998.14Unverified
2DREAMERReturn853Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn868.87Unverified
2MuZero UnpluggedReturn594.3Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn914.39Unverified
2MuZero UnpluggedReturn869.9Unverified
#ModelMetricClaimedVerifiedStatus
1DrQReturn963Unverified
2PlaNetReturn914Unverified
#ModelMetricClaimedVerifiedStatus
1DrQReturn921Unverified
2PlaNetReturn890Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn963.07Unverified
2MuZero UnpluggedReturn759Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn987.79Unverified
2MuZero UnpluggedReturn887.2Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn975.46Unverified
2MuZero UnpluggedReturn949.5Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,353.8Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-326Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-83.3Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-149.6Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn417.52Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-170.9Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore730.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-0.4Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn977.38Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore769Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore959Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn984.86Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4,869.8Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore960.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore606.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore980.3Unverified
#ModelMetricClaimedVerifiedStatus
1MACScore178.3Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore582Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore841Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn846.91Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore299Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore518Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4,412.4Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn986.38Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore767Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore926Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn972.53Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn681.6Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore287Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,914Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,183.3Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn528.24Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn926.5Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn643.1Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore247.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4.5Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore10.4Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore14.1Unverified
#ModelMetricClaimedVerifiedStatus
1MACScore163.5Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn659.2Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn556Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-61.7Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-64.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-60.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-61.6Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn837.76Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn923.54Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn933.77Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn982.26Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore538Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore929Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn971.53Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore269.7Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore96Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn931.06Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore403Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore902Unverified