SOTAVerified

Continuous Control

Continuous control in the context of playing games, especially within artificial intelligence (AI) and machine learning (ML), refers to the ability to make a series of smooth, ongoing adjustments or actions to control a game or a simulation. This is in contrast to discrete control, where the actions are limited to a set of specific, distinct choices. Continuous control is crucial in environments where precision, timing, and the magnitude of actions matter, such as driving a car in a racing game, controlling a character in a simulation, or managing the flight of an aircraft in a flight simulator.

Papers

Showing 10261050 of 1161 papers

TitleStatusHype
The Distributional Reward Critic Framework for Reinforcement Learning Under Perturbed RewardsCode0
Meta-Q-LearningCode0
Meta reinforcement learning as task inferenceCode0
A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert ObservationsCode0
Minimax Optimal Online Imitation Learning via Replay EstimationCode0
Clipped Action Policy GradientCode0
Foresee then Evaluate: Decomposing Value Estimation with Latent Future PredictionCode0
FlowPG: Action-constrained Policy Gradient with Normalizing FlowsCode0
FI-ODE: Certifiably Robust Forward Invariance in Neural ODEsCode0
Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy ManagementCode0
Real-Time Reinforcement LearningCode0
Understanding the Evolution of Linear Regions in Deep Reinforcement LearningCode0
Model-Advantage and Value-Aware Models for Model-Based Reinforcement Learning: Bridging the Gap in Theory and PracticeCode0
Correct-by-construction reach-avoid control of partially observable linear stochastic systemsCode0
Deep Reinforcement Learning that MattersCode0
Reconciling Spatial and Temporal Abstractions for Goal RepresentationCode0
Exploring reinforcement learning techniques for discrete and continuous control tasks in the MuJoCo environmentCode0
ACRE: Actor-Critic with Reward-Preserving ExplorationCode0
Exploration in Action SpaceCode0
Explaining RL Decisions with TrajectoriesCode0
Decoupling Meta-Reinforcement Learning with Gaussian Task Contexts and SkillsCode0
Evolution-Guided Policy Gradient in Reinforcement LearningCode0
Model-Ensemble Trust-Region Policy OptimizationCode0
VIME: Variational Information Maximizing ExplorationCode0
Simple random search of static linear policies is competitive for reinforcement learningCode0
Show:102550
← PrevPage 42 of 47Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn3,459Unverified
2TD3 gSDEReturn3,267Unverified
3TD3Return2,865Unverified
4SACReturn2,859Unverified
5PPO gSDEReturn2,587Unverified
6A2C gSDEReturn2,560Unverified
7PPOReturn2,160Unverified
8A2CReturn1,967Unverified
#ModelMetricClaimedVerifiedStatus
1SACReturn2,883Unverified
2SAC gSDEReturn2,850Unverified
3PPO + gSDEReturn2,760Unverified
4TD3Return2,687Unverified
5TD3 gSDEReturn2,578Unverified
6PPOReturn2,254Unverified
7A2C + gSDEReturn2,028Unverified
8A2CReturn1,652Unverified
#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn2,646Unverified
2PPO gSDEReturn2,508Unverified
3SACReturn2,477Unverified
4TD3Return2,470Unverified
5TD3 gSDEReturn2,353Unverified
6PPOReturn1,622Unverified
7A2CReturn1,559Unverified
8A2C gSDEReturn1,448Unverified
#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn2,341Unverified
2SACReturn2,215Unverified
3TD3Return2,106Unverified
4TD3 gSDEReturn1,989Unverified
5PPO gSDEReturn1,776Unverified
6PPOReturn1,238Unverified
7A2C gSDEReturn694Unverified
8A2CReturn443Unverified
#ModelMetricClaimedVerifiedStatus
1DreamerV1Return800Unverified
2SLACReturn700Unverified
3DrQReturn660Unverified
4PlaNetReturn650Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn998.14Unverified
2DREAMERReturn853Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn868.87Unverified
2MuZero UnpluggedReturn594.3Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn914.39Unverified
2MuZero UnpluggedReturn869.9Unverified
#ModelMetricClaimedVerifiedStatus
1DrQReturn963Unverified
2PlaNetReturn914Unverified
#ModelMetricClaimedVerifiedStatus
1DrQReturn921Unverified
2PlaNetReturn890Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn963.07Unverified
2MuZero UnpluggedReturn759Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn987.79Unverified
2MuZero UnpluggedReturn887.2Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn975.46Unverified
2MuZero UnpluggedReturn949.5Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,353.8Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-326Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-83.3Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-149.6Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn417.52Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-170.9Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore730.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-0.4Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn977.38Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore769Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore959Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn984.86Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4,869.8Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore960.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore606.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore980.3Unverified
#ModelMetricClaimedVerifiedStatus
1MACScore178.3Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore582Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore841Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn846.91Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore299Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore518Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4,412.4Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn986.38Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore767Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore926Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn972.53Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn681.6Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore287Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,914Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,183.3Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn528.24Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn926.5Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn643.1Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore247.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4.5Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore10.4Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore14.1Unverified
#ModelMetricClaimedVerifiedStatus
1MACScore163.5Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn659.2Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn556Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-61.7Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-64.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-60.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-61.6Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn837.76Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn923.54Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn933.77Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn982.26Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore538Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore929Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn971.53Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore269.7Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore96Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn931.06Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore403Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore902Unverified