SOTAVerified

Continuous Control

Continuous control in the context of playing games, especially within artificial intelligence (AI) and machine learning (ML), refers to the ability to make a series of smooth, ongoing adjustments or actions to control a game or a simulation. This is in contrast to discrete control, where the actions are limited to a set of specific, distinct choices. Continuous control is crucial in environments where precision, timing, and the magnitude of actions matter, such as driving a car in a racing game, controlling a character in a simulation, or managing the flight of an aircraft in a flight simulator.

Papers

Showing 301350 of 1161 papers

TitleStatusHype
Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical GuaranteesCode0
Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation PredictionCode0
Memory-based control with recurrent neural networksCode0
Model-Advantage and Value-Aware Models for Model-Based Reinforcement Learning: Bridging the Gap in Theory and PracticeCode0
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement LearningCode0
Live in the Moment: Learning Dynamics Model Adapted to Evolving PolicyCode0
Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement LearningCode0
ED2: Environment Dynamics Decomposition World Models for Continuous ControlCode0
Learning with Expert Abstractions for Efficient Multi-Task Continuous ControlCode0
Locally Persistent Exploration in Continuous Control Tasks with Sparse RewardsCode0
Learning State Abstractions for Transfer in Continuous ControlCode0
Learning Stabilizable Nonlinear Dynamics with Contraction-Based RegularizationCode0
Learning State Representations via Retracing in Reinforcement LearningCode0
COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven ExplorationCode0
DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under UncertaintyCode0
Learning Sparse Rewarded Tasks from Sub-Optimal DemonstrationsCode0
Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning AgentsCode0
Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex EnvironmentsCode0
Learning model-based planning from scratchCode0
Learning Provably Stabilizing Neural Controllers for Discrete-Time Stochastic SystemsCode0
Learning Continuous Control Policies by Stochastic Value GradientsCode0
AFU: Actor-Free critic Updates in off-policy RL for continuous controlCode0
Learning Continuous Control Policies for Information-Theoretic Active PerceptionCode0
Learning Action-Transferable Policy with Action EmbeddingCode0
Adaptive Diffusion Policy Optimization for Robotic ManipulationCode0
Learning-Based Model Predictive Control for Piecewise Affine Systems with Feasibility GuaranteesCode0
DNS: Determinantal Point Process Based Neural Network Sampler for Ensemble Reinforcement LearningCode0
Learning Belief Representations for Imitation Learning in POMDPsCode0
Clipped Action Policy GradientCode0
C-Learning: Horizon-Aware Cumulative Accessibility EstimationCode0
Adversarial Skill Networks: Unsupervised Robot Skill Learning from VideoCode0
Learning Bellman Complete Representations for Offline Policy EvaluationCode0
Diversity-Enriched Option-CriticCode0
Driving in Dense Traffic with Model-Free Reinforcement LearningCode0
Improving Value Estimation Critically Enhances Vanilla Policy GradientCode0
Learning Diverse Options via InfoMax Termination CriticCode0
A State-Distribution Matching Approach to Non-Episodic Reinforcement LearningCode0
Dual Policy DistillationCode0
DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPsCode0
Collaborative Evolutionary Reinforcement LearningCode0
A Tour of Reinforcement Learning: The View from Continuous ControlCode0
Dynamics-aware EmbeddingsCode0
Imitation Learning by State-Only Distribution MatchingCode0
Information Theoretic Regret Bounds for Online Nonlinear ControlCode0
A Simple Decentralized Cross-Entropy MethodCode0
CIE: Controlling Language Model Text Generations Using Continuous SignalsCode0
Leveraging exploration in off-policy algorithms via normalizing flowsCode0
Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation LearningCode0
Efficacy of Modern Neuro-Evolutionary Strategies for Continuous Control OptimizationCode0
Adversarial Policy Optimization for Offline Preference-based Reinforcement LearningCode0
Show:102550
← PrevPage 7 of 24Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn3,459Unverified
2TD3 gSDEReturn3,267Unverified
3TD3Return2,865Unverified
4SACReturn2,859Unverified
5PPO gSDEReturn2,587Unverified
6A2C gSDEReturn2,560Unverified
7PPOReturn2,160Unverified
8A2CReturn1,967Unverified
#ModelMetricClaimedVerifiedStatus
1SACReturn2,883Unverified
2SAC gSDEReturn2,850Unverified
3PPO + gSDEReturn2,760Unverified
4TD3Return2,687Unverified
5TD3 gSDEReturn2,578Unverified
6PPOReturn2,254Unverified
7A2C + gSDEReturn2,028Unverified
8A2CReturn1,652Unverified
#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn2,646Unverified
2PPO gSDEReturn2,508Unverified
3SACReturn2,477Unverified
4TD3Return2,470Unverified
5TD3 gSDEReturn2,353Unverified
6PPOReturn1,622Unverified
7A2CReturn1,559Unverified
8A2C gSDEReturn1,448Unverified
#ModelMetricClaimedVerifiedStatus
1SAC gSDEReturn2,341Unverified
2SACReturn2,215Unverified
3TD3Return2,106Unverified
4TD3 gSDEReturn1,989Unverified
5PPO gSDEReturn1,776Unverified
6PPOReturn1,238Unverified
7A2C gSDEReturn694Unverified
8A2CReturn443Unverified
#ModelMetricClaimedVerifiedStatus
1DreamerV1Return800Unverified
2SLACReturn700Unverified
3DrQReturn660Unverified
4PlaNetReturn650Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn998.14Unverified
2DREAMERReturn853Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn868.87Unverified
2MuZero UnpluggedReturn594.3Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn914.39Unverified
2MuZero UnpluggedReturn869.9Unverified
#ModelMetricClaimedVerifiedStatus
1DrQReturn963Unverified
2PlaNetReturn914Unverified
#ModelMetricClaimedVerifiedStatus
1DrQReturn921Unverified
2PlaNetReturn890Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn963.07Unverified
2MuZero UnpluggedReturn759Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn987.79Unverified
2MuZero UnpluggedReturn887.2Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn975.46Unverified
2MuZero UnpluggedReturn949.5Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,353.8Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-326Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-83.3Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-149.6Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn417.52Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-170.9Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore730.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-0.4Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn977.38Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore769Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore959Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn984.86Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4,869.8Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore960.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore606.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore980.3Unverified
#ModelMetricClaimedVerifiedStatus
1MACScore178.3Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore582Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore841Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn846.91Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore299Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore518Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4,412.4Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn986.38Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore767Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore926Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn972.53Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn681.6Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore287Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,914Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore1,183.3Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn528.24Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn926.5Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn643.1Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore247.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore4.5Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore10.4Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore14.1Unverified
#ModelMetricClaimedVerifiedStatus
1MACScore163.5Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn659.2Unverified
#ModelMetricClaimedVerifiedStatus
1MuZero UnpluggedReturn556Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-61.7Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-64.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-60.2Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore-61.6Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn837.76Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn923.54Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn933.77Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn982.26Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore538Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore929Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn971.53Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore269.7Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore96Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1TRPOScore0Unverified
#ModelMetricClaimedVerifiedStatus
1SMuZeroReturn931.06Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore403Unverified
#ModelMetricClaimedVerifiedStatus
1CURLScore902Unverified