SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
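As a rough illustration of the agent-environment loop described above, here is a minimal sketch using tabular Q-learning on a made-up five-cell corridor. The environment, the reward of 1.0 at the goal cell, the hyperparameters, and all function names (step, discounted_return, train, greedy_policy) are assumptions chosen for this example. Q-learning is only one of many ways to search for a reward-maximizing policy.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma):
    """Cumulative reward signal: sum of gamma**t * r_t over one episode."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Learn action values by interacting with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Policy: the highest-valued action in each non-terminal state."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]
```

Calling greedy_policy(train()) returns one action per non-terminal cell. discounted_return shows how a reward received later in an episode contributes less to the cumulative signal than an immediate one when gamma is below 1.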

Papers

Showing 11076–11100 of 15113 papers

Title | Status | Hype
General Method for Solving Four Types of SAT Problems | | 0
General sum stochastic games with networked information flows | | 0
Generate and Revise: Reinforcement Learning in Neural Poetry | | 0
Generating and Evolving Reward Functions for Highway Driving with Large Language Models | | 0
Generating Black-Box Adversarial Examples for Text Classifiers Using a Deep Reinforced Model | | 0
Generating Critical Scenarios for Testing Automated Driving Systems | | 0
Generating Explanations from Deep Reinforcement Learning Using Episodic Memory | | 0
Generating Formality-Tuned Summaries Using Input-Dependent Rewards | | 0
Generating GPU Compiler Heuristics using Reinforcement Learning | | 0
Generating Interpretable Fuzzy Controllers using Particle Swarm Optimization and Genetic Programming | | 0
Generating Paraphrases with Lean Vocabulary | | 0
Improving Factual Consistency Between a Response and Persona Facts | | 0
Generating Rescheduling Knowledge using Reinforcement Learning in a Cognitive Architecture | | 0
Generating Socially Acceptable Perturbations for Efficient Evaluation of Autonomous Vehicles | | 0
Generating stable molecules using imitation and reinforcement learning | | 0
Generating Student Feedback from Time-Series Data Using Reinforcement Learning | | 0
Generating Text with Deep Reinforcement Learning | | 0
Generation of Policy-Level Explanations for Reinforcement Learning | | 0
Generative Adversarial Exploration for Reinforcement Learning | | 0
Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning | | 0
Generative Adversarial Imitation Learning with Neural Networks: Global Optimality and Convergence Rate | | 0
Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate | | 0
Generative Adversarial Imitation Learning for End-to-End Autonomous Driving on Urban Environments | | 0
Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference | | 0
Generative Adversarial Self-Imitation Learning | | 0
Page 444 of 605

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | | Unverified