SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from the feedback it receives, in the form of rewards or penalties, for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes long-term reward.
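As a rough illustration of the agent-environment loop described above, the sketch below runs tabular Q-learning on a toy five-cell corridor. Everything in it is an assumption made for illustration and does not come from this page: the corridor environment, the choice of Q-learning, the discount factor `gamma`, and the helper names `step`, `discounted_return`, `train`, and `greedy_policy`.

```python
import random

N_STATES = 5        # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)  # action index 0 moves left, index 1 moves right


def step(state, action_index):
    """Toy environment: returns (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def discounted_return(rewards, gamma=0.9):
    """Cumulative reward, discounted by gamma per step (computed back to front)."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (one possible learner)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.randrange(len(ACTIONS))
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(len(ACTIONS)) if q[state][a] == best]
                )
            next_state, reward, done = step(state, action)
            # Feedback from the environment moves the estimate toward the target.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action index for each non-goal cell under the learned values."""
    return [
        max(range(len(ACTIONS)), key=lambda a: q[s][a])
        for s in range(N_STATES - 1)
    ]
```

Calling `greedy_policy(train())` yields one action index per non-goal cell; in this toy setup the learned policy should settle on moving right, toward the rewarded cell. `discounted_return` shows one common way to make "cumulative reward" precise, though undiscounted or average-reward formulations are also used.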

Papers

Showing 5551-5600 of 15113 papers

| Title | Status | Hype |
| --- | --- | --- |
| Designing Rewards for Fast Learning |  | 0 |
| Reinforcement Learning with a Terminator | Code | 0 |
| Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning | Code | 0 |
| Quantum Multi-Armed Bandits and Stochastic Linear Bandits Enjoy Logarithmic Regrets |  | 0 |
| Residual Q-Networks for Value Function Factorizing in Multi-Agent Reinforcement Learning |  | 0 |
| Stock Trading Optimization through Model-based Reinforcement Learning with Resistance Support Relative Strength |  | 0 |
| Multi-Agent Reinforcement Learning is a Sequence Modeling Problem | Code | 2 |
| SEREN: Knowing When to Explore and When to Exploit |  | 0 |
| RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch | Code | 1 |
| Learning Open Domain Multi-hop Search Using Reinforcement Learning |  | 0 |
| Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning | Code | 0 |
| GraMeR: Graph Meta Reinforcement Learning for Multi-Objective Influence Maximization |  | 0 |
| Learning Security Strategies through Game Play and Optimal Stopping |  | 0 |
| Provable Benefits of Representational Transfer in Reinforcement Learning | Code | 1 |
| On the Robustness of Safe Reinforcement Learning under Observational Perturbations | Code | 1 |
| Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning |  | 0 |
| Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories | Code | 1 |
| Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges | Code | 1 |
| Survival Analysis on Structured Data using Deep Reinforcement Learning |  | 0 |
| Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning |  | 0 |
| Tutorial on Course-of-Action (COA) Attack Search Methods in Computer Networks |  | 0 |
| Off-Beat Multi-Agent Reinforcement Learning |  | 0 |
| Non-Markovian policies occupancy measures |  | 0 |
| Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters |  | 0 |
| Provably Sample-Efficient RL with Side Information about Latent Dynamics |  | 0 |
| Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration | Code | 1 |
| GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis |  | 0 |
| Deep Reinforcement Learning for Distributed and Uncoordinated Cognitive Radios Resource Allocation |  | 0 |
| IGLU 2022: Interactive Grounded Language Understanding in a Collaborative Environment at NeurIPS 2022 | Code | 0 |
| Double Deep Q Networks for Sensor Management in Space Situational Awareness |  | 0 |
| KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal |  | 0 |
| FedFormer: Contextual Federation with Attention in Reinforcement Learning | Code | 1 |
| Feudal Multi-Agent Reinforcement Learning with Adaptive Network Partition for Traffic Signal Control |  | 0 |
| Does DQN Learn? |  | 0 |
| DRLComplex: Reconstruction of protein quaternary structures using deep reinforcement learning | Code | 1 |
| Dynamic Network Reconfiguration for Entropy Maximization using Deep Reinforcement Learning | Code | 0 |
| Reinforcement Learning Approach for Mapping Applications to Dataflow-Based Coarse-Grained Reconfigurable Array | Code | 0 |
| Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes |  | 0 |
| Physics-Guided Hierarchical Reward Mechanism for Learning-Based Robotic Grasping |  | 0 |
| SFP: State-free Priors for Exploration in Off-Policy Reinforcement Learning |  | 0 |
| Unsupervised Reinforcement Adaptation for Class-Imbalanced Text Classification | Code | 0 |
| RACE: A Reinforcement Learning Framework for Improved Adaptive Control of NoC Channel Buffers |  | 0 |
| Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency |  | 0 |
| A Fair Federated Learning Framework With Reinforcement Learning |  | 0 |
| Constrained Reinforcement Learning for Short Video Recommendation |  | 0 |
| Scalable Multi-Agent Model-Based Reinforcement Learning | Code | 1 |
| Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function |  | 0 |
| Multimodal Knowledge Alignment with Reinforcement Learning | Code | 1 |
| Near-Optimal Goal-Oriented Reinforcement Learning in Non-Stationary Environments |  | 0 |
| RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning | Code | 2 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | PPG | Mean Normalized Performance | 0.76 |  | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 |  | Unverified |