SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy mapping situations to actions, that maximizes long-term reward.
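The interaction loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not taken from any paper listed here: the `CorridorEnv` class, its reward values, and the two fixed policies are all hypothetical names and numbers chosen for the example. It shows an agent acting in an environment, receiving a reward at each step, and accumulating the total reward that a learning algorithm would try to maximize.

```python
from typing import Callable, Tuple


class CorridorEnv:
    """Hypothetical toy environment: walk a 1-D corridor from cell 0 to the last cell."""

    def __init__(self, length: int = 5, max_steps: int = 20) -> None:
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self) -> int:
        """Start a new episode and return the initial state (the agent's cell)."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action: int) -> Tuple[int, float, bool]:
        """Apply an action (0 = left, 1 = right); return (state, reward, done)."""
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        at_goal = self.position == self.length - 1
        # Assumed reward scheme: +10 for reaching the goal, -1 for every other step.
        reward = 10.0 if at_goal else -1.0
        done = at_goal or self.steps >= self.max_steps
        return self.position, reward, done


def run_episode(env: CorridorEnv, policy: Callable[[int], int]) -> float:
    """Run one episode with the given policy and return the cumulative reward."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                  # the agent chooses an action
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward                  # feedback accumulates over time
    return total_reward


def always_right(state: int) -> int:
    return 1


def always_left(state: int) -> int:
    return 0


right_return = run_episode(CorridorEnv(), always_right)
left_return = run_episode(CorridorEnv(), always_left)
print(f"always_right: {right_return}, always_left: {left_return}")
```

In this toy setting the policy that always moves right earns a higher cumulative reward than the one that always moves left, so it is the better of the two. An RL algorithm would search for such a policy from experience rather than having it written by hand.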

Papers

Showing 14601–14650 of 15113 papers

Title | Status | Hype
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning | Code | 0
d3rlpy: An Offline Deep Reinforcement Learning Library | Code | 0
Ask Before You Act: Generalising to Novel Environments by Asking Questions | Code | 0
Active Collection of Well-Being and Health Data in Mobile Devices | Code | 0
Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning | Code | 0
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games | Code | 0
LExCI: A Framework for Reinforcement Learning with Embedded Systems | Code | 0
BQSched: A Non-intrusive Scheduler for Batch Concurrent Queries via Reinforcement Learning | Code | 0
ELO-Rated Sequence Rewards: Advancing Reinforcement Learning Models | Code | 0
Improved Off-policy Reinforcement Learning in Biological Sequence Design | Code | 0
A Generalised and Adaptable Reinforcement Learning Stopping Method | Code | 0
A dynamical clipping approach with task feedback for Proximal Policy Optimization | Code | 0
Generative Adversarial Network for Abstractive Text Summarization | Code | 0
Learning Explicit Credit Assignment for Cooperative Multi-Agent Reinforcement Learning via Polarization Policy Gradient | Code | 0
Embodied Question Answering | Code | 0
CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++ | Code | 0
Cyclic Policy Distillation: Sample-Efficient Sim-to-Real Reinforcement Learning with Domain Randomization | Code | 0
A Bayesian Approach to Reinforcement Learning of Vision-Based Vehicular Control | Code | 0
Learning Curriculum Policies for Reinforcement Learning | Code | 0
Generative Adversarial User Model for Reinforcement Learning Based Recommendation System | Code | 0
Bounding the Optimal Value Function in Compositional Reinforcement Learning | Code | 0
Emergence of Compositional Language with Deep Generational Transmission | Code | 0
Visual Exploration and Energy-aware Path Planning via Reinforcement Learning | Code | 0
Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks | Code | 0
Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input | Code | 0
Learning data augmentation policies using augmented random search | Code | 0
Cycle-of-Learning for Autonomous Systems from Human Interaction | Code | 0
Emergence of Pragmatics from Referential Game between Theory of Mind Agents | Code | 0
Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to ATARI games | Code | 0
Improved Sample Complexity Bounds for Distributionally Robust Reinforcement Learning | Code | 0
A General Framework for Structured Learning of Mechanical Systems | Code | 0
Curriculum RL meets Monte Carlo Planning: Optimization of a Real World Container Management Problem | Code | 0
AACHER: Assorted Actor-Critic Deep Reinforcement Learning with Hindsight Experience Replay | Code | 0
Curriculum Design for Teaching via Demonstrations: Theory and Applications | Code | 0
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design | Code | 0
Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning | Code | 0
Harnessing Reinforcement Learning for Neural Motion Planning | Code | 0
Emergent Dominance Hierarchies in Reinforcement Learning Agents | Code | 0
Generative Planning for Temporally Coordinated Exploration in Reinforcement Learning | Code | 0
Emergent Linguistic Phenomena in Multi-Agent Communication Games | Code | 0
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning | Code | 0
Curious Exploration and Return-based Memory Restoration for Deep Reinforcement Learning | Code | 0
Generative Question Refinement with Deep Reinforcement Learning in Retrieval-based QA System | Code | 0
EMI: Exploration with Mutual Information | Code | 0
Active Advantage-Aligned Online Reinforcement Learning with Offline Data | Code | 0
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning | Code | 0
Learning Sparse Rewarded Tasks from Sub-Optimal Demonstrations | Code | 0
Generic Itemset Mining Based on Reinforcement Learning | Code | 0
Genes in Intelligent Agents | Code | 0
Network Randomization: A Simple Technique for Generalization in Deep Reinforcement Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified