SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) trains an agent to take actions in an environment so as to maximize a cumulative reward signal. The agent interacts with the environment and learns from feedback in the form of rewards or penalties for its actions. The goal is to find an optimal policy, that is, a decision-making strategy that maximizes the expected long-term reward.
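The agent-environment loop described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration and does not come from any paper listed on this page: the five-cell corridor environment, the `step`, `train`, and `greedy_policy` helpers, and the hyperparameter values are all hypothetical. It uses tabular Q-learning with a uniformly random behavior policy, which is only one of many ways to learn a policy from reward feedback.

```python
import random

N_STATES = 5          # hypothetical corridor: cells 0..4, cell 4 is the terminal goal
ACTIONS = (0, 1)      # 0 = step left, 1 = step right


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    # Reward of 1.0 only when the goal cell is reached, 0.0 otherwise.
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Tabular Q-learning; the agent explores with uniformly random actions."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            if done:
                break
            state = nxt
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal cell."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

Under these assumptions the greedy policy should choose "right" (action 1) in every non-terminal cell, since that is the shortest path to the only reward. The discount factor `gamma` is what makes the learned values reflect long-term rather than immediate reward.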

Papers

Showing 8901–8950 of 15113 papers

| Title | Status | Hype |
|---|---|---|
| Regularizing Trajectory Optimization with Denoising Autoencoders | | 0 |
| Regulating Reward Training by Means of Certainty Prediction in a Neural Network-Implemented Pong Game | | 0 |
| REIN-2: Giving Birth to Prepared Reinforcement Learning Agents Using Reinforcement Learning Agents | | 0 |
| ReinDSplit: Reinforced Dynamic Split Learning for Pest Recognition in Precision Agriculture | | 0 |
| ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning | | 0 |
| Reinforce Attack: Adversarial Attack against BERT with Reinforcement Learning | | 0 |
| Reinforced Anytime Bottom Up Rule Learning for Knowledge Graph Completion | | 0 |
| Reinforced Bit Allocation under Task-Driven Semantic Distortion Metrics | | 0 |
| Multi-Agent Reinforcement Learning for Network Load Balancing in Data Center | | 0 |
| Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation | | 0 |
| Reinforced dynamics for enhanced sampling in large atomic and molecular systems | | 0 |
| Reinforced Extractive Summarization with Question-Focused Rewards | | 0 |
| Reinforced Hybrid Genetic Algorithm for the Traveling Salesman Problem | | 0 |
| Reinforced Imitation in Heterogeneous Action Space | | 0 |
| Reinforced Imitation Learning by Free Energy Principle | | 0 |
| Reinforced Inverse Scattering | | 0 |
| Reinforced Labels: Multi-Agent Deep Reinforcement Learning for Point-Feature Label Placement | | 0 |
| Reinforced Latent Reasoning for LLM-based Recommendation | | 0 |
| Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models | | 0 |
| Reinforced MOOCs Concept Recommendation in Heterogeneous Information Networks | | 0 |
| Reinforced Multi-task Approach for Multi-hop Question Generation | | 0 |
| Reinforced Pedestrian Attribute Recognition with Group Optimization Reward | | 0 |
| Reinforced Self-Training (ReST) for Language Modeling | | 0 |
| Reinforced Training Data Selection for Domain Adaptation | | 0 |
| Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API | | 0 |
| Reinforced Video Captioning with Entailment Rewards | | 0 |
| Reinforced Workload Distribution Fairness | | 0 |
| Reinforcement and Imitation Learning via Interactive No-Regret Learning | | 0 |
| Reinforcement-based frugal learning for satellite image change detection | | 0 |
| Reinforcement Evolutionary Learning Method for self-learning | | 0 |
| Reinforcement Explanation Learning | | 0 |
| Reinforcement Leaning for Infinite-Dimensional Systems | | 0 |
| Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound | | 0 |
| Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies | | 0 |
| Reinforcement Learning Agent Design and Optimization with Bandwidth Allocation Model | | 0 |
| Reinforcement Learning Agents for Ubisoft's Roller Champions | | 0 |
| Reinforcement Learning Agent Training with Goals for Real World Tasks | | 0 |
| Reinforcement Learning Algorithm for Traffic Steering in Heterogeneous Network | | 0 |
| Reinforcement Learning Algorithms: An Overview and Classification | | 0 |
| Reinforcement Learning Algorithm Selection | | 0 |
| Reinforcement Learning algorithms for regret minimization in structured Markov Decision Processes | | 0 |
| Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook | | 0 |
| Reinforcement learning and Bayesian data assimilation for model-informed precision dosing in oncology | | 0 |
| Towards interpretable quantum machine learning via single-photon quantum walks | | 0 |
| Reinforcement Learning and Deep Stochastic Optimal Control for Final Quadratic Hedging | | 0 |
| Reinforcement Learning and Graph Neural Networks for Probabilistic Risk Assessment | | 0 |
| Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2 | | 0 |
| Reinforcement Learning and Mixed-Integer Programming for Power Plant Scheduling in Low Carbon Systems: Comparison and Hybridisation | | 0 |
| Reinforcement Learning and Nonparametric Detection of Game-Theoretic Equilibrium Play in Social Networks | | 0 |
| Reinforcement Learning and Video Games | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PPG | Mean Normalized Performance | 0.76 | | Unverified |
| 2 | PPO | Mean Normalized Performance | 0.58 | | Unverified |