SOTAVerified

Reinforcement Learning (RL)

Reinforcement Learning (RL) involves training an agent to take actions in an environment to maximize a cumulative reward signal. The agent interacts with the environment and learns by receiving feedback in the form of rewards or punishments for its actions. The goal of reinforcement learning is to find the optimal policy or decision-making strategy that maximizes the long-term reward.
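The interaction loop described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the five-cell corridor environment, the `step` and `train` helpers, and the hyperparameter values are all hypothetical, and tabular Q-learning is chosen here as one simple way to learn a reward-maximizing policy, not as the method of any paper listed on this page.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)    # move left or move right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2  # illustrative values, not tuned

def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def greedy(q, state, rng):
    """Pick an action index with the highest value, breaking ties at random."""
    best = max(q[state])
    return rng.choice([i for i, v in enumerate(q[state]) if v == best])

def train(episodes=500, seed=0):
    """Learn action values by interacting with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy(q, state, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: immediate reward plus discounted future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for the non-terminal cells, expressed as moves (-1 or +1).
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q_table[s][i])]
          for s in range(N_STATES - 1)]
```

In this toy setting the only reward comes from reaching the goal cell, so the learned greedy policy moves right from every non-terminal cell. The discount factor `GAMMA` is what makes the agent prefer reaching the reward sooner, which is one concrete reading of "long-term reward".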

Papers

Showing 1251–1275 of 15113 papers

Title | Status | Hype
AutoPhase: Compiler Phase-Ordering for High Level Synthesis with Deep Reinforcement Learning | Code | 1
Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling | Code | 1
Giraffe: Using Deep Reinforcement Learning to Play Chess | Code | 1
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation | Code | 1
AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning | Code | 1
Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies | Code | 1
Visual Grounding for Object-Level Generalization in Reinforcement Learning | Code | 1
GNN-DT: Graph Neural Network Enhanced Decision Transformer for Efficient Optimization in Dynamic Environments | Code | 1
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start | Code | 1
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning | Code | 1
Approximate information state for approximate planning and reinforcement learning in partially observed systems | Code | 1
Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds | Code | 1
Accelerating Deep Reinforcement Learning for Digital Twin Network Optimization with Evolutionary Strategies | Code | 1
Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies | Code | 1
Large Language Models are Learnable Planners for Long-Term Recommendation | Code | 1
Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition | Code | 1
Enhancing RL Safety with Counterfactual LLM Reasoning | Code | 1
Environment Agnostic Representation for Visual Reinforcement Learning | Code | 1
Execution-based Code Generation using Deep Reinforcement Learning | Code | 1
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning | Code | 1
Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning | Code | 1
Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning | Code | 1
Collaborative Multi-Agent Dialogue Model Training Via Reinforcement Learning | Code | 1
Graph Constrained Reinforcement Learning for Natural Language Action Spaces | Code | 1
Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PPG | Mean Normalized Performance | 0.76 | - | Unverified
2 | PPO | Mean Normalized Performance | 0.58 | - | Unverified