SOTAVerified

Q-Learning

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.

(Image credit: Playing Atari with Deep Reinforcement Learning)
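To make the idea of "learning a policy" concrete, here is a minimal sketch of tabular Q-learning. It is an illustration under stated assumptions, not the method of any paper listed on this page: the 1-D chain environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are hypothetical. The agent keeps a table of action values Q(s, a), nudges each entry toward the observed reward plus the discounted best value of the next state, and then reads the policy off the table by picking the highest-valued action in each state.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    The agent starts in state 0 and receives reward 1 when it reaches the
    last state, which ends the episode. All other transitions give reward 0.
    """
    rng = random.Random(seed)
    actions = (0, 1)  # 0 = step left, 1 = step right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.choice(actions)
            else:
                best = max(q[state])
                action = rng.choice([a for a in actions if q[state][a] == best])

            # Assumed environment dynamics: deterministic moves, wall at state 0.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: bootstrap from the best next action,
            # with no bootstrap term when the next state is terminal.
            bootstrap = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + bootstrap - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the highest-valued action index for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

In this toy setting the learned policy should step right in every non-terminal state. The entry for the terminal state is never updated, so its action is arbitrary.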

Papers

Showing 150 of 1918 papers

Title | Status | Hype
Flow Q-Learning | Code | 3
ConRFT: A Reinforced Fine-tuning Method for VLA Models via Consistency Policy | Code | 3
Simplifying Deep Temporal Difference Learning | Code | 3
Streaming Deep Reinforcement Learning Finally Works | Code | 3
rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch | Code | 2
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning | Code | 2
Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning | Code | 2
Digi-Q: Learning Q-Value Functions for Training Device-Control Agents | Code | 2
Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather | Code | 2
Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding | Code | 2
Safe Multi-Agent Reinforcement Learning with Bilevel Optimization in Autonomous Driving | Code | 2
Offline RL for Natural Language Generation with Implicit Language Q Learning | Code | 2
Pretrained LLM Adapted with LoRA as a Decision Transformer for Offline RL in Quantitative Trading | Code | 2
ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency | Code | 2
EpidemiOptim: A Toolbox for the Optimization of Control Policies in Epidemiological Models | Code | 1
Energy-based Surprise Minimization for Multi-Agent Value Factorization | Code | 1
Evolution Strategies as a Scalable Alternative to Reinforcement Learning | Code | 1
Distilling Reinforcement Learning Tricks for Video Games | Code | 1
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | Code | 1
Distributed Heuristic Multi-Agent Path Finding with Communication | Code | 1
Extreme Q-Learning: MaxEnt RL without Entropy | Code | 1
Deep Reinforcement Learning-based Intelligent Traffic Signal Controls with Optimized CO2 emissions | Code | 1
Diffusion Policies creating a Trust Region for Offline Reinforcement Learning | Code | 1
An Optimistic Perspective on Offline Deep Reinforcement Learning | Code | 1
Discriminator Soft Actor Critic without Extrinsic Rewards | Code | 1
A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | Code | 1
Dropout Q-Functions for Doubly Efficient Reinforcement Learning | Code | 1
Deep Inverse Q-learning with Constraints | Code | 1
Deep Reinforcement Learning with Double Q-learning | Code | 1
A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks | Code | 1
Continuous Deep Q-Learning with Model-based Acceleration | Code | 1
Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation | Code | 1
Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning | Code | 1
Deep Reinforcement Q-Learning for Intelligent Traffic Signal Control with Partial Detection | Code | 1
Acting in Delayed Environments with Non-Stationary Markov Policies | Code | 1
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past | Code | 1
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning | Code | 1
A Deep Reinforcement Learning Approach for Finding Non-Exploitable Strategies in Two-Player Atari Games | Code | 1
When should we prefer Decision Transformers for Offline Reinforcement Learning? | Code | 1
Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem | Code | 1
Conservative Q-Learning for Offline Reinforcement Learning | Code | 1
Continuous control with deep reinforcement learning | Code | 1
Reinforcement Learning in High-frequency Market Making | Code | 1
Deep Active Inference for Partially Observable MDPs | Code | 1
FACMAC: Factored Multi-Agent Centralised Policy Gradients | Code | 1
Deep Recurrent Q-Learning for Partially Observable MDPs | Code | 1
Boosting Continuous Control with Consistency Policy | Code | 1
Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? | Code | 1
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning | Code | 1
Backprop-Free Reinforcement Learning with Active Neural Generative Coding | Code | 1

No leaderboard results yet.