Multiagent Cooperation and Competition with Deep Reinforcement Learning | Nov 27, 2015 | Deep Reinforcement Learning, Q-Learning | Code Available | 1
Prioritized Experience Replay | Nov 18, 2015 | Atari Games, reinforcement-learning | Code Available | 1
Deep Reinforcement Learning in Parameterized Action Space | Nov 13, 2015 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 1
Deep Reinforcement Learning with Double Q-learning | Sep 22, 2015 | Atari Games, Deep Reinforcement Learning | Code Available | 1
Continuous control with deep reinforcement learning | Sep 9, 2015 | Action Detection, continuous-control | Code Available | 1
Giraffe: Using Deep Reinforcement Learning to Play Chess | Sep 4, 2015 | BIG-bench Machine Learning, Deep Reinforcement Learning | Code Available | 1
Weight Uncertainty in Neural Networks | May 20, 2015 | Bayesian Inference, General Classification | Code Available | 1
Optimizing the CVaR via Sampling | Apr 15, 2014 | reinforcement-learning, Reinforcement Learning | Code Available | 1
Scalable Planning and Learning for Multiagent POMDPs: Extended Version | Apr 4, 2014 | reinforcement-learning, Reinforcement Learning | Code Available | 1
Off-Policy General Value Functions to Represent Dynamic Role Assignments in RoboCup 3D Soccer Simulation | Feb 18, 2014 | Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 1
Playing Atari with Deep Reinforcement Learning | Dec 19, 2013 | Atari Games, Deep Reinforcement Learning | Code Available | 1
Artist Agent: A Reinforcement Learning Approach to Automatic Stroke Generation in Oriental Ink Painting | Jun 18, 2012 | reinforcement-learning, Reinforcement Learning (RL) | Code Available | 1
Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved) | Jul 17, 2025 | continuous-control, Continuous Control | Unverified | 0
Aligning Humans and Robots via Reinforcement Learning from Implicit Human Feedback | Jul 17, 2025 | EEG, MuJoCo | Unverified | 0
From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning | Jul 17, 2025 | D4RL, Offline RL | Unverified | 0
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities | Jul 17, 2025 | Language Modeling, Language Modelling | Unverified | 0
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks | Jul 17, 2025 | Math, Mathematical Reasoning | Unverified | 0
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation | Jul 17, 2025 | Math, Reinforcement Learning (RL) | Unverified | 0
Kevin: Multi-Turn RL for Generating CUDA Kernels | Jul 16, 2025 | GPU, Reinforcement Learning (RL) | Unverified | 0
Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training | Jul 16, 2025 | Code Generation, Math | Unverified | 0
Fly, Fail, Fix: Iterative Game Repair with Reinforcement Learning and Large Multimodal Models | Jul 16, 2025 | Game Design, Reinforcement Learning (RL) | Unverified | 0
Exploring the robustness of TractOracle methods in RL-based tractography | Jul 15, 2025 | Diffusion MRI, reinforcement-learning | Code Available | 0
Real-Time Bayesian Detection of Drift-Evasive GNSS Spoofing in Reinforcement Learning Based UAV Deconfliction | Jul 15, 2025 | Change Point Detection, Reinforcement Learning (RL) | Unverified | 0
Illuminating the Three Dogmas of Reinforcement Learning under Evolutionary Light | Jul 15, 2025 | Reinforcement Learning (RL) | Unverified | 0
Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing | Jul 15, 2025 | Knowledge Tracing, Math | Code Available | 0
Bridging the Gap in Vision Language Models in Identifying Unsafe Concepts Across Modalities | Jul 15, 2025 | Reinforcement Learning (RL) | Code Available | 0
High-Throughput Distributed Reinforcement Learning via Adaptive Policy Synchronization | Jul 15, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 0
Local Pairwise Distance Matching for Backpropagation-Free Reinforcement Learning | Jul 15, 2025 | Policy Gradient Methods, reinforcement-learning | Unverified | 0
The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs | Jul 10, 2025 | Multimodal Reasoning, Reinforcement Learning (RL) | Unverified | 0
Scaling RL to Long Videos | Jul 10, 2025 | Reinforcement Learning (RL), Spatial Reasoning | Unverified | 0
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning | Jul 9, 2025 | Reinforcement Learning (RL) | Unverified | 0
Squeeze the Soaked Sponge: Efficient Off-policy Reinforcement Finetuning for Large Language Model | Jul 9, 2025 | Language Modeling, Language Modelling | Unverified | 0
Safe Domain Randomization via Uncertainty-Aware Out-of-Distribution Detection and Policy Adaptation | Jul 8, 2025 | MuJoCo, Out-of-Distribution Detection | Unverified | 0
Detecting and Mitigating Reward Hacking in Reinforcement Learning Systems: A Comprehensive Empirical Study | Jul 8, 2025 | MuJoCo, Recommendation Systems | Unverified | 0
CogniSQL-R1-Zero: Lightweight Reinforced Reasoning for Efficient SQL Generation | Jul 8, 2025 | Reinforcement Learning (RL), TAG | Unverified | 0
FEVO: Financial Knowledge Expansion and Reasoning Evolution for Large Language Models | Jul 8, 2025 | Logical Reasoning, Reinforcement Learning (RL) | Unverified | 0
Robust Bandwidth Estimation for Real-Time Communication with Offline Reinforcement Learning | Jul 8, 2025 | Offline RL, Reinforcement Learning (RL) | Unverified | 0
2048: Reinforcement Learning in a Delayed Reward Environment | Jul 7, 2025 | quantile regression, reinforcement-learning | Unverified | 0
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning | Jul 7, 2025 | Reinforcement Learning (RL), Visual Reasoning | Unverified | 0
Generalized Adaptive Transfer Network: Enhancing Transfer Learning in Reinforcement Learning Across Domains | Jul 2, 2025 | Atari Games, Chatbot | Code Available | 0
Constructing Non-Markovian Decision Process via History Aggregator | Jun 30, 2025 | Decision Making, Reinforcement Learning (RL) | Code Available | 0
Listener-Rewarded Thinking in VLMs for Image Preferences | Jun 28, 2025 | Memorization, Reinforcement Learning (RL) | Unverified | 0
A Survey of Continual Reinforcement Learning | Jun 27, 2025 | Continual Learning, Decision Making | Unverified | 0
Advancements and Challenges in Continual Reinforcement Learning: A Comprehensive Review | Jun 27, 2025 | Continual Learning, Diversity | Unverified | 0
Homogenization of Multi-agent Learning Dynamics in Finite-state Markov Games | Jun 26, 2025 | Reinforcement Learning (RL) | Code Available | 0
RL-Selector: Reinforcement Learning-Guided Data Selection via Redundancy Assessment | Jun 26, 2025 | Reinforcement Learning (RL) | Unverified | 0
Optimising 4th-Order Runge-Kutta Methods: A Dynamic Heuristic Approach for Efficiency and Low Storage | Jun 26, 2025 | AutoML, Computational Efficiency | Unverified | 0
Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning | Jun 26, 2025 | Decision Making, Hierarchical Reinforcement Learning | Unverified | 0
Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning | Jun 26, 2025 | Action Generation, Decision Making | Unverified | 0
Curriculum-Guided Antifragile Reinforcement Learning for Secure UAV Deconfliction under Observation-Space Attacks | Jun 26, 2025 | Decision Making, Reinforcement Learning (RL) | Unverified | 0