GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning | Apr 3, 2025 | Reinforcement Learning (RL) | Code Available | 2
Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models | Apr 3, 2025 | GSM8K, Reinforcement Learning (RL) | Code Available | 0
De Novo Molecular Design Enabled by Direct Preference Optimization and Curriculum Learning | Apr 2, 2025 | Drug Discovery, Reinforcement Learning (RL) | Unverified | 0
GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning | Apr 2, 2025 | Decision Making, Diagnostic | Code Available | 1
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models? | Apr 2, 2025 | Attribute, Reinforcement Learning (RL) | Code Available | 1
ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning | Apr 2, 2025 | Reinforcement Learning (RL) | Code Available | 1
Probabilistic Curriculum Learning for Goal-Based Reinforcement Learning | Apr 2, 2025 | continuous-control, Continuous Control | Unverified | 0
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study | Apr 1, 2025 | Code Generation, Math | Unverified | 0
Grounding Multimodal LLMs to Embodied Agents that Ask for Help with Reinforcement Learning | Apr 1, 2025 | Reinforcement Learning (RL), Vision-Language-Action | Unverified | 0
Probabilistically safe and efficient model-based Reinforcement Learning | Apr 1, 2025 | Model-based Reinforcement Learning, Model Predictive Control | Code Available | 1
Value Iteration for Learning Concurrently Executable Robotic Control Tasks | Apr 1, 2025 | Reinforcement Learning (RL) | Code Available | 0
MPCritic: A plug-and-play MPC architecture for reinforcement learning | Apr 1, 2025 | Model Predictive Control, Reinforcement Learning (RL) | Code Available | 1
JudgeLRM: Large Reasoning Models as a Judge | Mar 31, 2025 | Reinforcement Learning (RL) | Unverified | 0
Nuclear Microreactor Control with Deep Reinforcement Learning | Mar 31, 2025 | Deep Reinforcement Learning, reinforcement-learning | Unverified | 0
Noise-based reward-modulated learning | Mar 31, 2025 | Reinforcement Learning (RL) | Unverified | 0
Fair Dynamic Spectrum Access via Fully Decentralized Multi-Agent Reinforcement Learning | Mar 31, 2025 | Fairness, Multi-agent Reinforcement Learning | Unverified | 0
A Survey of Reinforcement Learning-Based Motion Planning for Autonomous Driving: Lessons Learned from a Driving Task Perspective | Mar 31, 2025 | Autonomous Driving, Decision Making | Unverified | 0
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Mar 31, 2025 | Logical Reasoning, Multiple-choice | Code Available | 2
Reinforcement Learning for Safe Autonomous Two Device Navigation of Cerebral Vessels in Mechanical Thrombectomy | Mar 31, 2025 | Autonomous Navigation, Navigate | Unverified | 0
Accelerating High-Efficiency Organic Photovoltaic Discovery via Pretrained Graph Neural Networks and Generative Reinforcement Learning | Mar 31, 2025 | Reinforcement Learning (RL) | Unverified | 0
HACTS: a Human-As-Copilot Teleoperation System for Robot Learning | Mar 31, 2025 | Autonomous Vehicles, Imitation Learning | Unverified | 0
Advanced Deep Learning and Large Language Models: Comprehensive Insights for Cancer Detection | Mar 30, 2025 | Diagnostic, Federated Learning | Unverified | 0
Reinforcement Learning for Active Matter | Mar 30, 2025 | reinforcement-learning, Reinforcement Learning | Unverified | 0
Reinforcement Learning-based Token Pruning in Vision Transformers: A Markov Game Approach | Mar 30, 2025 | Decision Making, Reinforcement Learning (RL) | Code Available | 0
A Systematic Decade Review of Trip Route Planning with Travel Time Estimation based on User Preferences and Behavior | Mar 30, 2025 | Data Integration, Federated Learning | Unverified | 0
Handling Delay in Real-Time Reinforcement Learning | Mar 30, 2025 | MuJoCo, reinforcement-learning | Code Available | 0
Multi-Agent Reinforcement Learning for Graph Discovery in D2D-Enabled Federated Learning | Mar 29, 2025 | Diversity, Federated Learning | Unverified | 0
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL | Mar 29, 2025 | Natural Language Understanding, Reinforcement Learning (RL) | Unverified | 0
RL2Grid: Benchmarking Reinforcement Learning in Power Grid Operations | Mar 29, 2025 | Benchmarking, reinforcement-learning | Unverified | 0
FLAM: Foundation Model-Based Body Stabilization for Humanoid Locomotion and Manipulation | Mar 28, 2025 | Reinforcement Learning (RL) | Unverified | 0
Reinforcement Learning for Machine Learning Model Deployment: Evaluating Multi-Armed Bandits in ML Ops Environments | Mar 28, 2025 | Management, Model Selection | Unverified | 0
Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning | Mar 28, 2025 | Efficient Exploration, Language Modeling | Unverified | 0
Bresa: Bio-inspired Reflexive Safe Reinforcement Learning for Contact-Rich Robotic Tasks | Mar 27, 2025 | Reinforcement Learning (RL), Safe Exploration | Unverified | 0
Pretrained Bayesian Non-parametric Knowledge Prior in Robotic Long-Horizon Reinforcement Learning | Mar 27, 2025 | Reinforcement Learning (RL) | Code Available | 0
Reward Design for Reinforcement Learning Agents | Mar 27, 2025 | Meta-Learning, reinforcement-learning | Code Available | 0
Controlling Large Language Model with Latent Actions | Mar 27, 2025 | CoLA, Language Modeling | Code Available | 0
ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation | Mar 27, 2025 | Question Answering, RAG | Code Available | 1
UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning | Mar 27, 2025 | Model Optimization, Reinforcement Learning (RL) | Code Available | 2
Video-R1: Reinforcing Video Reasoning in MLLMs | Mar 27, 2025 | MVBench, Reinforcement Learning (RL) | Code Available | 4
Reasoning Beyond Limits: Advances and Open Problems for LLMs | Mar 26, 2025 | Mixture-of-Experts, RAG | Unverified | 0
TAR: Teacher-Aligned Representations via Contrastive Learning for Quadrupedal Locomotion | Mar 26, 2025 | Contrastive Learning, Reinforcement Learning (RL) | Unverified | 0
Synthesizing world models for bilevel planning | Mar 26, 2025 | Large Language Model, Program Synthesis | Unverified | 0
The Crucial Role of Problem Formulation in Real-World Reinforcement Learning | Mar 26, 2025 | Reinforcement Learning (RL) | Unverified | 0
Harmonia: A Multi-Agent Reinforcement Learning Approach to Data Placement and Migration in Hybrid Storage Systems | Mar 26, 2025 | Management, Multi-agent Reinforcement Learning | Unverified | 0
Understanding R1-Zero-Like Training: A Critical Perspective | Mar 26, 2025 | Reinforcement Learning (RL) | Code Available | 5
Learning Adaptive Dexterous Grasping from Single Demonstrations | Mar 26, 2025 | Reinforcement Learning (RL) | Unverified | 0
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging | Mar 26, 2025 | Prompt Engineering, Reinforcement Learning (RL) | Code Available | 2
Offline Reinforcement Learning with Discrete Diffusion Skills | Mar 26, 2025 | Decoder, Offline RL | Unverified | 0
Generalized Phase Pressure Control Enhanced Reinforcement Learning for Traffic Signal Control | Mar 26, 2025 | Reinforcement Learning (RL), Traffic Signal Control | Code Available | 0
Model-Based Offline Reinforcement Learning with Adversarial Data Augmentation | Mar 26, 2025 | D4RL, Data Augmentation | Unverified | 0