Real-Time Optimal Design of Experiment for Parameter Identification of Li-Ion Cell Electrochemical Model Apr 22, 2025 Experimental Design Reinforcement Learning (RL)
Unverified 0 SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning Apr 22, 2025 Multiple-choice reinforcement-learning
Unverified 0 StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation Apr 22, 2025 Reinforcement Learning (RL) Scheduling
Unverified 0 Insights from Verification: Training a Verilog Generation LLM with Reinforcement Learning with Testbench Feedback Apr 22, 2025 Code Generation Hallucination
Unverified 0 LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning Apr 21, 2025 Language Modeling Language Modelling
Unverified 0 Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL Apr 21, 2025 Reinforcement Learning (RL) Zero-Shot Learning
Unverified 0 FlowReasoner: Reinforcing Query-Level Meta-Agents Apr 21, 2025 Reinforcement Learning (RL)
Code Available 2 Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning Apr 21, 2025 All Form
Code Available 2 Learning to Reason under Off-Policy Guidance Apr 21, 2025 Math Reinforcement Learning (RL)
Code Available 3 OTC: Optimal Tool Calls via Reinforcement Learning Apr 21, 2025 Math reinforcement-learning
Unverified 0 Dynamic Contrastive Skill Learning with State-Transition Based Skill Clustering and Dynamic Length Adjustment Apr 21, 2025 Contrastive Learning Decision Making
Unverified 0 Relation-R1: Cognitive Chain-of-Thought Guided Reinforcement Learning for Unified Relational Comprehension Apr 20, 2025 Graph Generation Reinforcement Learning (RL)
Unverified 0 Generative Auto-Bidding with Value-Guided Explorations Apr 20, 2025 Reinforcement Learning (RL)
Code Available 2 Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning Apr 19, 2025 Computational Efficiency Q-Learning
Unverified 0 Quantum-Enhanced Reinforcement Learning for Power Grid Security Assessment Apr 19, 2025 Computational Efficiency Navigate
Unverified 0 Improving RL Exploration for LLM Reasoning through Retrospective Replay Apr 19, 2025 Code Generation Mathematical Reasoning
Unverified 0 Unlearning Works Better Than You Think: Local Reinforcement-Based Selection of Auxiliary Objectives Apr 19, 2025 Reinforcement Learning (RL)
Unverified 0 Improving Generalization in Intent Detection: GRPO with Reward-Based Curriculum Sampling Apr 18, 2025 Intent Detection Reinforcement Learning (RL)
Unverified 0 Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge Reasoning Apr 18, 2025 Reinforcement Learning (RL)
Code Available 0 Compile Scene Graphs with Reinforcement Learning Apr 18, 2025 reinforcement-learning Reinforcement Learning
Code Available 1 SwitchMT: An Adaptive Context Switching Methodology for Scalable Multi-Task Learning in Intelligent Autonomous Agents Apr 18, 2025 Atari Games Multi-Task Learning
Unverified 0 Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning Apr 18, 2025 All GSM8K
Unverified 0 Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning Apr 17, 2025 Multimodal Reasoning Reinforcement Learning (RL)
Code Available 2 Evolutionary Policy Optimization Apr 17, 2025 Policy Gradient Methods Reinforcement Learning (RL)
Unverified 0 TraCeS: Trajectory Based Credit Assignment From Sparse Safety Feedback Apr 17, 2025 continuous-control Continuous Control
Unverified 0 LLMs Meet Finance: Fine-Tuning Foundation Models for the Open FinLLM Leaderboard Apr 17, 2025 Reinforcement Learning (RL)
Unverified 0 NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation Apr 17, 2025 Data Augmentation Diversity
Code Available 2 Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration Apr 17, 2025 Data Augmentation Human-Object Interaction Detection
Unverified 0 RL-PINNs: Reinforcement Learning-Driven Adaptive Sampling for Efficient Training of PINNs Apr 17, 2025 Reinforcement Learning (RL)
Unverified 0 SkyReels-V2: Infinite-length Film Generative Model Apr 17, 2025 Large Language Model model
Code Available 9 ToolRL: Reward is All Tool Learning Needs Apr 16, 2025 All Reinforcement Learning (RL)
Code Available 0 VIPO: Value Function Inconsistency Penalized Offline Reinforcement Learning Apr 16, 2025 D4RL Offline RL
Unverified 0 pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild Apr 16, 2025 Benchmarking object-detection
Unverified 0 d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning Apr 16, 2025 Language Modeling Language Modelling
Unverified 0 Evolutionary Reinforcement Learning for Interpretable Decision-Making in Supply Chain Management Apr 16, 2025 Decision Making Management
Unverified 0 Control of Rayleigh-Bénard Convection: Effectiveness of Reinforcement Learning in the Turbulent Regime Apr 16, 2025 Reinforcement Learning (RL)
Code Available 0 ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Apr 15, 2025 Math Mathematical Reasoning
Unverified 0 Position Paper: Rethinking Privacy in RL for Sequential Decision-making in the Age of LLMs Apr 15, 2025 Autonomous Vehicles Decision Making
Unverified 0 Data driven approach towards more efficient Newton-Raphson power flow calculation for distribution grids Apr 15, 2025 Reinforcement Learning (RL)
Code Available 0 Achieving Tighter Finite-Time Rates for Heterogeneous Federated Stochastic Approximation under Markovian Sampling Apr 15, 2025 Reinforcement Learning (RL)
Unverified 0 Revealing Covert Attention by Analyzing Human and Reinforcement Learning Agent Gameplay Apr 15, 2025 Reinforcement Learning (RL)
Unverified 0 A Clean Slate for Offline Reinforcement Learning Apr 15, 2025 Offline RL reinforcement-learning
Code Available 3 Hallucination-Aware Generative Pretrained Transformer for Cooperative Aerial Mobility Control Apr 15, 2025 Hallucination Reinforcement Learning (RL)
Unverified 0 Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models Apr 15, 2025 Humanoid Control Reinforcement Learning (RL)
Code Available 4 Next-Future: Sample-Efficient Policy Learning for Robotic-Arm Tasks Apr 15, 2025 Multi-Goal Reinforcement Learning Reinforcement Learning (RL)
Unverified 0 DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning Apr 15, 2025 Mathematical Reasoning Reinforcement Learning (RL)
Code Available 3 A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce Apr 15, 2025 Reinforcement Learning (RL)
Code Available 3 ReZero: Enhancing LLM search ability by trying one-more-time Apr 15, 2025 Language Modeling Language Modelling
Unverified 0 MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning Apr 14, 2025 Machine Translation Reinforcement Learning (RL)
Code Available 2 CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent Apr 13, 2025 Large Language Model Recommendation Systems
Unverified 0