Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning Mar 20, 2025 Decision Making Language Modeling
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Mar 18, 2025 3D Face Animation Common Sense Reasoning
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL Mar 10, 2025 Logical Reasoning Multimodal Reasoning
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Mar 10, 2025 Multimodal Reasoning Reinforcement Learning (RL)
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Mar 7, 2025 RAG Reinforcement Learning (RL)
DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning Feb 28, 2025 Information Retrieval reinforcement-learning
TDMPBC: Self-Imitative Reinforcement Learning for Humanoid Robot Control Feb 24, 2025 reinforcement-learning Reinforcement Learning
Diffusion Policy Policy Optimization Sep 1, 2024 continuous-control Continuous Control
SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning Aug 14, 2024 CPU Motion Planning
Pearl: A Production-ready Reinforcement Learning Agent Dec 6, 2023 Benchmarking reinforcement-learning
RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark Jun 29, 2023 Combinatorial Optimization Computational Efficiency
TorchRL: A data-driven decision-making library for PyTorch Jun 1, 2023 Computational Efficiency Decision Making
Let's Verify Step by Step May 31, 2023 Active Learning Math
Mastering Diverse Domains through World Models Jan 10, 2023 Atari Games 100k Decision Making
DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality Oct 25, 2022 Deep Reinforcement Learning GPU
Discovering faster matrix multiplication algorithms with reinforcement learning Oct 5, 2022 Deep Reinforcement Learning reinforcement-learning
RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem Nov 25, 2020 reinforcement-learning Reinforcement Learning
RLlib: Abstractions for Distributed Reinforcement Learning Dec 26, 2017 reinforcement-learning Reinforcement Learning
Ray: A Distributed Framework for Emerging AI Applications Dec 16, 2017 reinforcement-learning Reinforcement Learning
VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning May 24, 2025 GPU Reinforcement Learning (RL)
R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO May 22, 2025 Reinforcement Learning (RL)
Arctic-Text2SQL-R1: Simple Rewards, Strong Reasoning in Text-to-SQL May 22, 2025 Natural Language Understanding Reinforcement Learning (RL)
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning May 22, 2025 Reinforcement Learning (RL)
General-Reasoner: Advancing LLM Reasoning Across All Domains May 20, 2025 All Math
ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning May 19, 2025 Machine Translation reinforcement-learning
Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning May 18, 2025 Reinforcement Learning (RL) Visual Grounding
Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward May 18, 2025 GPU Graph Matching
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning May 13, 2025 Reinforcement Learning (RL) Visual Reasoning
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning May 5, 2025 Reinforcement Learning (RL)
Tina: Tiny Reasoning Models via LoRA Apr 22, 2025 Reinforcement Learning (RL)
Learning to Reason under Off-Policy Guidance Apr 21, 2025 Math Reinforcement Learning (RL)
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning Apr 15, 2025 Mathematical Reasoning Reinforcement Learning (RL)
A Clean Slate for Offline Reinforcement Learning Apr 15, 2025 Offline RL reinforcement-learning
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce Apr 15, 2025 Reinforcement Learning (RL)
Perception-R1: Pioneering Perception Policy with Reinforcement Learning Apr 10, 2025 reinforcement-learning Reinforcement Learning
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving Apr 3, 2025 Reinforcement Learning (RL)
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse Mar 24, 2025 Layout Generation Reinforcement Learning (RL)
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Mar 20, 2025 Mathematical Reasoning Reinforcement Learning (RL)
Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering Mar 14, 2025 Audio Question Answering Question Answering
AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning Mar 10, 2025 Autonomous Driving Common Sense Reasoning
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Mar 3, 2025 Reinforcement Learning (RL)
Demystifying Long Chain-of-Thought Reasoning in LLMs Feb 5, 2025 Reinforcement Learning (RL)
Flow Q-Learning Feb 4, 2025 Action Generation D4RL
Test-Time Training Scaling Laws for Chemical Exploration in Drug Design Jan 31, 2025 Drug Design Drug Discovery
SINERGYM -- A virtual testbed for building energy optimization with Reinforcement Learning Dec 11, 2024 continuous-control Continuous Control
Reinforcement Learning Enhanced LLMs: A Survey Dec 5, 2024 reinforcement-learning Reinforcement Learning
o1-Coder: an o1 Replication for Coding Nov 29, 2024 Reinforcement Learning (RL)
OGBench: Benchmarking Offline Goal-Conditioned RL Oct 26, 2024 Benchmarking reinforcement-learning
Streaming Deep Reinforcement Learning Finally Works Oct 18, 2024 Atari Games Deep Reinforcement Learning
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control Oct 4, 2024 Motion Generation Reinforcement Learning (RL)