| Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning | Feb 20, 2025 | Mathreinforcement-learning | CodeCode Available | 7 | 5 |
| LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! | Feb 11, 2025 | Large Language ModelMath | CodeCode Available | 7 | 5 |
| EvoAgentX: An Automated Framework for Evolving Agentic Workflows | Jul 4, 2025 | Code GenerationMath | CodeCode Available | 7 | 5 |
| O1 Replication Journey: A Strategic Progress Report -- Part 1 | Oct 8, 2024 | Mathscientific discovery | CodeCode Available | 7 | 5 |
| AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Jun 1, 2023 | Autonomous DrivingCloud Computing | CodeCode Available | 6 | 5 |
| GPT-4 Technical Report | Mar 15, 2023 | answerability predictionArithmetic Reasoning | CodeCode Available | 6 | 5 |
| Mistral 7B | Oct 10, 2023 | answerability predictionArithmetic Reasoning | CodeCode Available | 6 | 5 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jan 28, 2022 | Common Sense ReasoningGSM8K | CodeCode Available | 6 | 5 |
| Qwen Technical Report | Sep 28, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 6 | 5 |
| Process Reinforcement through Implicit Rewards | Feb 3, 2025 | MathReinforcement Learning (RL) | CodeCode Available | 5 | 5 |