| Qwen Technical Report | Sep 28, 2023 | Language Modeling, Language Modelling | Code Available | 6 |
| AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Jun 1, 2023 | Autonomous Driving, Cloud Computing | Code Available | 6 |
| GPT-4 Technical Report | Mar 15, 2023 | answerability prediction, Arithmetic Reasoning | Code Available | 6 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jan 28, 2022 | Common Sense Reasoning, GSM8K | Code Available | 6 |
| Reinforcement Learning from Human Feedback | Apr 16, 2025 | Math, Philosophy | Code Available | 5 |
| Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models | Mar 9, 2025 | Math, Multimodal Reasoning | Code Available | 5 |
| LIMO: Less is More for Reasoning | Feb 5, 2025 | Math, Mathematical Reasoning | Code Available | 5 |
| Process Reinforcement through Implicit Rewards | Feb 3, 2025 | Math, Reinforcement Learning (RL) | Code Available | 5 |
| Free Process Rewards without Process Labels | Dec 2, 2024 | Math | Code Available | 5 |
| OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models | Oct 12, 2024 | Math, reinforcement-learning | Code Available | 5 |
| LiveBench: A Challenging, Contamination-Limited LLM Benchmark | Jun 27, 2024 | Articles, Instruction Following | Code Available | 5 |
| Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | Jun 11, 2024 | Decision Making, GSM8K | Code Available | 5 |
| MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit | Apr 22, 2024 | Math | Code Available | 5 |
| Evolutionary Optimization of Model Merging Recipes | Mar 19, 2024 | Evolutionary Algorithms, Math | Code Available | 5 |
| Common 7B Language Models Already Possess Strong Math Capabilities | Mar 7, 2024 | GSM8K, Math | Code Available | 5 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Aug 18, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 5 |
| Energy-Based Transformers are Scalable Learners and Thinkers | Jul 2, 2025 | Denoising, Image Denoising | Code Available | 4 |
| Skywork Open Reasoner 1 Technical Report | May 28, 2025 | Math, Reinforcement Learning (RL) | Code Available | 4 |
| MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | May 19, 2025 | Math, Mathematical Reasoning | Code Available | 4 |
| AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset | Apr 23, 2025 | Math, Mathematical Reasoning | Code Available | 4 |
| Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond | Mar 13, 2025 | Domain Generalization, Math | Code Available | 4 |
| CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction | Feb 11, 2025 | Code Generation, Math | Code Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement Learning, Language Modeling | Code Available | 4 |
| InternLM2.5-StepProver: Advancing Automated Theorem Proving via Expert Iteration on Large-Scale LEAN Problems | Oct 21, 2024 | Automated Theorem Proving, CPU | Code Available | 4 |
| SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Oct 11, 2024 | GSM8K, Math | Code Available | 4 |