| Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B | Jun 11, 2024 | Decision MakingGSM8K | CodeCode Available | 5 |
| DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning | May 20, 2025 | HallucinationMathematical Reasoning | CodeCode Available | 5 |
| Search-o1: Agentic Search-Enhanced Large Reasoning Models | Jan 9, 2025 | Code Generation | CodeCode Available | 5 |
| Group-in-Group Policy Optimization for LLM Agent Training | May 16, 2025 | GPUMathematical Reasoning | CodeCode Available | 5 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Aug 18, 2023 | Arithmetic ReasoningGSM8K | CodeCode Available | 5 |
| MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine | Jul 11, 2024 | Contrastive LearningLanguage Modelling | CodeCode Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement LearningLanguage Modeling | CodeCode Available | 4 |
| Knowledge Fusion of Large Language Models | Jan 19, 2024 | Code GenerationCommon Sense Reasoning | CodeCode Available | 4 |
| ChatGPT for Robotics: Design Principles and Model Abilities | Feb 20, 2023 | Mathematical ReasoningPrompt Engineering | CodeCode Available | 4 |
| Galactica: A Large Language Model for Science | Nov 16, 2022 | AnachronismsBias Detection | CodeCode Available | 4 |