| Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory | Apr 10, 2025 | Math, MMLU | Code Available | 3 | 5 |
| MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning | Jun 13, 2024 | Instruction Following, Math | Code Available | 3 | 5 |
| Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks | Nov 22, 2022 | Math | Code Available | 3 | 5 |
| Learning to Reason under Off-Policy Guidance | Apr 21, 2025 | Math, Reinforcement Learning (RL) | Code Available | 3 | 5 |
| TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON | Jul 22, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models | Jun 13, 2024 | Math, Quantization | Code Available | 2 | 5 |
| MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Sep 11, 2023 | Math, Mathematical Reasoning | Code Available | 2 | 5 |
| Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Nov 9, 2023 | Math, Question Answering | Code Available | 2 | 5 |
| AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | Apr 13, 2023 | Decision Making, Math | Code Available | 2 | 5 |
| MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems | Apr 6, 2024 | Logical Reasoning, Math | Code Available | 2 | 5 |