| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Jan 26, 2024 | Code GenerationInstruction Following | CodeCode Available | 7 |
| Mistral 7B | Oct 10, 2023 | answerability predictionArithmetic Reasoning | CodeCode Available | 6 |
| Gorilla: Large Language Model Connected with Massive APIs | May 24, 2023 | HallucinationLanguage Modeling | CodeCode Available | 6 |
| Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Mar 22, 2023 | Arithmetic ReasoningMathematical Reasoning | CodeCode Available | 6 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | AnachronismsAnalogical Similarity | CodeCode Available | 6 |
| R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration | May 30, 2025 | Mathematical Reasoning | CodeCode Available | 5 |
| DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning | May 20, 2025 | HallucinationMathematical Reasoning | CodeCode Available | 5 |
| Group-in-Group Policy Optimization for LLM Agent Training | May 16, 2025 | GPUMathematical Reasoning | CodeCode Available | 5 |
| DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition | Apr 30, 2025 | Automated Theorem ProvingLarge Language Model | CodeCode Available | 5 |
| Kimi-VL Technical Report | Apr 10, 2025 | Long-Context UnderstandingMathematical Reasoning | CodeCode Available | 5 |