| Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning | May 18, 2025 | GSM8K, In-Context Learning | Code Available | 1 |
| Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models | Nov 10, 2023 | GSM8K, Memorization | Code Available | 1 |
| GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models | Oct 7, 2024 | GSM8K, Logical Reasoning | Code Available | 1 |
| Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries | Dec 12, 2024 | 4k, GSM8K | Code Available | 1 |
| GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers | Dec 12, 2024 | GSM8K, Prompt Engineering | Code Available | 1 |
| Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | Nov 29, 2024 | GSM8K, Math | Code Available | 1 |
| Learning From Mistakes Makes LLM Better Reasoner | Oct 31, 2023 | GSM8K, Math | Code Available | 1 |
| Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | May 28, 2022 | Arithmetic Reasoning, Efficient Exploration | Code Available | 1 |
| Learning Goal-Conditioned Representations for Language Reward Models | Jul 18, 2024 | GSM8K, Math | Code Available | 1 |
| Large Language Models as Optimizers | Sep 7, 2023 | GSM8K | Code Available | 1 |