| A Careful Examination of Large Language Model Performance on Grade School Arithmetic | May 1, 2024 | GSM8KLanguage Modeling | —Unverified | 0 |
| Iterative Reasoning Preference Optimization | Apr 30, 2024 | ARCGSM8K | —Unverified | 0 |
| Markovian Transformers for Informative Language Modeling | Apr 29, 2024 | GSM8KInformativeness | CodeCode Available | 1 |
| LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding | Apr 25, 2024 | GSM8KHellaSwag | CodeCode Available | 3 |
| Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Solvers for Math Word Problems | Apr 23, 2024 | Arithmetic ReasoningGSM8K | CodeCode Available | 1 |
| PARAMANU-GANITA: Language Model with Mathematical Capabilities | Apr 22, 2024 | Domain AdaptationGSM8K | —Unverified | 0 |
| Relevant or Random: Can LLMs Truly Perform Analogical Reasoning? | Apr 19, 2024 | GSM8K | —Unverified | 0 |
| Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models | Apr 18, 2024 | GSM8KMMLU | —Unverified | 0 |
| Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing | Apr 18, 2024 | Arithmetic ReasoningGSM8K | CodeCode Available | 1 |
| Efficient Contextual LLM Cascades through Budget-Constrained Policy Learning | Apr 17, 2024 | GSM8KNavigate | —Unverified | 0 |