| Cumulative Reasoning with Large Language Models | Aug 8, 2023 | Decision Making, Logical Reasoning | Code Available | 2 |
| MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities | Aug 4, 2023 | Math, MM-Vet | Code Available | 2 |
| LeanDojo: Theorem Proving with Retrieval-Augmented Language Models | Jun 27, 2023 | Automated Theorem Proving, GPU | Code Available | 2 |
| Progressive-Hint Prompting Improves Reasoning in Large Language Models | Apr 19, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 2 |
| AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | Apr 13, 2023 | Decision Making, Math | Code Available | 2 |
| Specializing Smaller Language Models towards Multi-Step Reasoning | Jan 30, 2023 | Math, Model Selection | Code Available | 2 |
| A Survey of Deep Learning for Mathematical Reasoning | Dec 20, 2022 | Deep Learning, Math | Code Available | 2 |
| Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem | Oct 21, 2022 | Contrastive Learning, Math | Code Available | 2 |
| Language Models are Multilingual Chain-of-Thought Reasoners | Oct 6, 2022 | GSM8K, Math | Code Available | 2 |
| PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto Debugging, Code Generation | Code Available | 2 |