| Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting | Aug 18, 2024 | HumanEval, Mathematical Reasoning | —Unverified | 0 |
| Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning | Aug 16, 2024 | Math, Mathematical Reasoning | Code Available | 1 |
| MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Aug 14, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty | Aug 13, 2024 | Mathematical Reasoning, Question Answering | Code Available | 0 |
| Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement | Aug 6, 2024 | Code Generation, Disentanglement | Code Available | 1 |
| MathLearner: A Large Language Model Agent Framework for Learning to Solve Mathematical Problems | Aug 3, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| AI-Assisted Generation of Difficult Math Questions | Jul 30, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process | Jul 29, 2024 | GSM8K, Math | Code Available | 2 |
| SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages | Jul 29, 2024 | Diversity, Instruction Following | Code Available | 2 |
| Optimizing Numerical Estimation and Operational Efficiency in the Legal Domain through Large Language Models | Jul 26, 2024 | Mathematical Reasoning | —Unverified | 0 |