| MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Aug 14, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty | Aug 13, 2024 | Mathematical Reasoning, Question Answering | Code Available | 0 |
| MathLearner: A Large Language Model Agent Framework for Learning to Solve Mathematical Problems | Aug 3, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| AI-Assisted Generation of Difficult Math Questions | Jul 30, 2024 | Math, Mathematical Reasoning | Code Available | 0 |
| Optimizing Numerical Estimation and Operational Efficiency in the Legal Domain through Large Language Models | Jul 26, 2024 | Mathematical Reasoning | Unverified | 0 |
| Reliable Reasoning Beyond Natural Language | Jul 16, 2024 | GSM8K, Mathematical Reasoning | Unverified | 0 |
| A Comprehensive Evaluation of Large Language Models on Temporal Event Forecasting | Jul 16, 2024 | Mathematical Reasoning, Question Answering | Unverified | 0 |
| Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together | Jul 15, 2024 | Arithmetic Reasoning, Language Modeling | Unverified | 0 |
| Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model | Jul 14, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models | Jul 12, 2024 | GSM8K, Math | Unverified | 0 |