| A Robustly Optimized Long Text to Math Models for Numerical Reasoning On FinQA | Jun 29, 2022 | Math | Code Available | 0 | 5 |
| LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation | Dec 10, 2024 | Math | Code Available | 0 | 5 |
| Library Learning Doesn't: The Curious Case of the Single-Use "Library" | Oct 26, 2024 | Math, Mathematical Reasoning | Code Available | 0 | 5 |
| SBI-RAG: Enhancing Math Word Problem Solving for Students through Schema-Based Instruction and Retrieval-Augmented Generation | Oct 17, 2024 | GSM8K, Language Modeling | Code Available | 0 | 5 |
| Leveraging Web-Crawled Data for High-Quality Fine-Tuning | Aug 15, 2024 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| FINNger -- Applying artificial intelligence to ease math learning for children | May 26, 2021 | Hand Pose Estimation, Math | Code Available | 0 | 5 |
| ChatBench: From Static Benchmarks to Human-AI Evaluation | Mar 22, 2025 | Math, MMLU | Code Available | 0 | 5 |
| Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning | May 29, 2023 | Language Modelling, Large Language Model | Code Available | 0 | 5 |
| LLM Performance for Code Generation on Noisy Tasks | May 29, 2025 | Benchmarking, Code Generation | Code Available | 0 | 5 |
| MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training | Feb 28, 2025 | Language Modeling, Language Modelling | Code Available | 0 | 5 |