| MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification | Apr 7, 2024 | Image Comprehension, Math | Code Available | 0 |
| Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models | Mar 27, 2025 | Data Visualization, Math | Code Available | 0 |
| Unsupervised learning-based calibration scheme for Rough Bergomi model | Dec 3, 2024 | Math | Code Available | 0 |
| Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision | Jan 14, 2025 | Instruction Following, Math | Code Available | 0 |
| A mixed policy to improve performance of language models on math problems | Jul 17, 2023 | GSM8K, Math | Code Available | 0 |
| Teaching Machines to Code: Neural Markup Generation with Visual Attention | Feb 15, 2018 | Math, Optical Character Recognition (OCR) | Code Available | 0 |
| Solving Math Word Problems with Multi-Encoders and Multi-Decoders | Dec 1, 2020 | Decoder, Math | Code Available | 0 |
| CoinMath: Harnessing the Power of Coding Instruction for Math LLMs | Dec 16, 2024 | Descriptive, Math | Code Available | 0 |
| ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark | May 28, 2025 | Math | Code Available | 0 |
| Solving Math Word Problems with Reexamination | Oct 14, 2023 | Descriptive, Math | Code Available | 0 |