| Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination | Jun 10, 2023 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting | Sep 30, 2023 | Math | —Unverified | 0 | 0 |
| Thinking Outside the (Gray) Box: A Context-Based Score for Assessing Value and Originality in Neural Text Generation | Feb 18, 2025 | Diversity, Math | —Unverified | 0 | 0 |
| IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations | Apr 1, 2024 | Benchmarking, Math | —Unverified | 0 | 0 |
| Solving Functional Optimization with Deep Networks and Variational Principles | Oct 8, 2024 | Math | —Unverified | 0 | 0 |
| Is your LLM trapped in a Mental Set? Investigative study on how mental sets affect the reasoning capabilities of LLMs | Jan 21, 2025 | GSM8K, In-Context Learning | —Unverified | 0 | 0 |
| Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist | Jul 11, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Iterative Reasoning Preference Optimization | Apr 30, 2024 | ARC, GSM8K | —Unverified | 0 | 0 |
| Yi-Lightning Technical Report | Dec 2, 2024 | Chatbot, Large Language Model | —Unverified | 0 | 0 |
| Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models | Jun 16, 2025 | Math, reinforcement-learning | —Unverified | 0 | 0 |