| A Joint Probabilistic Classification Model of Relevant and Irrelevant Sentences in Mathematical Word Problems | Nov 21, 2014 | Classification, General Classification | —Unverified | 0 | 0 |
| Working memory capacity and gender | Mar 21, 2017 | Math | —Unverified | 0 | 0 |
| Accelerating Chain-of-Thought Reasoning: When Goal-Gradient Importance Meets Dynamic Skipping | May 13, 2025 | Domain Generalization, GSM8K | —Unverified | 0 | 0 |
| SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning | May 5, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays | Apr 24, 2023 | Math | —Unverified | 0 | 0 |
| Simplified Energy Landscape for Modularity Using Total Variation | Jul 28, 2017 | Math | —Unverified | 0 | 0 |
| Simulating LLM-to-LLM Tutoring for Multilingual Math Feedback | Jun 5, 2025 | Math | —Unverified | 0 | 0 |
| SIRI-Bench: Challenging VLMs' Spatial Intelligence through Complex Reasoning Tasks | Jun 17, 2025 | Math, Spatial Reasoning | —Unverified | 0 | 0 |
| VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks | Jul 17, 2025 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models | Aug 1, 2023 | In-Context Learning, Math | —Unverified | 0 | 0 |