| The Invalsi Benchmarks: measuring Linguistic and Mathematical understanding of Large Language Models in Italian | Mar 27, 2024 | Language Modelling, Math | —Unverified | 0 | 0 |
| Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models | Jun 16, 2025 | Math | —Unverified | 0 | 0 |
| First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning | Nov 14, 2023 | GSM8K, Math | —Unverified | 0 | 0 |
| Fixation probabilities for the Moran process in evolutionary games with two strategies: graph shapes and large population asymptotics | Apr 30, 2018 | Math | —Unverified | 0 | 0 |
| Fixation probabilities for the Moran process with three or more strategies: general and coupling results | Nov 23, 2018 | Math | —Unverified | 0 | 0 |
| Building Math Agents with Multi-Turn Iterative Preference Learning | Sep 4, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration | Oct 22, 2024 | Math | —Unverified | 0 | 0 |
| The Logic of Political Survival Revisited: Consequences of Elite Uncertainty Under Authoritarian Rule | Aug 4, 2024 | Math | —Unverified | 0 | 0 |
| Formal Mathematical Reasoning: A New Frontier in AI | Dec 20, 2024 | Automated Theorem Proving, Math | —Unverified | 0 | 0 |
| The Long-Term Effects of Teachers' Gender Stereotypes | Dec 16, 2022 | Math | —Unverified | 0 | 0 |
| fPLSA: Learning Semantic Structures in Document Collections Using Foundation Models | Oct 7, 2024 | Math | —Unverified | 0 | 0 |
| FRACTAL: Fine-Grained Scoring from Aggregate Text Labels | Apr 7, 2024 | Math, Multiple Instance Learning | —Unverified | 0 | 0 |
| BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning | Jan 31, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems | Oct 24, 2024 | Benchmarking, Common Sense Reasoning | —Unverified | 0 | 0 |
| From fixation probabilities to d-player games: an inverse problem in evolutionary dynamics | Nov 20, 2018 | Math, Unity | —Unverified | 0 | 0 |
| The Mathematics of Market Timing | Dec 13, 2017 | Math | —Unverified | 0 | 0 |
| From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting | Dec 18, 2023 | Diversity, GSM8K | —Unverified | 0 | 0 |
| From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision | Mar 21, 2024 | Math | —Unverified | 0 | 0 |
| From Textbooks to Knowledge: A Case Study in Harvesting Axiomatic Knowledge from Textbooks to Solve Geometry Problems | Sep 1, 2017 | Math, Question Answering | —Unverified | 0 | 0 |
| From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics | Mar 10, 2025 | Math, Question Answering | —Unverified | 0 | 0 |
| Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens | Oct 18, 2024 | Math, Question Answering | —Unverified | 0 | 0 |
| Bridging Offline and Online Reinforcement Learning for LLMs | Jun 26, 2025 | Instruction Following, Math | —Unverified | 0 | 0 |
| Breaking Ties: Regression Discontinuity Design Meets Market Design | Dec 31, 2020 | Math, regression | —Unverified | 0 | 0 |
| Gamifying Math Education using Object Detection | Apr 13, 2023 | Math, Object | —Unverified | 0 | 0 |
| GAPS: Geometry-Aware Problem Solver | Jan 29, 2024 | Geometry Problem Solving, Math | —Unverified | 0 | 0 |