SOTAVerified

Math

Papers

Showing 751–775 of 1596 papers

Title | Status | Hype
Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning | Code | 0
Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning | Code | 0
Leveraging Web-Crawled Data for High-Quality Fine-Tuning | Code | 0
Can We Use Small Models to Investigate Multimodal Fusion Methods? | Code | 0
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process | Code | 0
Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification | Code | 0
Can Vision-Language Models Evaluate Handwritten Math? | Code | 0
AI-Assisted Generation of Difficult Math Questions | Code | 0
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning | Code | 0
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Code | 0
OntoMath^PRO Ontology: A Linked Data Hub for Mathematics | Code | 0
Examining the Robustness of Large Language Models across Language Complexity | - | 0
Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | - | 0
Can Stories Help LLMs Reason? Curating Information Space Through Narrative | - | 0
Evolving LLMs' Self-Refinement Capability via Iterative Preference Optimization | - | 0
Can LLMs understand Math? -- Exploring the Pitfalls in Mathematical Reasoning | - | 0
A range characterization of the single-quadrant ADRT | - | 0
EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages | - | 0
AI4Math: A Native Spanish Benchmark for University-Level Mathematical Reasoning in Large Language Models | - | 0
Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning | - | 0
Evaluating Robustness of Reward Models for Mathematical Reasoning | - | 0
Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation | - | 0
Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics | - | 0
Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams | - | 0
A Graph-Based Synthetic Data Pipeline for Scaling High-Quality Reasoning Instructions | - | 0

No leaderboard results yet.