SOTAVerified

Math

Papers

Showing 876-900 of 1596 papers

Title | Status | Hype
Reverse Thinking Makes LLMs Stronger Reasoners | - | 0
Mars-PO: Multi-Agent Reasoning System Preference Optimization | - | 0
A Lean Dataset for International Math Olympiad: Small Steps towards Writing Math Proofs for Hard Problems | - | 0
Embracing AI in Education: Understanding the Surge in Large Language Model Use by Secondary Students | - | 0
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS | Code | 0
Learning by Analogy: Enhancing Few-Shot Prompting for Math Word Problem Solving with Computational Graph-Based Retrieval | - | 0
Unraveling Arithmetic in Large Language Models: The Role of Algebraic Structures | - | 0
Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training | - | 0
MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs | Code | 0
RESOLVE: Relational Reasoning with Symbolic and Object-Level Features Using Vector Symbolic Processing | Code | 0
OpenAI-o1 AB Testing: Does the o1 model really do good reasoning in math problem solving? | - | 0
VISTA: Visual Integrated System for Tailored Automation in Math Problem Generation Using LLM | - | 0
Meta-Reasoning Improves Tool Use in Large Language Models | Code | 0
Evaluating GPT-4 at Grading Handwritten Solutions in Math Exams | - | 0
Self-Consistency Preference Optimization | - | 0
Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology | - | 0
Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification | Code | 0
Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models | - | 0
STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing | - | 0
DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models | - | 0
Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses | - | 0
Improving Math Problem Solving in Large Language Models Through Categorization and Strategy Tailoring | - | 0
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation | - | 0
Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks? | Code | 0
Library Learning Doesn't: The Curious Case of the Single-Use "Library" | Code | 0

No leaderboard results yet.