SOTAVerified

Math

Papers

Showing 1051-1075 of 1596 papers

Title | Status | Hype
MATHion: Solving Math Word Problems with Logically Consistent Problems | | 0
Towards Tractable Mathematical Reasoning: Challenges, Strategies, and Opportunities for Solving Math Word Problems | | 0
A Theme-Rewriting Approach for Generating Algebra Word Problems | | 0
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration | | 0
Towards Trustworthy AutoGrading of Short, Multi-lingual, Multi-type Answers | | 0
Atari games and Intel processors | | 0
Math Operation Embeddings for Open-ended Solution Analysis and Feedback | | 0
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations | | 0
MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection | | 0
Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition | | 0
math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories | | 0
MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms | | 0
Math Search for the Masses: Multimodal Search Interfaces and Appearance-Based Retrieval | | 0
MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education | | 0
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? | | 0
A Tag-based English Math Word Problem Solver with Understanding, Reasoning and Explanation | | 0
When Dimensionality Reduction Meets Graph (Drawing) Theory: Introducing a Common Framework, Challenges and Opportunities | | 0
When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs | | 0
Math Word Problem Generation with Mathematical Consistency and Problem Context Constraints | | 0
Matryoshka Model Learning for Improved Elastic Student Models | | 0
Asymptotic expression for the fixation probability of a mutant in star graphs | | 0
Maximizing Confidence Alone Improves Reasoning | | 0
MDIT: A Model-free Data Interpolation Method for Diverse Instruction Tuning | | 0
Training Large Language Models to Reason via EM Policy Gradient | | 0
Measurement to Meaning: A Validity-Centered Framework for AI Evaluation | | 0

No leaderboard results yet.