SOTAVerified

Math

Papers

Showing 276-300 of 1596 papers

Title | Status | Hype
MathPrompter: Mathematical Reasoning using Large Language Models | Code | 1
Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers | Code | 1
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind | Code | 1
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning | Code | 1
FINEREASON: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving | Code | 1
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Code | 1
FormulaNet: A Benchmark Dataset for Mathematical Formula Detection | Code | 1
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation | Code | 1
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations | Code | 1
Math Word Problem Solving with Explicit Numerical Values | Code | 1
A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level | Code | 1
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems | Code | 1
Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start | Code | 1
Expression Syntax Information Bottleneck for Math Word Problems | Code | 1
Mathematical Capabilities of ChatGPT | Code | 1
OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling | Code | 1
Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT | Code | 1
MathChat: Converse to Tackle Challenging Math Problems with LLM Agents | Code | 1
MathGloss: Building mathematical glossaries from text | Code | 1
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search | Code | 1
FELM: Benchmarking Factuality Evaluation of Large Language Models | Code | 1
EXAONE Deep: Reasoning Enhanced Language Models | Code | 1
Explaining Datasets in Words: Statistical Models with Natural Language Parameters | Code | 1
An Early Evaluation of GPT-4V(ision) | Code | 1
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective | Code | 1

No leaderboard results yet.