SOTAVerified

Math

Papers

Showing 401-425 of 1596 papers

Title | Status | Hype
Collective Constitutional AI: Aligning a Language Model with Public Input | Code | 1
A Categorical Archive of ChatGPT Failures | Code | 1
NeMo-Inspector: A Visualization Tool for LLM Generation Analysis | Code | 1
NLPBench: Evaluating Large Language Models on Solving NLP Problems | Code | 1
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data | Code | 1
A Relation Spectrum Inheriting Taylor Series: Muscle Synergy and Coupling for Hand | Code | 1
MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving | Code | 1
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities | Code | 1
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective | Code | 1
Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula | Code | 1
Multiple-Choice Questions are Efficient and Robust LLM Evaluators | Code | 1
MWPToolkit: An Open-Source Framework for Deep Learning-Based Math Word Problem Solvers | Code | 1
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent | Code | 1
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents | Code | 1
Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning | Code | 1
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees | Code | 1
EXAONE Deep: Reasoning Enhanced Language Models | Code | 1
Entropy-Regularized Process Reward Model | Code | 1
Entropy-Based Adaptive Weighting for Self-Training | Code | 1
CLEVR-Math: A Dataset for Compositional Language, Visual and Mathematical Reasoning | Code | 1
Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning | Code | 1
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges | Code | 1
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Code | 1
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning | Code | 1
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs | Code | 1

No leaderboard results yet.