SOTAVerified

Math

Papers

Showing 451–475 of 1596 papers

Title | Status | Hype
Pretrained Language Models are Symbolic Mathematics Solvers too! | Code | 1
Reasoning with Reinforced Functional Token Tuning | Code | 1
MathViz-E: A Case-study in Domain-Specialized Tool-Using Agents | Code | 1
Eliciting Latent Knowledge from Quirky Language Models | Code | 1
PECC: Problem Extraction and Coding Challenges | Code | 1
Eliminating Position Bias of Language Models: A Mechanistic Approach | Code | 1
Explaining Datasets in Words: Statistical Models with Natural Language Parameters | Code | 1
Aioli: A Unified Optimization Framework for Language Model Data Mixing | Code | 1
CityGPT: Empowering Urban Spatial Cognition of Large Language Models | Code | 1
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents | Code | 1
Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations | Code | 1
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective | Code | 1
EXAONE Deep: Reasoning Enhanced Language Models | Code | 1
Over-Reasoning and Redundant Calculation of Large Language Models | Code | 1
ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models | Code | 1
ArMATH: a Dataset for Solving Arabic Math Word Problems | Code | 1
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping | Code | 1
Ape210K: A Large-Scale and Template-Rich Dataset of Math Word Problems | Code | 1
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics | Code | 1
Mathfish: Evaluating Language Model Math Reasoning via Grounding in Educational Curricula | Code | 1
Expression Syntax Information Bottleneck for Math Word Problems | Code | 1
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament | Code | 1
Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts | Code | 1
OJBench: A Competition Level Code Benchmark For Large Language Models | Code | 1
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation | Code | 1

No leaderboard results yet.