Math

Papers

Showing 401–425 of 1596 papers

Title	Date	Tasks	Status	Hype
Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions	Jan 17, 2024	Arithmetic Reasoning, Code Generation	Code Available	1
Large Language Models Are Neurosymbolic Reasoners	Jan 17, 2024	Common Sense Reasoning, Math	Code Available	1
Augmenting Math Word Problems via Iterative Question Composing	Jan 17, 2024	Math, Mathematical Reasoning	Code Available	1
The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models	Jan 11, 2024	Math, Multiple-choice	Code Available	1
Language Models Encode the Value of Numbers Linearly	Jan 8, 2024	Language Modeling, Language Modelling	Code Available	1
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation	Dec 28, 2023	GSM8K, Language Model Evaluation	Code Available	1
An In-depth Look at Gemini's Language Abilities	Dec 18, 2023	Instruction Following, Math	Code Available	1
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent	Dec 14, 2023	Language Modeling, Language Modelling	Code Available	1
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations	Dec 14, 2023	Arithmetic Reasoning, GSM8K	Code Available	1
Get an A in Math: Progressive Rectification Prompting	Dec 11, 2023	Math	Code Available	1
Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers	Dec 7, 2023	Math, Multiple-choice	Code Available	1
Eliciting Latent Knowledge from Quirky Language Models	Dec 2, 2023	Anomaly Detection, Math	Code Available	1
MathGloss: Building mathematical glossaries from text	Nov 21, 2023	Math	Code Available	1
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Long and Specialized Documents	Nov 16, 2023	Math	Code Available	1
FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains	Nov 16, 2023	Math, Math Word Problem Solving	Code Available	1
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving	Nov 15, 2023	Math	Code Available	1
Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration	Nov 14, 2023	Diversity, Math	Code Available	1
Conic10K: A Challenging Math Problem Understanding and Reasoning Dataset	Nov 9, 2023	Math, Natural Language Understanding	Code Available	1
Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs	Nov 8, 2023	Fairness, Math	Code Available	1
Implicit Chain of Thought Reasoning via Knowledge Distillation	Nov 2, 2023	Knowledge Distillation, Math	Code Available	1
Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations	Oct 31, 2023	GSM8K, Math	Code Available	1
Learning From Mistakes Makes LLM Better Reasoner	Oct 31, 2023	GSM8K, Math	Code Available	1
An Early Evaluation of GPT-4V(ision)	Oct 25, 2023	Math	Code Available	1
Expression Syntax Information Bottleneck for Math Word Problems	Oct 24, 2023	Math	Code Available	1
Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts	Oct 23, 2023	Logical Reasoning, Math	Code Available	1


