Math

Papers

Showing 1001–1025 of 1596 papers

Title	Date	Tasks	Status	Hype
Cramer-Rao bound and absolute sensitivity in chemical reaction networks	Jan 13, 2024	Math, Sensitivity	Unverified	0
CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities	Jan 13, 2024	Math, Mathematical Reasoning	Unverified	0
The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models	Jan 11, 2024	Math, Multiple-choice	Code Available	1
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation	Jan 9, 2024	GPU, Math	Code Available	3
Language Models Encode the Value of Numbers Linearly	Jan 8, 2024	Language Modeling, Language Modelling	Code Available	1
Using Large Language Models to Assess Tutors' Performance in Reacting to Students Making Math Errors	Jan 6, 2024	Math	Unverified	0
Graph2Tac: Online Representation Learning of Formal Math Concepts	Jan 5, 2024	AI Agent, Automated Theorem Proving	Unverified	0
Mastery Guided Non-parametric Clustering to Scale-up Strategy Prediction	Jan 4, 2024	Clustering, Fairness	Unverified	0
LLaMA Pro: Progressive LLaMA with Block Expansion	Jan 4, 2024	Instruction Following, Math	Code Available	4
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation	Dec 28, 2023	GSM8K, Language Model Evaluation	Code Available	1
MathPile: A Billion-Token-Scale Pretraining Corpus for Math	Dec 28, 2023	Language Identification, Math	Code Available	2
Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities	Dec 22, 2023	Chatbot, GSM8K	Unverified	0
From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting	Dec 18, 2023	Diversity, GSM8K	Unverified	0
An In-depth Look at Gemini's Language Abilities	Dec 18, 2023	Instruction Following, Math	Code Available	1
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent	Dec 14, 2023	Language Modeling, Language Modelling	Code Available	1
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations	Dec 14, 2023	Arithmetic Reasoning, GSM8K	Code Available	1
TinyGSM: achieving >80% on GSM8k with small language models	Dec 14, 2023	Arithmetic Reasoning, GSM8K	Unverified	0
Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning	Dec 14, 2023	Arithmetic Reasoning, Few-Shot Learning	Unverified	0
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models	Dec 11, 2023	Diversity, Math	Unverified	0
Get an A in Math: Progressive Rectification Prompting	Dec 11, 2023	Math	Code Available	1
LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning	Dec 7, 2023	In-Context Learning, Math	Unverified	0
Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers	Dec 7, 2023	Math, Multiple-choice	Code Available	1
ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions	Dec 4, 2023	Arithmetic Reasoning, Math	Code Available	0
Eliciting Latent Knowledge from Quirky Language Models	Dec 2, 2023	Anomaly Detection, Math	Code Available	1
YUAN 2.0: A Large Language Model with Localized Filtering-based Attention	Nov 27, 2023	Code Generation, Language Modeling	Code Available	2