Math

Papers

Showing 976–1000 of 1596 papers

Title	Date	Tasks	Status	Hype
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision	Feb 5, 2024	GSM8K, Math	Unverified	0
Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation	Feb 4, 2024	Hallucination, Math	Unverified	0
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models	Feb 2, 2024	Language Modelling, Large Language Model	Code Available	1
Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With Errors	Feb 2, 2024	Math	Unverified	0
Large Language Models for Mathematical Reasoning: Progresses and Challenges	Jan 31, 2024	Diversity, Math	Unverified	0
Efficient Tool Use with Chain-of-Abstraction Reasoning	Jan 30, 2024	Math, Mathematical Reasoning	Unverified	0
Taxonomy of Mathematical Plagiarism	Jan 30, 2024	Math, Question Answering	Code Available	0
ReGAL: Refactoring Programs to Discover Generalizable Abstractions	Jan 29, 2024	Date Understanding, Math	Code Available	1
GAPS: Geometry-Aware Problem Solver	Jan 29, 2024	Geometry Problem Solving, Math	Unverified	0
YODA: Teacher-Student Progressive Learning for Language Models	Jan 28, 2024	GSM8K, Math	Unverified	0
Exploring Educational Equity: A Machine Learning Approach to Unravel Achievement Disparities in Georgia	Jan 25, 2024	Math	Unverified	0
Can AI Assistants Know What They Don't Know?	Jan 24, 2024	Math, Open-Domain Question Answering	Code Available	2
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks	Jan 23, 2024	Math, Question Answering	Code Available	1
Using Java Geometry Expert as Guide in the Preparations for Math Contests	Jan 22, 2024	Math	Unverified	0
SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese	Jan 22, 2024	Diversity, GSM8K	Code Available	2
Over-Reasoning and Redundant Calculation of Large Language Models	Jan 21, 2024	GSM8K, Math	Code Available	1
Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning	Jan 19, 2024	GSM8K, Math	Code Available	1
Augmenting Math Word Problems via Iterative Question Composing	Jan 17, 2024	Math, Mathematical Reasoning	Code Available	1
Large Language Models Are Neurosymbolic Reasoners	Jan 17, 2024	Common Sense Reasoning, Math	Code Available	1
ReFT: Reasoning with Reinforced Fine-Tuning	Jan 17, 2024	GSM8K, Math	Code Available	4
Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions	Jan 17, 2024	Arithmetic Reasoning, Code Generation	Code Available	1
Tuning Language Models by Proxy	Jan 16, 2024	Domain Adaptation, Math	Code Available	2
Self-Imagine: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination	Jan 16, 2024	GSM8K, Language Modeling	Unverified	0
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline	Jan 16, 2024	GSM8K, Math	Code Available	3
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models	Jan 15, 2024	Math, Mathematical Reasoning	Code Available	2

