SOTAVerified|Agents Browse Leaderboard About

Math

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 201–225 of 1596 papers

Title	Date	Tasks	Status	Hype
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts	Feb 12, 2024	Continual PretrainingGSM8K	CodeCode Available	2
Can AI Assistants Know What They Don't Know?	Jan 24, 2024	MathOpen-Domain Question Answering	CodeCode Available	2
SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese	Jan 22, 2024	DiversityGSM8K	CodeCode Available	2
Tuning Language Models by Proxy	Jan 16, 2024	Domain AdaptationMath	CodeCode Available	2
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models	Jan 15, 2024	MathMathematical Reasoning	CodeCode Available	2
MathPile: A Billion-Token-Scale Pretraining Corpus for Math	Dec 28, 2023	Language IdentificationMath	CodeCode Available	2
YUAN 2.0: A Large Language Model with Localized Filtering-based Attention	Nov 27, 2023	Code GenerationLanguage Modeling	CodeCode Available	2
Meta Prompting for AI Systems	Nov 20, 2023	Data InteractionGSM8K	CodeCode Available	2
System 2 Attention (is something you might need too)	Nov 20, 2023	Math	CodeCode Available	2
Agent Lumos: Unified and Modular Training for Open-Source Language Agents	Nov 9, 2023	MathQuestion Answering	CodeCode Available	2
An Expression Tree Decoding Strategy for Mathematical Equation Generation	Oct 14, 2023	MathMathematical Reasoning	CodeCode Available	2
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning	Oct 9, 2023	Arithmetic ReasoningData Augmentation	CodeCode Available	2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models	Oct 6, 2023	Code GenerationDecision Making	CodeCode Available	2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning	Oct 5, 2023	Arithmetic ReasoningGSM8K	CodeCode Available	2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts	Oct 3, 2023	ChatbotImage Captioning	CodeCode Available	2
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs	Sep 22, 2023	Math	CodeCode Available	2
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models	Sep 21, 2023	Arithmetic ReasoningGSM8K	CodeCode Available	2
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning	Sep 11, 2023	MathMathematical Reasoning	CodeCode Available	2
GPT Can Solve Mathematical Problems Without a Calculator	Sep 6, 2023	Language ModelingLanguage Modelling	CodeCode Available	2
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification	Aug 15, 2023	Arithmetic ReasoningMath	CodeCode Available	2
Cumulative Reasoning with Large Language Models	Aug 8, 2023	Decision MakingLogical Reasoning	CodeCode Available	2
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities	Aug 4, 2023	MathMM-Vet	CodeCode Available	2
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models	Jun 27, 2023	Automated Theorem ProvingGPU	CodeCode Available	2
Progressive-Hint Prompting Improves Reasoning in Large Language Models	Apr 19, 2023	Arithmetic ReasoningGSM8K	CodeCode Available	2
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models	Apr 13, 2023	Decision MakingMath	CodeCode Available	2

Show:10 25 50

← PrevPage 9 of 64Next →

No leaderboard results yet.