SOTAVerified

Math

Papers

Showing 201–225 of 1596 papers

Title | Status | Hype
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Code | 2
Can AI Assistants Know What They Don't Know? | Code | 2
SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese | Code | 2
Tuning Language Models by Proxy | Code | 2
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models | Code | 2
MathPile: A Billion-Token-Scale Pretraining Corpus for Math | Code | 2
YUAN 2.0: A Large Language Model with Localized Filtering-based Attention | Code | 2
Meta Prompting for AI Systems | Code | 2
System 2 Attention (is something you might need too) | Code | 2
Agent Lumos: Unified and Modular Training for Open-Source Language Agents | Code | 2
An Expression Tree Decoding Strategy for Mathematical Equation Generation | Code | 2
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning | Code | 2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Code | 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning | Code | 2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Code | 2
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs | Code | 2
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models | Code | 2
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning | Code | 2
GPT Can Solve Mathematical Problems Without a Calculator | Code | 2
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities | Code | 2
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models | Code | 2
Progressive-Hint Prompting Improves Reasoning in Large Language Models | Code | 2
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | Code | 2

No leaderboard results yet.