SOTAVerified

Math

Papers

Showing 951–1000 of 1596 papers

Title | Status | Hype
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic | Code | 2
Reformatted Alignment | Code | 2
LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks |  | 0
Orca-Math: Unlocking the potential of SLMs in Grade School Math |  | 0
Language Models as Science Tutors | Code | 1
Language Models with Conformal Factuality Guarantees |  | 0
Mathematical Opportunities in Digital Twins (MATH-DT) |  | 0
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Code | 4
GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving | Code | 1
AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails | Code | 0
Towards better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications |  | 0
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data | Code | 1
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements |  | 0
EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages |  | 0
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models | Code | 3
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Code | 2
Understanding the Progression of Educational Topics via Semantic Matching |  | 0
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning | Code | 4
V-STaR: Training Verifiers for Self-Taught Reasoners |  | 0
Noise Contrastive Alignment of Language Models with Explicit Rewards | Code | 3
In-Context Principle Learning from Mistakes | Code | 0
Self-Discover: Large Language Models Self-Compose Reasoning Structures | Code | 3
RevOrder: A Novel Method for Enhanced Arithmetic in Language Models |  | 0
Understanding Reasoning Ability of Language Models From the Perspective of Reasoning Paths Aggregation | Code | 1
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models | Code | 9
Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision |  | 0
Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation |  | 0
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models | Code | 1
Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With Errors |  | 0
Large Language Models for Mathematical Reasoning: Progresses and Challenges |  | 0
Efficient Tool Use with Chain-of-Abstraction Reasoning |  | 0
Taxonomy of Mathematical Plagiarism | Code | 0
ReGAL: Refactoring Programs to Discover Generalizable Abstractions | Code | 1
GAPS: Geometry-Aware Problem Solver |  | 0
YODA: Teacher-Student Progressive Learning for Language Models |  | 0
Exploring Educational Equity: A Machine Learning Approach to Unravel Achievement Disparities in Georgia |  | 0
Can AI Assistants Know What They Don't Know? | Code | 2
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks | Code | 1
Using Java Geometry Expert as Guide in the Preparations for Math Contests |  | 0
SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese | Code | 2
Over-Reasoning and Redundant Calculation of Large Language Models | Code | 1
Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning | Code | 1
Augmenting Math Word Problems via Iterative Question Composing | Code | 1
Large Language Models Are Neurosymbolic Reasoners | Code | 1
ReFT: Reasoning with Reinforced Fine-Tuning | Code | 4
Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions | Code | 1
Tuning Language Models by Proxy | Code | 2
Self-Imagine: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination |  | 0
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline | Code | 3
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models | Code | 2
Page 20 of 32

No leaderboard results yet.