SOTAVerified

Math

Papers

Showing 301-350 of 1596 papers

Title | Status | Hype
Automatic Generation of Socratic Subquestions for Teaching Math Word Problems | Code | 1
Mathematical Capabilities of ChatGPT | Code | 1
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations | Code | 1
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models | Code | 1
A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers | Code | 1
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models | Code | 1
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning | Code | 1
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models | Code | 1
MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education | Code | 1
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation | Code | 1
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models | Code | 1
LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks | Code | 1
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models | Code | 1
Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation | Code | 1
Let's Verify Math Questions Step by Step | Code | 1
LEVER: Learning to Verify Language-to-Code Generation with Execution | Code | 1
AutoBencher: Creating Salient, Novel, Difficult Datasets for Language Models | Code | 1
Augmenting Math Word Problems via Iterative Question Composing | Code | 1
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Code | 1
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs | Code | 1
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1
Learning by Fixing: Solving Math Word Problems with Weak Supervision | Code | 1
Learning Goal-Conditioned Representations for Language Reward Models | Code | 1
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | Code | 1
A Tree-Structured Decoder for Image-to-Markup Generation | Code | 1
DICE: Detecting In-distribution Contamination in LLM's Fine-tuning Phase for Math Reasoning | Code | 1
Large (Vision) Language Models are Unsupervised In-Context Learners | Code | 1
Learning Multi-Step Reasoning by Solving Arithmetic Tasks | Code | 1
Language Models Encode the Value of Numbers Linearly | Code | 1
A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings | Code | 1
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning | Code | 1
Conic10K: A Challenging Math Problem Understanding and Reasoning Dataset | Code | 1
Control LLM: Controlled Evolution for Intelligence Retention in LLM | Code | 1
Learning From Mistakes Makes LLM Better Reasoner | Code | 1
Language Models as Science Tutors | Code | 1
Large Language Models Are Neurosymbolic Reasoners | Code | 1
Non-myopic Generation of Language Models for Reasoning and Planning | Code | 1
Design of Chain-of-Thought in Math Problem Solving | Code | 1
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models | Code | 1
CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis | Code | 1
Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency | Code | 1
Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations | Code | 1
Large Language Models Can Be Easily Distracted by Irrelevant Context | Code | 1
Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction | Code | 1
MathViz-E: A Case-study in Domain-Specialized Tool-Using Agents | Code | 1
Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction | Code | 1
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models | Code | 1
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability | Code | 1
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models | Code | 1
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding | Code | 1

No leaderboard results yet.