SOTAVerified

Math

Papers

Showing 351–400 of 1596 papers

Title | Status | Hype
Conic10K: A Challenging Math Problem Understanding and Reasoning Dataset | Code | 1
Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction | Code | 1
Let's Verify Math Questions Step by Step | Code | 1
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions | Code | 1
Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis | Code | 1
Learning Goal-Conditioned Representations for Language Reward Models | Code | 1
Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning | Code | 1
MathPrompter: Mathematical Reasoning using Large Language Models | Code | 1
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models | Code | 1
A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers | Code | 1
Dyve: Thinking Fast and Slow for Dynamic Process Verification | Code | 1
MathViz-E: A Case-study in Domain-Specialized Tool-Using Agents | Code | 1
Efficient Reasoning for LLMs through Speculative Chain-of-Thought | Code | 1
Learning Multi-Step Reasoning by Solving Arithmetic Tasks | Code | 1
MathGloss: Building mathematical glossaries from text | Code | 1
MWPToolkit: An Open-Source Framework for Deep Learning-Based Math Word Problem Solvers | Code | 1
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits | Code | 1
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models | Code | 1
Large Language Models Can Be Easily Distracted by Irrelevant Context | Code | 1
Memory-Efficient and Secure DNN Inference on TrustZone-enabled Consumer IoT Devices | Code | 1
Large (Vision) Language Models are Unsupervised In-Context Learners | Code | 1
Language Models Encode the Value of Numbers Linearly | Code | 1
Large Language Models Are Latent Variable Models: Explaining and Finding Good Demonstrations for In-Context Learning | Code | 1
Language Models as Science Tutors | Code | 1
Large Language Models Are Neurosymbolic Reasoners | Code | 1
A Symbolic Character-Aware Model for Solving Geometry Problems | Code | 1
Non-myopic Generation of Language Models for Reasoning and Planning | Code | 1
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges | Code | 1
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Code | 1
Design and implementation of an environment for Learning to Run a Power Network (L2RPN) | Code | 1
Kalman Filter Enhanced GRPO for Reinforcement Learning-Based Language Model Reasoning | Code | 1
FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains | Code | 1
JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding | Code | 1
CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning | Code | 1
Collective Constitutional AI: Aligning a Language Model with Public Input | Code | 1
A Categorical Archive of ChatGPT Failures | Code | 1
Injecting Numerical Reasoning Skills into Language Models | Code | 1
Implicit Chain of Thought Reasoning via Knowledge Distillation | Code | 1
How well do Large Language Models perform in Arithmetic tasks? | Code | 1
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning | Code | 1
Code-Vision: Evaluating Multimodal LLMs Logic Understanding and Code Generation Capabilities | Code | 1
Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models Aligned with Human Cognitive Principles | Code | 1
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization | Code | 1
How to Get Your LLM to Generate Challenging Problems for Evaluation | Code | 1
Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction | Code | 1
HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems | Code | 1
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Code | 1
Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning | Code | 1
Graph-to-Tree Learning for Solving Math Word Problems | Code | 1
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning | Code | 1

No leaderboard results yet.