SOTAVerified

Math

Papers

Showing 601–650 of 1596 papers

Title | Status | Hype
Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing | Code | 0
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | Code | 0
Decomposing Elements of Problem Solving: What "Math" Does RL Teach? | Code | 0
Analysis of Optimization Algorithms via Sum-of-Squares | Code | 0
OntoMath^PRO Ontology: A Linked Data Hub for Mathematics | Code | 0
Automatic Generation of Headlines for Online Math Questions | Code | 0
Analogical Math Word Problems Solving with Enhanced Problem-Solution Association | Code | 0
HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization | Code | 0
NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models | Code | 0
Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning | Code | 0
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Code | 0
Neural Machine Translation and Sequence-to-sequence Models: A Tutorial | Code | 0
A mixed policy to improve performance of language models on math problems | Code | 0
Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying | Code | 0
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing | Code | 0
A Meaning-based Statistical English Math Word Problem Solver | Code | 0
Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition | Code | 0
Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks | Code | 0
Modeling Intra-Relation in Math Word Problems with Different Functional Multi-Head Attentions | Code | 0
Continual Pre-training of Language Models for Math Problem Understanding with Syntax-Aware Memory Network | Code | 0
MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs | Code | 0
Mixed-R1: Unified Reward Perspective For Reasoning Capability in Multimodal Large Language Models | Code | 0
MMATH: A Multilingual Benchmark for Mathematical Reasoning | Code | 0
More is More: Addition Bias in Large Language Models | Code | 0
ATHENA: Mathematical Reasoning with Thought Expansion | Code | 0
Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia | Code | 0
MIRB: Mathematical Information Retrieval Benchmark | Code | 0
Meta-Reasoning Improves Tool Use in Large Language Models | Code | 0
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study | Code | 0
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark | Code | 0
metboost: Exploratory regression analysis with hierarchically clustered data | Code | 0
How Do Humans Write Code? Large Models Do It the Same Way Too | Code | 0
ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models | Code | 0
Misplaced Trust: Measuring the Interference of Machine Learning in Human Decision-Making | Code | 0
mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models | Code | 0
MAWPS: A Math Word Problem Repository | Code | 0
Heteroclinic cycling and extinction in May-Leonard models with demographic stochasticity | Code | 0
ComSearch: Equation Searching with Combinatorial Strategy for Solving Math Word Problems with Weak Supervision | Code | 0
Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | Code | 0
Algebra Error Classification with Large Language Models | Code | 0
Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior | Code | 0
ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark | Code | 0
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning | Code | 0
Computationally Identifying Funneling and Focusing Questions in Classroom Discourse | Code | 0
MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Code | 0
Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models | Code | 0
Compositional Processing Emerges in Neural Networks Solving Math Problems | Code | 0
MathScale: Scaling Instruction Tuning for Mathematical Reasoning | Code | 0
HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class | Code | 0
Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction | Code | 0

No leaderboard results yet.