SOTAVerified

Math

Papers

Showing 1351-1400 of 1596 papers

Title | Status | Hype
Translating Math Formula Images to LaTeX Sequences Using Deep Neural Networks with Sequence-level Training | Code | 0
DIVE: Diversified Iterative Self-Improvement | Code | 0
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems | Code | 0
Modeling Intra-Relation in Math Word Problems with Different Functional Multi-Head Attentions | Code | 0
Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency | Code | 0
EquivPruner: Boosting Efficiency and Quality in LLM-Based Search via Action Pruning | Code | 0
ComSearch: Equation Searching with Combinatorial Strategy for Solving Math Word Problems with Weak Supervision | Code | 0
Seeking Diverse Reasoning Logic: Controlled Equation Expression Generation for Solving Math Word Problems | Code | 0
Towards Infinite-Long Prefix in Transformer | Code | 0
An Independent Evaluation of ChatGPT on Mathematical Word Problems (MWP) | Code | 0
Faithful Chain-of-Thought Reasoning | Code | 0
Techniques to Improve Neural Math Word Problem Solvers | Code | 0
DyRRen: A Dynamic Retriever-Reranker-Generator Model for Numerical Reasoning over Tabular and Textual Data | Code | 0
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process | Code | 0
More is More: Addition Bias in Large Language Models | Code | 0
SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Code | 0
Decomposing Elements of Problem Solving: What "Math" Does RL Teach? | Code | 0
A Goal-Driven Tree-Structured Neural Model for Math Word Problems | Code | 0
Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems | Code | 0
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation | Code | 0
Prover-Verifier Games improve legibility of LLM outputs | Code | 0
Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models | Code | 0
ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models | Code | 0
FINNger -- Applying artificial intelligence to ease math learning for children | Code | 0
Multi-Scale Attention with Dense Encoder for Handwritten Mathematical Expression Recognition | Code | 0
Leveraging Label Semantics and Meta-Label Refinement for Multi-Label Question Classification | Code | 0
PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | Code | 0
An algorithm to represent inbreeding trees | Code | 0
What Makes Math Word Problems Challenging for LLMs? | Code | 0
Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning | Code | 0
AgentSwift: Efficient LLM Agent Design via Value-guided Hierarchical Search | Code | 0
Leveraging Web-Crawled Data for High-Quality Fine-Tuning | Code | 0
StepMathAgent: A Step-Wise Agent for Evaluating Mathematical Processes through Tree-of-Error | Code | 0
Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models | Code | 0
Library Learning Doesn't: The Curious Case of the Single-Use "Library" | Code | 0
AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels | Code | 0
From Euler to AI: Unifying Formulas for Mathematical Constants | Code | 0
A safety realignment framework via subspace-oriented model fusion for large language models | Code | 0
TreeRPO: Tree Relative Policy Optimization | Code | 0
A large language model-assisted education tool to provide feedback on open-ended responses | Code | 0
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning | Code | 0
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions | Code | 0
Automatic Short Math Answer Grading via In-context Meta-learning | Code | 0
The Matrix Calculus You Need For Deep Learning | Code | 0
An extrapolated and provably convergent algorithm for nonlinear matrix decomposition with the ReLU function | Code | 0
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors | Code | 0
Llama SLayer 8B: Shallow Layers Hold the Key to Knowledge Injection | Code | 0
Taxonomy of Mathematical Plagiarism | Code | 0
LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation | Code | 0
GATE: Graph-based Adaptive Tool Evolution Across Diverse Tasks | Code | 0
Page 28 of 32

No leaderboard results yet.