SOTAVerified

Math

Papers

Showing 1451–1500 of 1596 papers

Title | Status | Hype
Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval | Code | 0
MAMUT: A Novel Framework for Modifying Mathematical Formulas for the Generation of Specialized Datasets for Language Model Training | Code | 0
Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models | Code | 0
Effects of structure on reasoning in instance-level Self-Discover | Code | 0
Mapping to Declarative Knowledge for Word Problem Solving | Code | 0
NUMCoT: Numerals and Units of Measurement in Chain-of-Thought Reasoning using Large Language Models | Code | 0
MARGE: Improving Math Reasoning for LLMs with Guided Exploration | Code | 0
Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior | Code | 0
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators | Code | 0
Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing | Code | 0
Efficient Non-Parametric Optimizer Search for Diverse Tasks | Code | 0
Heteroclinic cycling and extinction in May-Leonard models with demographic stochasticity | Code | 0
Deterministic and Nondeterministic Particle Motion with Interaction Mechanisms | Code | 0
ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving | Code | 0
LM^2: A Simple Society of Language Models Solves Complex Reasoning | Code | 0
AALC: Large Language Model Efficient Reasoning via Adaptive Accuracy-Length Control | Code | 0
Textual Enhanced Contrastive Learning for Solving Math Word Problems | Code | 0
ReCUT: Balancing Reasoning Length and Accuracy in LLMs via Stepwise Trails and Preference Optimization | Code | 0
How Do Humans Write Code? Large Models Do It the Same Way Too | Code | 0
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls | Code | 0
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark | Code | 0
How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study | Code | 0
World Models for Math Story Problems | Code | 0
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks | Code | 0
ChatBench: From Static Benchmarks to Human-AI Evaluation | Code | 0
Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks | Code | 0
When an LLM is apprehensive about its answers -- and when its uncertainty is justified | Code | 0
Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? | Code | 0
Skellam Mixture Mechanism: a Novel Approach to Federated Learning with Differential Privacy | Code | 0
Classifying Math KCs via Task-Adaptive Pre-Trained BERT | Code | 0
Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving | Code | 0
ATHENA: Mathematical Reasoning with Thought Expansion | Code | 0
DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction | Code | 0
Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation | Code | 0
Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure | Code | 0
Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings | Code | 0
Analysis of Optimization Algorithms via Sum-of-Squares | Code | 0
Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning | Code | 0
Improving Compositional Generalization in Math Word Problem Solving | Code | 0
Mathematical Reasoning in Large Language Models: Assessing Logical and Arithmetic Errors across Wide Numerical Ranges | Code | 0
Mathematics Content Understanding for Cyberlearning via Formula Evolution Map | Code | 0
Analogical Math Word Problems Solving with Enhanced Problem-Solution Association | Code | 0
Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying | Code | 0
Small Language Models Need Strong Verifiers to Self-Correct Reasoning | Code | 0
SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models | Code | 0
OntoMath^PRO Ontology: A Linked Data Hub for Mathematics | Code | 0
Enumerate-Conjecture-Prove: Formally Solving Answer-Construction Problems in Math Competitions | Code | 0
In-Context Principle Learning from Mistakes | Code | 0
Incorporating Graph Attention Mechanism into Geometric Problem Solving Based on Deep Reinforcement Learning | Code | 0
Smart Vision-Language Reasoners | Code | 0

No leaderboard results yet.