SOTAVerified

Math

Papers

Showing 1151–1200 of 1596 papers

| Title | Status | Hype |
|---|---|---|
| RevOrder: A Novel Method for Enhanced Arithmetic in Language Models |  | 0 |
| Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision |  | 0 |
| Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation |  | 0 |
| Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With Errors |  | 0 |
| Large Language Models for Mathematical Reasoning: Progresses and Challenges |  | 0 |
| Efficient Tool Use with Chain-of-Abstraction Reasoning |  | 0 |
| Taxonomy of Mathematical Plagiarism | Code | 0 |
| GAPS: Geometry-Aware Problem Solver |  | 0 |
| YODA: Teacher-Student Progressive Learning for Language Models |  | 0 |
| Exploring Educational Equity: A Machine Learning Approach to Unravel Achievement Disparities in Georgia |  | 0 |
| Using Java Geometry Expert as Guide in the Preparations for Math Contests |  | 0 |
| Self-Imagine: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination |  | 0 |
| CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities |  | 0 |
| Cramer-Rao bound and absolute sensitivity in chemical reaction networks |  | 0 |
| Using Large Language Models to Assess Tutors' Performance in Reacting to Students Making Math Errors |  | 0 |
| Graph2Tac: Online Representation Learning of Formal Math Concepts |  | 0 |
| Mastery Guided Non-parametric Clustering to Scale-up Strategy Prediction |  | 0 |
| Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities |  | 0 |
| From Good to Great: Improving Math Reasoning with Tool-Augmented Interleaf Prompting |  | 0 |
| TinyGSM: achieving >80% on GSM8k with small language models |  | 0 |
| Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning |  | 0 |
| Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models |  | 0 |
| LaRS: Latent Reasoning Skills for Chain-of-Thought Reasoning |  | 0 |
| ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions | Code | 0 |
| REDS: Resource-Efficient Deep Subnetworks for Dynamic Resource Constraints |  | 0 |
| First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning |  | 0 |
| SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks |  | 0 |
| VerityMath: Advancing Mathematical Reasoning by Self-Verification Through Unit Consistency | Code | 0 |
| Large Language Models' Understanding of Math: Source Criticism and Extrapolation |  | 0 |
| Let's Reinforce Step by Step |  | 0 |
| Towards Interpretable Sequence Continuation: Analyzing Shared Circuits in Large Language Models | Code | 0 |
| Enhancing LLM Intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation |  | 0 |
| ATHENA: Mathematical Reasoning with Thought Expansion | Code | 0 |
| Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving | Code | 0 |
| Exploring the Reliability of Large Language Models as Customized Evaluators for Diverse NLP Tasks | Code | 0 |
| math-PVS: A Large Language Model Framework to Map Scientific Publications to PVS Theories |  | 0 |
| We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields | Code | 0 |
| SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Code | 0 |
| Let's reward step by step: Step-Level reward model as the Navigators for Reasoning |  | 0 |
| Improving Large Language Model Fine-tuning for Solving Math Problems |  | 0 |
| Solving Math Word Problems with Reexamination | Code | 0 |
| The Search-and-Mix Paradigm in Approximate Nash Equilibrium Algorithms |  | 0 |
| LLMs as Potential Brainstorming Partners for Math and Science Problems |  | 0 |
| Guiding Language Model Reasoning with Planning Tokens |  | 0 |
| Critique Ability of Large Language Models |  | 0 |
| Resprompt: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models |  | 0 |
| Analysis of the Reasoning with Redundant Information Provided Ability of Large Language Models |  | 0 |
| Concise and Organized Perception Facilitates Reasoning in Large Language Models |  | 0 |
| The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices | Code | 0 |
| Large Language Models as Analogical Reasoners |  | 0 |

No leaderboard results yet.