SOTAVerified

Math

Papers

Showing 1151-1200 of 1596 papers

Title | Status | Hype
Interpretable Math Word Problem Solution Generation Via Step-by-step Planning | | 0
Quantitative Methods for Optimizing Patient Outcomes in Liver Transplantation | | 0
Let's Verify Step by Step | Code | 4
Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard | | 0
Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning | Code | 0
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective | | 0
Emergent inabilities? Inverse scaling over the course of pretraining | | 0
GRACE: Discriminator-Guided Chain-of-Thought Reasoning | Code | 1
Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction | Code | 0
Reasoning with Language Model is Planning with World Model | Code | 4
Unlocking Temporal Question Answering for Large Language Models with Tailor-Made Reasoning Logic | Code | 0
The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models | Code | 1
RSRM: Reinforcement Symbolic Regression Machine | | 0
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems | Code | 1
RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning | Code | 1
ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models | Code | 1
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models | Code | 1
Cognitive network science reveals bias in GPT-3, ChatGPT, and GPT-4 mirroring math anxiety in high-school students | | 0
Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation | | 0
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate | | 0
TEIMMA: The First Content Reuse Annotator for Text, Images, and Math | Code | 0
TheoremQA: A Theorem-driven Question Answering dataset | Code | 1
Hint of Thought prompting: an explainable and zero-shot approach to reasoning tasks with LLMs | | 0
A quantitative study of NLP approaches to question difficulty estimation | Code | 0
Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency | Code | 0
CodeT5+: Open Code Large Language Models for Code Understanding and Generation | Code | 0
Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces | | 0
Algebra Error Classification with Large Language Models | Code | 0
Non-Autoregressive Math Word Problem Solver with Unified Tree Structure | Code | 1
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models | Code | 7
AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays | | 0
Who's the Best Detective? LLMs vs. MLs in Detecting Incoherent Fourth Grade Math Answers | | 0
Progressive-Hint Prompting Improves Reasoning in Large Language Models | Code | 2
Enhancing Textbooks with Visuals from the Web for Improved Learning | Code | 0
Metric-agnostic Ranking Optimization | | 0
What Makes a Good Dataset for Symbol Description Reading? | | 0
Solving Math Word Problems by Combining Language Models With Symbolic Solvers | Code | 1
Gamifying Math Education using Object Detection | | 0
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | Code | 2
Reinforcement Learning Tutor Better Supported Lower Performers in a Math Task | | 0
From Zero to Hero: Convincing with Extremely Complicated Math | Code | 1
Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases | | 0
Reliable and Efficient Evaluation of Adversarial Robustness for Deep Hashing-Based Retrieval | | 0
Mind meets machine: Unravelling GPT-4's cognitive psychology | | 0
OntoMath^PRO 2.0 Ontology: Updates of the Formal Model | | 0
How well do Large Language Models perform in Arithmetic tasks? | Code | 1
GPT-4 Technical Report | Code | 6
SALSA PICANTE: a machine learning attack on LWE with binary secrets | Code | 1
Self-reinforced polynomial approximation methods for concentrated probability densities | | 0
MathPrompter: Mathematical Reasoning using Large Language Models | Code | 1

No leaderboard results yet.