SOTAVerified

Math

Papers

Showing 1201-1250 of 1596 papers

| Title | Status | Hype |
| --- | --- | --- |
| Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance | Code | 0 |
| Benchmarking and Improving Generator-Validator Consistency of Language Models | | 0 |
| Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions | | 0 |
| Fill in the Blank: Exploring and Enhancing LLM Capabilities for Backward Reasoning in Math Word Problems | Code | 0 |
| Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting | | 0 |
| L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models | | 0 |
| Fairness Hub Technical Briefs: AUC Gap | | 0 |
| Contrastive Decoding Improves Reasoning in Large Language Models | | 0 |
| Odd period cycles and ergodic properties in price dynamics for an exchange economy | | 0 |
| ChatGPT-4 with Code Interpreter can be used to solve introductory college-level vector calculus and electromagnetism problems | | 0 |
| Using Large Language Model to Solve and Explain Physics Word Problems Approaching Human Level | | 0 |
| MathAttack: Attacking Large Language Models Towards Math Solving Ability | | 0 |
| Solving Math Word Problem with Problem Type Classification | Code | 0 |
| GraphReason: Enhancing Reasoning Capabilities of Large Language Models through A Graph-Based Verification Approach | | 0 |
| Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems | | 0 |
| NEOLAF, an LLM-powered neural-symbolic cognitive architecture | | 0 |
| Scalable and Equitable Math Problem Solving Strategy Prediction in Big Educational Data | Code | 0 |
| Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning | Code | 0 |
| Reasoning in Large Language Models Through Symbolic Math Word Problems | Code | 0 |
| Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models | | 0 |
| Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks | Code | 0 |
| A large language model-assisted education tool to provide feedback on open-ended responses | Code | 0 |
| ARB: Advanced Reasoning Benchmark for Large Language Models | | 0 |
| Explaining Math Word Problem Solvers | | 0 |
| Controlling Equational Reasoning in Large Language Models with Prompt Interventions | | 0 |
| A mixed policy to improve performance of language models on math problems | Code | 0 |
| Math Agents: Computational Infrastructure, Mathematical Embedding, and Genomics | | 0 |
| MWPRanker: An Expression Similarity Based Math Word Problem Retriever | | 0 |
| CMATH: Can Your Language Model Pass Chinese Elementary School Math Test? | | 0 |
| Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning | | 0 |
| Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | Code | 0 |
| A Survey on Multimodal Large Language Models | | 0 |
| Public Attitudes Toward ChatGPT on Twitter: Sentiments, Topics, and Occupations | Code | 0 |
| DiversiGATE: A Comprehensive Framework for Reliable Large Language Models | | 0 |
| Learning by Analogy: Diverse Questions Generation in Math Word Problem | Code | 0 |
| A Neural Network Implementation for Free Energy Principle | | 0 |
| Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination | | 0 |
| PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts | | 0 |
| World Models for Math Story Problems | Code | 0 |
| Does ChatGPT Comprehend the Place Value in Numbers When Solving Math Word Problems? | Code | 0 |
| Interpretable Math Word Problem Solution Generation Via Step-by-step Planning | | 0 |
| Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions | | 0 |
| Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home | | 0 |
| Quantitative Methods for Optimizing Patient Outcomes in Liver Transplantation | | 0 |
| Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard | | 0 |
| Leveraging Training Data in Few-Shot Prompting for Numerical Reasoning | Code | 0 |
| Emergent inabilities? Inverse scaling over the course of pretraining | | 0 |
| Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction | Code | 0 |
| RSRM: Reinforcement Symbolic Regression Machine | | 0 |
| Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective | | 0 |
