| How is ChatGPT's behavior changing over time? | Jul 18, 2023 | Code Generation, Language Modelling | Code Available | 4 |
| A mixed policy to improve performance of language models on math problems | Jul 17, 2023 | GSM8K, Math | Code Available | 0 |
| Math Agents: Computational Infrastructure, Mathematical Embedding, and Genomics | Jul 4, 2023 | Automated Theorem Proving, Math | Unverified | 0 |
| MWPRanker: An Expression Similarity Based Math Word Problem Retriever | Jul 3, 2023 | Logical Sequence, Math | Unverified | 0 |
| CMATH: Can Your Language Model Pass Chinese Elementary School Math Test? | Jun 29, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| LeanDojo: Theorem Proving with Retrieval-Augmented Language Models | Jun 27, 2023 | Automated Theorem Proving, GPU | Code Available | 2 |
| Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning | Jun 25, 2023 | counterfactual, Math | Unverified | 0 |
| Math Word Problem Solving by Generating Linguistic Variants of Problem Statements | Jun 24, 2023 | Decoder, Ingenuity | Code Available | 0 |
| A Survey on Multimodal Large Language Models | Jun 23, 2023 | Hallucination, In-Context Learning | Unverified | 0 |
| Public Attitudes Toward ChatGPT on Twitter: Sentiments, Topics, and Occupations | Jun 22, 2023 | Chatbot, Language Modelling | Code Available | 0 |
| DiversiGATE: A Comprehensive Framework for Reliable Large Language Models | Jun 22, 2023 | Arithmetic Reasoning, GSM8K | Unverified | 0 |
| Learning by Analogy: Diverse Questions Generation in Math Word Problem | Jun 15, 2023 | Math | Code Available | 0 |
| SIGHT: A Large Annotated Dataset on Student Insights Gathered from Higher Education Transcripts | Jun 15, 2023 | Math | Code Available | 1 |
| A Neural Network Implementation for Free Energy Principle | Jun 11, 2023 | Math | Unverified | 0 |
| Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination | Jun 10, 2023 | Math, Mathematical Reasoning | Unverified | 0 |
| PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts | Jun 7, 2023 | Cross-Lingual Paraphrase Identification, Machine Translation | Unverified | 0 |
| World Models for Math Story Problems | Jun 7, 2023 | Math | Code Available | 0 |
| Is ChatGPT a Good Teacher Coach? Measuring Zero-Shot Performance For Scoring and Providing Actionable Insights on Classroom Instruction | Jun 5, 2023 | Math | Code Available | 1 |
| Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning | Jun 4, 2023 | Math | Code Available | 1 |
| Does ChatGPT Comprehend the Place Value in Numbers When Solving Math Word Problems? | Jun 3, 2023 | Math, Math Word Problem Solving | Code Available | 0 |
| MathChat: Converse to Tackle Challenging Math Problems with LLM Agents | Jun 2, 2023 | Elementary Mathematics, Math | Code Available | 1 |
| Learning Multi-Step Reasoning by Solving Arithmetic Tasks | Jun 2, 2023 | Math, Mathematical Reasoning | Code Available | 1 |
| AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Jun 1, 2023 | Autonomous Driving, Cloud Computing | Code Available | 6 |
| Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home | Jun 1, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions | Jun 1, 2023 | Math | Unverified | 0 |