| QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks? | Mar 28, 2025 | Logical ReasoningMath | CodeCode Available | 1 | 5 |
| Are NLP Models really able to Solve Simple Math Word Problems? | Mar 12, 2021 | MathMath Word Problem Solving | CodeCode Available | 1 | 5 |
| Case-Based or Rule-Based: How Do Transformers Do the Math? | Feb 27, 2024 | MathSystematic Generalization | CodeCode Available | 1 | 5 |
| Problem-Oriented Segmentation and Retrieval: Case Study on Tutoring Conversations | Nov 12, 2024 | MathRetrieval | CodeCode Available | 1 | 5 |
| CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models | Dec 23, 2024 | Decision MakingMath | CodeCode Available | 1 | 5 |
| AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation | Apr 25, 2024 | Code GenerationMath | CodeCode Available | 1 | 5 |
| EXAONE Deep: Reasoning Enhanced Language Models | Mar 16, 2025 | Math | CodeCode Available | 1 | 5 |
| Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective | Jun 22, 2025 | In-Context LearningLarge Language Model | CodeCode Available | 1 | 5 |
| Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation | Apr 15, 2025 | MathQuantum Machine Learning | CodeCode Available | 1 | 5 |
| Explaining Datasets in Words: Statistical Models with Natural Language Parameters | Sep 13, 2024 | ClusteringLanguage Modeling | CodeCode Available | 1 | 5 |