| Applying RLAIF for Code Generation with API-usage in Lightweight LLMs | Jun 28, 2024 | Code Generation, Hallucination | —Unverified | 0 | 0 |
| Apriori Knowledge in an Era of Computational Opacity: The Role of AI in Mathematical Discovery | Mar 15, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations? | May 15, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Assessing GPT4-V on Structured Reasoning Tasks | Dec 13, 2023 | Code Generation, Language Modeling | —Unverified | 0 | 0 |
| Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering | Feb 17, 2024 | Arithmetic Reasoning, Mathematical Reasoning | —Unverified | 0 | 0 |
| Assessing Robustness to Spurious Correlations in Post-Training Language Models | May 9, 2025 | Instruction Following, Mathematical Reasoning | —Unverified | 0 | 0 |
| Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models | Jun 5, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities | Dec 22, 2023 | Chatbot, GSM8K | —Unverified | 0 | 0 |
| A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges | Dec 16, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| A Survey on Large Language Models for Mathematical Reasoning | Jun 10, 2025 | Answer Generation, Mathematical Reasoning | —Unverified | 0 | 0 |
| A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers | May 21, 2023 | Mathematical Reasoning | —Unverified | 0 | 0 |
| A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks | May 16, 2024 | Code Generation, Dialogue Generation | —Unverified | 0 | 0 |
| A Systematic Survey on Large Language Models for Algorithm Design | Oct 11, 2024 | Mathematical Reasoning, scientific discovery | —Unverified | 0 | 0 |
| A Technical Study into Small Reasoning Language Models | Jun 16, 2025 | Code Generation, Computational Efficiency | —Unverified | 0 | 0 |
| Augmenting In-Context-Learning in LLMs via Automatic Data Labeling and Refinement | Oct 14, 2024 | In-Context Learning, Mathematical Reasoning | —Unverified | 0 | 0 |
| AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding | Aug 28, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| AutoGPS: Automated Geometry Problem Solving via Multimodal Formalization and Deductive Reasoning | May 29, 2025 | Geometry Problem Solving, Mathematical Reasoning | —Unverified | 0 | 0 |
| AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database | May 19, 2025 | Data Augmentation, In-Context Learning | —Unverified | 0 | 0 |
| Forward-Backward Reasoning in Large Language Models for Mathematical Verification | Aug 15, 2023 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications | May 24, 2024 | Code Generation, Low-rank compression | —Unverified | 0 | 0 |
| Benchmarking Large Language Models via Random Variables | Jan 20, 2025 | Benchmarking, Mathematical Reasoning | —Unverified | 0 | 0 |
| Benchmarking Large Language Models with Integer Sequence Generation Tasks | Nov 7, 2024 | Benchmarking, Computational Efficiency | —Unverified | 0 | 0 |
| Better Process Supervision with Bi-directional Rewarding Signals | Mar 6, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning | Jun 5, 2025 | Mathematical Reasoning, Problem Decomposition | —Unverified | 0 | 0 |
| Beyond Captioning: Task-Specific Prompting for Improved VLM Performance in Mathematical Reasoning | Oct 8, 2024 | Image Retrieval, Math | —Unverified | 0 | 0 |