| AI4Math: A Native Spanish Benchmark for University-Level Mathematical Reasoning in Large Language Models | May 25, 2025 | MathMathematical Reasoning | —Unverified | 0 | 0 |
| Aligning Tutor Discourse Supporting Rigorous Thinking with Tutee Content Mastery for Predicting Math Achievement | May 10, 2024 | MathMathematical Reasoning | —Unverified | 0 | 0 |
| Amplify Adjacent Token Differences: Enhancing Long Chain-of-Thought Reasoning with Shift-FFN | May 22, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Anomaly Detection of Tabular Data Using LLMs | Jun 24, 2024 | Anomaly DetectionLong-Context Understanding | —Unverified | 0 | 0 |
| Applications of Positive Unlabeled (PU) and Negative Unlabeled (NU) Learning in Cybersecurity | Dec 9, 2024 | Intrusion DetectionMalware Detection | —Unverified | 0 | 0 |
| Applying RLAIF for Code Generation with API-usage in Lightweight LLMs | Jun 28, 2024 | Code GenerationHallucination | —Unverified | 0 | 0 |
| Apriori Knowledge in an Era of Computational Opacity: The Role of AI in Mathematical Discovery | Mar 15, 2024 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations? | May 15, 2025 | Mathematical Reasoning | —Unverified | 0 | 0 |
| Assessing GPT4-V on Structured Reasoning Tasks | Dec 13, 2023 | Code GenerationLanguage Modeling | —Unverified | 0 | 0 |
| Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering | Feb 17, 2024 | Arithmetic ReasoningMathematical Reasoning | —Unverified | 0 | 0 |