| Graders should cheat: privileged information enables expert-level automated evaluations | Feb 16, 2025 | Math | —Unverified | 0 | 0 |
| Graph2Tac: Online Representation Learning of Formal Math Concepts | Jan 5, 2024 | AI Agent, Automated Theorem Proving | —Unverified | 0 | 0 |
| GRIN: GRadient-INformed MoE | Sep 18, 2024 | HellaSwag, HumanEval | —Unverified | 0 | 0 |
| BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts | Oct 5, 2024 | Math | —Unverified | 0 | 0 |
| Blink of an eye: a simple theory for feature localization in generative models | Feb 2, 2025 | Math | —Unverified | 0 | 0 |
| GSSF: A Generative Sequence Similarity Function based on a Seq2Seq model for clustering online handwritten mathematical answers | May 21, 2021 | Clustering, Descriptive | —Unverified | 0 | 0 |
| Guideline Forest: Experience-Induced Multi-Guideline Reasoning with Stepwise Aggregation | Jun 9, 2025 | GSM8K, HumanEval | —Unverified | 0 | 0 |
| Guiding Language Model Reasoning with Planning Tokens | Oct 9, 2023 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders | Oct 9, 2024 | Math | —Unverified | 0 | 0 |
| The Role of Diversity in In-Context Learning for Large Language Models | May 26, 2025 | Diversity, In-Context Learning | —Unverified | 0 | 0 |
| The Search-and-Mix Paradigm in Approximate Nash Equilibrium Algorithms | Oct 12, 2023 | Math | —Unverified | 0 | 0 |
| Hard Math -- Easy UVM: Pragmatic solutions for verifying hardware algorithms using UVM | Dec 6, 2024 | Math | —Unverified | 0 | 0 |
| The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding? | Feb 19, 2025 | Math | —Unverified | 0 | 0 |
| Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation | Oct 3, 2024 | GSM8K, Math | —Unverified | 0 | 0 |
| Hawkeye: Efficient Reasoning with Model Collaboration | Apr 1, 2025 | Math, model | —Unverified | 0 | 0 |
| Heimdall: test-time scaling on the generative verification | Apr 14, 2025 | Math | —Unverified | 0 | 0 |
| HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks | Mar 6, 2025 | Chatbot, Logical Reasoning | —Unverified | 0 | 0 |
| hep-th | Jun 27, 2018 | Binary Classification, Math | —Unverified | 0 | 0 |
| Herald: A Natural Language Annotated Lean 4 Dataset | Oct 9, 2024 | Math, Mathematical Reasoning | —Unverified | 0 | 0 |
| Hierarchical Attention Decoder for Solving Math Word Problems | Nov 16, 2021 | Decoder, Math | —Unverified | 0 | 0 |
| Hierarchical evolutive systems, fuzzy categories and the living single cell | Jan 31, 2018 | Math | —Unverified | 0 | 0 |
| WebMIaS on Docker: Deploying Math-Aware Search in a Single Line of Code | Jun 1, 2021 | Math, Retrieval | —Unverified | 0 | 0 |
| Homeostatic Mechanisms in Biological Systems | Feb 22, 2022 | Math | —Unverified | 0 | 0 |
| Big Math and the One-Brain Barrier A Position Paper and Architecture Proposal | Apr 23, 2019 | Math, Position | —Unverified | 0 | 0 |
| How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study | Apr 1, 2025 | Code Generation, Math | —Unverified | 0 | 0 |