| HATS: Hindi Analogy Test Set for Evaluating Reasoning in Large Language Models | Jul 17, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| Combinatorial framework for planning in geological exploration | Jan 22, 2018 | Attribute, Multiple-choice | —Unverified | 0 | 0 |
| Assessing Distractors in Multiple-Choice Tests | Nov 8, 2023 | Diversity, Multiple-choice | —Unverified | 0 | 0 |
| HashEvict: A Pre-Attention KV Cache Eviction Strategy using Locality-Sensitive Hashing | Dec 13, 2024 | GPU, Multiple-choice | —Unverified | 0 | 0 |
| HardML: A Benchmark For Evaluating Data Science And Machine Learning knowledge and reasoning in AI | Jan 26, 2025 | MMLU, Multiple-choice | —Unverified | 0 | 0 |
| Assessing AI-Generated Questions' Alignment with Cognitive Frameworks in Educational Assessment | Apr 19, 2025 | Classification, Multiple-choice | —Unverified | 0 | 0 |
| An AI-based Solution for Enhancing Delivery of Digital Learning for Future Teachers | Nov 9, 2021 | Multiple-choice, Question Generation | —Unverified | 0 | 0 |
| Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models | Oct 18, 2024 | Fairness, Multiple-choice | —Unverified | 0 | 0 |
| HANS, are you clever? Clever Hans Effect Analysis of Neural Systems | Sep 21, 2023 | Decision Making, Multiple-choice | —Unverified | 0 | 0 |
| Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation | Jun 2, 2025 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |