| How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering? | Jun 19, 2025 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| How Many Workers to Ask? Adaptive Exploration for Collecting High Quality Labels | Nov 1, 2014 | Multiple-choice | —Unverified | 0 | 0 |
| How Susceptible are LLMs to Influence in Prompts? | Aug 17, 2024 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| How well do LLMs reason over tabular data, really? | May 12, 2025 | Missing Values, Multiple-choice | —Unverified | 0 | 0 |
| HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method | Jun 1, 2022 | Machine Reading Comprehension, Multiple-choice | —Unverified | 0 | 0 |
| Humanity's Last Exam | Jan 24, 2025 | Humanity's Last Exam, Language Modeling | —Unverified | 0 | 0 |
| Humans and Large Language Models in Clinical Decision Support: A Study with Medical Calculators | Nov 8, 2024 | Decision Making, Multiple-choice | —Unverified | 0 | 0 |
| Hypothesis Testing for Quantifying LLM-Human Misalignment in Multiple Choice Settings | Jun 17, 2025 | Decision Making, Language Modeling | —Unverified | 0 | 0 |
| Identification of mental fatigue in language comprehension tasks based on EEG and deep learning | Apr 14, 2021 | Classification, EEG | —Unverified | 0 | 0 |
| Treatment Effects with Multidimensional Unobserved Heterogeneity: Identification of the Marginal Treatment Effect | Sep 23, 2022 | Multiple-choice | —Unverified | 0 | 0 |