| Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning | Oct 1, 2022 | Data Augmentation, Machine Reading Comprehension | —Unverified | 0 | 0 |
| Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla | Jul 18, 2023 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| Do Fine-tuned Commonsense Language Models Really Generalize? | Nov 18, 2020 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales | Jun 4, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| Do LLMs Act as Repositories of Causal Knowledge? | Dec 14, 2024 | Causal Inference, Multiple-choice | —Unverified | 0 | 0 |
| Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models | Jul 23, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns | Feb 21, 2025 | Distractor Generation, Multiple-choice | —Unverified | 0 | 0 |
| Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts | Jun 8, 2024 | Machine Translation, Multiple-choice | —Unverified | 0 | 0 |
| DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples | Oct 26, 2021 | Multiple-choice, Semi-Supervised Image Classification | —Unverified | 0 | 0 |
| DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension | Mar 1, 2019 | Dialogue Understanding, Multiple-choice | —Unverified | 0 | 0 |