| Detect, Describe, Discriminate: Moving Beyond VQA for MLLM Evaluation | Sep 23, 2024 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| Developing A Framework to Support Human Evaluation of Bias in Generated Free Response Text | May 5, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes | Aug 29, 2018 | Multiple-choice, Question Generation | —Unverified | 0 | 0 |
| DFIR-Metric: A Benchmark Dataset for Evaluating Large Language Models in Digital Forensics and Incident Response | May 26, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Model | Apr 18, 2025 | Distractor Generation, Multiple-choice | —Unverified | 0 | 0 |
| DGRC: An Effective Fine-tuning Framework for Distractor Generation in Chinese Multi-choice Reading Comprehension | May 29, 2024 | Distractor Generation, Multiple-choice | —Unverified | 0 | 0 |
| Instructions and Guide for Diagnostic Questions: The NeurIPS 2020 Education Challenge | Jul 23, 2020 | Diagnostic, Misconceptions | —Unverified | 0 | 0 |
| Dialogue-Based Simulation For Cultural Awareness Training | Feb 1, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Dienstplanerstellung in Krankenhaeusern mittels genetischer Algorithmen | May 30, 2013 | Multiple-choice | —Unverified | 0 | 0 |
| Differentiable Open-Ended Commonsense Reasoning | Oct 24, 2020 | Multiple-choice | —Unverified | 0 | 0 |
| Plug-in, Trainable Gate for Streamlining Arbitrary Neural Networks | Apr 24, 2019 | Efficient Neural Network, image-classification | —Unverified | 0 | 0 |
| Different Questions, Different Models: Fine-Grained Evaluation of Uncertainty and Calibration in Clinical QA with LLMs | Jun 12, 2025 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities | Feb 20, 2024 | Multiple-choice, Text Simplification | —Unverified | 0 | 0 |
| Disaggregating Hops: Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at each Hop? | Jan 16, 2022 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| DISTO: Evaluating Textual Distractors for Multi-Choice Questions using Negative Sampling based Approach | Apr 10, 2023 | Distractor Generation, Machine Translation | —Unverified | 0 | 0 |
| Distractor Analysis and Selection for Multiple-Choice Cloze Questions for Second-Language Learners | Jul 1, 2020 | Multiple-choice | —Unverified | 0 | 0 |
| Distractor Generation in Multiple-Choice Tasks: A Survey of Methods, Datasets, and Evaluation | Feb 2, 2024 | Distractor Generation, Multiple-choice | —Unverified | 0 | 0 |
| Distributional semantics beyond words: Supervised learning of analogy and paraphrase | Oct 18, 2013 | Multiple-choice, Task 2 | —Unverified | 0 | 0 |
| DiverseNet: When One Right Answer is not Enough | Aug 24, 2020 | Multiple-choice, Structured Prediction | —Unverified | 0 | 0 |
| DMind Benchmark: Toward a Holistic Assessment of LLM Capabilities across the Web3 Domain | Apr 18, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning | Oct 1, 2022 | Data Augmentation, Machine Reading Comprehension | —Unverified | 0 | 0 |
| Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla | Jul 18, 2023 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| Do Fine-tuned Commonsense Language Models Really Generalize? | Nov 18, 2020 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales | Jun 4, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| Do LLMs Act as Repositories of Causal Knowledge? | Dec 14, 2024 | Causal Inference, Multiple-choice | —Unverified | 0 | 0 |
| Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models | Jul 23, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns | Feb 21, 2025 | Distractor Generation, Multiple-choice | —Unverified | 0 | 0 |
| Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts | Jun 8, 2024 | Machine Translation, Multiple-choice | —Unverified | 0 | 0 |
| DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples | Oct 26, 2021 | Multiple-choice, Semi-Supervised Image Classification | —Unverified | 0 | 0 |
| DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension | Mar 1, 2019 | Dialogue Understanding, Multiple-choice | —Unverified | 0 | 0 |
| DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests | Jan 8, 2025 | Multimodal Reasoning, Multiple-choice | —Unverified | 0 | 0 |
| DsMCL: Dual-Level Stochastic Multiple Choice Learning for Multi-Modal Trajectory Prediction | Mar 19, 2020 | Multiple-choice, Prediction | —Unverified | 0 | 0 |
| Dual Co-Matching Network for Multi-choice Reading Comprehension | Jan 27, 2019 | Machine Reading Comprehension, Multiple-choice | —Unverified | 0 | 0 |
| E-cheating Prevention Measures: Detection of Cheating at Online Examinations Using Deep Learning Approach -- A Case Study | Jan 25, 2021 | Multiple-choice | —Unverified | 0 | 0 |
| E-Commerce Promotions Personalization via Online Multiple-Choice Knapsack with Uplift Modeling | Aug 11, 2021 | Multiple-choice | —Unverified | 0 | 0 |
| Edinburgh Clinical NLP at MEDIQA-CORR 2024: Guiding Large Language Models with Hints | May 28, 2024 | Multiple-choice, Sentence | —Unverified | 0 | 0 |
| Towards a Personal Health Large Language Model | Jun 10, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights | Sep 19, 2024 | Decision Making, Knowledge Distillation | —Unverified | 0 | 0 |
| Eigen Values Features for the Classification of Brain Signals corresponding to 2D and 3D Educational Contents | Apr 30, 2019 | General Classification, Multiple-choice | —Unverified | 0 | 0 |
| Eliciting Categorical Data for Optimal Aggregation | Dec 1, 2016 | Multiple-choice | —Unverified | 0 | 0 |
| ELiRF-UPV at SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge | Jun 1, 2018 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework | Jan 16, 2025 | Multiple-choice, Question Generation | —Unverified | 0 | 0 |
| End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering | Oct 10, 2016 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration | Jun 19, 2024 | Benchmarking, Distractor Generation | —Unverified | 0 | 0 |
| Enhancing Event Causality Identification with Rationale and Structure-Aware Causal Question Answering | Mar 17, 2024 | Event Causality Identification, Multiple-choice | —Unverified | 0 | 0 |
| Towards Collective Superintelligence: Amplifying Group IQ using Conversational Swarms | Jan 25, 2024 | Decision Making, Multiple-choice | —Unverified | 0 | 0 |
| Towards combinatorial clustering: preliminary research survey | May 28, 2015 | Clustering, Combinatorial Optimization | —Unverified | 0 | 0 |
| Enhancing LLM Evaluations: The Garbling Trick | Nov 3, 2024 | Multiple-choice | —Unverified | 0 | 0 |
| Enhancing LLMs' Reasoning-Intensive Multimedia Search Capabilities through Fine-Tuning and Reinforcement Learning | May 24, 2025 | Multiple-choice, Prompt Engineering | —Unverified | 0 | 0 |
| Enhancing Multiple-choice Machine Reading Comprehension by Punishing Illogical Interpretations | Nov 1, 2021 | Attribute, Machine Reading Comprehension | —Unverified | 0 | 0 |