| Answer-level Calibration for Free-form Multiple Choice Question Answering | May 1, 2022 | FormLanguage Modeling | CodeCode Available | 0 |
| Sentence Embeddings for Russian NLU | Oct 29, 2019 | Multiple-choiceParaphrase Identification | CodeCode Available | 0 |
| Language Models as Knowledge Bases for Visual Word Sense Disambiguation | Oct 3, 2023 | Image CaptioningMultiple-choice | CodeCode Available | 0 |
| Multimodal Residual Learning for Visual QA | Jun 5, 2016 | Multiple-choiceQuestion Answering | CodeCode Available | 0 |
| QASC: A Dataset for Question Answering via Sentence Composition | Oct 25, 2019 | Common Sense ReasoningMulti-hop Question Answering | CodeCode Available | 0 |
| VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation | Aug 15, 2017 | Language ModelingLanguage Modelling | CodeCode Available | 0 |
| Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation | May 30, 2025 | Continual PretrainingFairness | CodeCode Available | 0 |
| Every Answer Matters: Evaluating Commonsense with Probabilistic Measures | Jun 6, 2024 | Common Sense ReasoningLanguage Modeling | CodeCode Available | 0 |
| Evidence Sentence Extraction for Machine Reading Comprehension | Feb 23, 2019 | Machine Reading ComprehensionMultiple-choice | CodeCode Available | 0 |
| BertaQA: How Much Do Language Models Know About Local Culture? | Jun 11, 2024 | Multiple-choiceTransfer Learning | CodeCode Available | 0 |