| SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | Aug 16, 2018 | Common Sense Reasoning, Multiple-choice | —Unverified | 0 | 0 |
| SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages | Jun 20, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| TabMCQ: A Dataset of General Knowledge Tables and Multiple-choice Questions | Feb 12, 2016 | General Knowledge, Multiple-choice | —Unverified | 0 | 0 |
| TA-MAMC at SemEval-2021 Task 4: Task-adaptive Pretraining and Multi-head Attention for Abstract Meaning Reading Comprehension | Aug 1, 2021 | Contrastive Learning, Multiple-choice | —Unverified | 0 | 0 |
| Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling | Sep 30, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine | May 29, 2025 | Diagnostic, Multiple-choice | —Unverified | 0 | 0 |
| Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students' (Mis)Understanding Is Hinted | May 9, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Empowering Sentence Encoders with Prompting and Label Retrieval for Zero-shot Text Classification | Dec 20, 2022 | Classification, Descriptive | —Unverified | 0 | 0 |
| Testing Uncertainty of Large Language Models for Physics Knowledge and Reasoning | Nov 18, 2024 | Logical Reasoning, Multiple-choice | —Unverified | 0 | 0 |
| Answering Chinese Elementary School Social Studies Multiple Choice Questions | Dec 1, 2021 | Multiple-choice | —Unverified | 0 | 0 |