| Large Language Models Encode Clinical Knowledge | Dec 26, 2022 | Clinical Knowledge, MedQA | Code Available | 1 |
| Empowering Sentence Encoders with Prompting and Label Retrieval for Zero-shot Text Classification | Dec 20, 2022 | Classification, Descriptive | Unverified | 0 |
| True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4 | Dec 20, 2022 | Multiple-choice | Unverified | 0 |
| Training Trajectories of Language Models Across Scales | Dec 19, 2022 | In-Context Learning, Multiple-choice | Code Available | 1 |
| Utilizing Background Knowledge for Robust Reasoning over Traffic Situations | Dec 4, 2022 | Knowledge Graphs, Multiple-choice | Code Available | 0 |
| Which Shortcut Solution Do Question Answering Models Prefer to Learn? | Nov 29, 2022 | Multiple-choice, Question Answering | Code Available | 0 |
| Question-type Identification for Academic Questions in Online Learning Platform | Nov 24, 2022 | Binary Classification, Multiple-choice | Unverified | 0 |
| Evaluating the Knowledge Dependency of Questions | Nov 21, 2022 | Multiple-choice | Code Available | 1 |
| Unified Question Answering in Slovene | Nov 16, 2022 | Cross-Lingual Transfer, Decoder | Code Available | 0 |
| World Knowledge in Multiple Choice Reading Comprehension | Nov 13, 2022 | General Knowledge, Multiple-choice | Code Available | 0 |