| Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy | May 24, 2023 | In-Context Learning, Multiple-choice | Code Available | 0 |
| Is Your Large Language Model Knowledgeable or a Choices-Only Cheater? | Jul 2, 2024 | Graph Mining, Language Modeling | Code Available | 0 |
| Iterative Forward Tuning Boosts In-Context Learning in Language Models | May 22, 2023 | Decision Making, In-Context Learning | Code Available | 0 |
| Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at Each Single-Hop? | Oct 1, 2022 | Language Modeling, Language Modelling | Code Available | 0 |
| BnMMLU: Measuring Massive Multitask Language Understanding in Bengali | May 25, 2025 | General Knowledge, MMLU | Code Available | 0 |
| It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning | Nov 13, 2023 | Multiple-choice | Code Available | 0 |
| Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension | Apr 21, 2019 | Data Augmentation, Language Modelling | Code Available | 0 |
| Joint Learning of Sentence Embeddings for Relevance and Entailment | May 16, 2016 | Decision Making, Information Retrieval | Code Available | 0 |
| Enhancing textual textbook question answering with large language models and retrieval augmented generation | Feb 5, 2024 | Multiple-choice, Question Answering | Code Available | 0 |
| Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation | Apr 9, 2025 | Multiple-choice | Code Available | 0 |