| Can large language models reason about medical questions? | Jul 17, 2022 | MedQAMultiple-choice | CodeCode Available | 1 |
| CC-Riddle: A Question Answering Dataset of Chinese Character Riddles | Jun 28, 2022 | General KnowledgeLanguage Modelling | CodeCode Available | 1 |
| SQuALITY: Building a Long-Document Summarization Dataset the Hard Way | May 23, 2022 | Document SummarizationMultiple-choice | CodeCode Available | 1 |
| FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue | May 12, 2022 | Dialogue UnderstandingDomain Adaptation | CodeCode Available | 1 |
| Clues Before Answers: Generation-Enhanced Multiple-Choice QA | Apr 30, 2022 | DecoderMultiple-choice | CodeCode Available | 1 |
| AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension | Mar 16, 2022 | Logical ReasoningMachine Reading Comprehension | CodeCode Available | 1 |
| Leaf: Multiple-Choice Question Generation | Jan 22, 2022 | Multiple-choiceQuestion Answering | CodeCode Available | 1 |
| Bridging Video-text Retrieval with Multiple Choice Questions | Jan 13, 2022 | Action RecognitionLinear evaluation | CodeCode Available | 1 |
| Multiple Choice Questions based Multi-Interest Policy Learning for Conversational Recommendation | Dec 22, 2021 | AttributeConversational Recommendation | CodeCode Available | 1 |
| QuALITY: Question Answering with Long Input Texts, Yes! | Dec 16, 2021 | Multiple-choiceMultiple Choice Question Answering (MCQA) | CodeCode Available | 1 |