| Generating multiple-choice questions for medical question answering with distractors and cue-masking | Mar 13, 2023 | Language Modelling | Unverified | 0 |
| MuLTI: Efficient Video-and-Language Understanding with Text-Guided MultiWay-Sampler and Multiple Choice Modeling | Mar 10, 2023 | Multi-Label Classification | Unverified | 0 |
| Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code | Mar 9, 2023 | Multiple-choice | Unverified | 0 |
| Long Horizon Temperature Scaling | Feb 7, 2023 | Multiple-choice | Code Available | 1 |
| PADL: Language-Directed Physics-Based Character Control | Jan 31, 2023 | Image Generation, Imitation Learning | Code Available | 1 |
| BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Jan 30, 2023 | Generative Visual Question Answering, Image Captioning | Code Available | 4 |
| MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization | Jan 28, 2023 | Hallucination, Multiple-choice | Code Available | 2 |
| GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities | Jan 11, 2023 | Multiple-choice | Code Available | 1 |
| Mind Reasoning Manners: Enhancing Type Perception for Generalized Zero-shot Logical Reasoning over Text | Jan 8, 2023 | Contrastive Learning, Logical Reasoning | Code Available | 1 |
| GPT Takes the Bar Exam | Dec 29, 2022 | Hyperparameter Optimization, Multiple-choice | Code Available | 1 |
| Large Language Models Encode Clinical Knowledge | Dec 26, 2022 | Clinical Knowledge, MedQA | Code Available | 1 |
| Empowering Sentence Encoders with Prompting and Label Retrieval for Zero-shot Text Classification | Dec 20, 2022 | Classification, Descriptive | Unverified | 0 |
| True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4 | Dec 20, 2022 | Multiple-choice | Unverified | 0 |
| Training Trajectories of Language Models Across Scales | Dec 19, 2022 | In-Context Learning, Multiple-choice | Code Available | 1 |
| Utilizing Background Knowledge for Robust Reasoning over Traffic Situations | Dec 4, 2022 | Knowledge Graphs, Multiple-choice | Code Available | 0 |
| Which Shortcut Solution Do Question Answering Models Prefer to Learn? | Nov 29, 2022 | Multiple-choice, Question Answering | Code Available | 0 |
| Question-type Identification for Academic Questions in Online Learning Platform | Nov 24, 2022 | Binary Classification, Multiple-choice | Unverified | 0 |
| Evaluating the Knowledge Dependency of Questions | Nov 21, 2022 | Multiple-choice | Code Available | 1 |
| Unified Question Answering in Slovene | Nov 16, 2022 | Cross-Lingual Transfer, Decoder | Code Available | 0 |
| World Knowledge in Multiple Choice Reading Comprehension | Nov 13, 2022 | General Knowledge, Multiple-choice | Code Available | 0 |
| A Profit-Maximizing Strategy for Advertising on the e-Commerce Platforms | Oct 31, 2022 | Management, Multiple-choice | Code Available | 0 |
| AGReE: A system for generating Automated Grammar Reading Exercises | Oct 28, 2022 | Articles, Multiple-choice | Unverified | 0 |
| Learning to Reuse Distractors to support Multiple Choice Question Generation in Education | Oct 25, 2022 | Multiple-choice, Question Generation | Code Available | 0 |
| Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models | Oct 24, 2022 | Multiple-choice, Reading Comprehension | Code Available | 0 |
| Leveraging Large Language Models for Multiple Choice Question Answering | Oct 22, 2022 | Answer Selection, Multiple-choice | Code Available | 1 |
| AI-based Arabic Language and Speech Tutor | Oct 22, 2022 | Multiple-choice, Self-Learning | Unverified | 0 |
| Perception Test: A Diagnostic Benchmark for Multimodal Models | Oct 19, 2022 | Diagnostic, Multiple-choice | Code Available | 2 |
| Two-Turn Debate Doesn't Help Humans Answer Hard Reading Comprehension Questions | Oct 19, 2022 | Language Modelling | Unverified | 0 |
| Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective | Oct 16, 2022 | Coreference Resolution, Multiple-choice | Code Available | 4 |
| Real-Time Automated Answer Scoring | Oct 13, 2022 | Multiple-choice | Code Available | 0 |
| EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain | Oct 12, 2022 | Distractor Generation, Multiple-choice | Code Available | 1 |
| Understanding Prior Bias and Choice Paralysis in Transformer-based Language Representation Models through Four Experimental Probes | Oct 3, 2022 | Decision Making, Multiple-choice | Unverified | 0 |
| Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning | Oct 1, 2022 | Data Augmentation, Machine Reading Comprehension | Unverified | 0 |
| A Weak Supervision Approach for Predicting Difficulty of Technical Interview Questions | Oct 1, 2022 | Multiple-choice, Prediction | Unverified | 0 |
| Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at Each Single-Hop? | Oct 1, 2022 | Language Modelling | Code Available | 0 |
| Using contradictions improves question answering systems | Sep 28, 2022 | Multiple-choice, Natural Language Inference | Unverified | 0 |
| Variational Open-Domain Question Answering | Sep 23, 2022 | Language Modelling, MedQA | Code Available | 1 |
| Multiple-Choice Question Generation: Towards an Automated Assessment Framework | Sep 23, 2022 | Diversity, Multiple-choice | Unverified | 0 |
| Treatment Effects with Multidimensional Unobserved Heterogeneity: Identification of the Marginal Treatment Effect | Sep 23, 2022 | Multiple-choice | Unverified | 0 |
| Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering | Sep 20, 2022 | Multimodal Deep Learning, Multimodal Reasoning | Code Available | 2 |
| Scheduling Algorithms for Federated Learning with Minimal Energy Consumption | Sep 13, 2022 | Federated Learning, Multiple-choice | Unverified | 0 |
| Zero-shot Event Causality Identification with Question Answering | Sep 1, 2022 | Articles, Event Causality Identification | Unverified | 0 |
| Can large language models reason about medical questions? | Jul 17, 2022 | MedQA, Multiple-choice | Code Available | 1 |
| Language Models (Mostly) Know What They Know | Jul 11, 2022 | Multiple-choice | Unverified | 0 |
| Exposing the Limits of Video-Text Models through Contrast Sets | Jul 1, 2022 | Language Modelling | Code Available | 0 |
| CC-Riddle: A Question Answering Dataset of Chinese Character Riddles | Jun 28, 2022 | General Knowledge, Language Modelling | Code Available | 1 |
| From Human Days to Machine Seconds: Automatically Answering and Generating Machine Learning Final Exams | Jun 11, 2022 | BIG-bench Machine Learning, Few-Shot Learning | Unverified | 0 |
| PADDLe: a Platform to Identify Complex Words for Learners of French as a Foreign Language (FFL) | Jun 1, 2022 | Multiple-choice | Unverified | 0 |
| HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method | Jun 1, 2022 | Machine Reading Comprehension, Multiple-choice | Unverified | 0 |
| CroaTPAS: A Survey-based Evaluation | Jun 1, 2022 | Multiple-choice, Survey | Unverified | 0 |