SOTAVerified

Multiple Choice Question Answering (MCQA)

A multiple-choice question (MCQ) consists of two parts: a stem, which states the question or problem, and a set of alternatives (possible answers) comprising a key, the single best answer, and a number of distractors, which are plausible but incorrect answers.

In a k-way MCQA task, a model is given a question q, a set of k candidate options O = {O1, . . . , Ok}, and a supporting context for each option, C = {C1, . . . , Ck}. The model must predict the option that is best supported by its context.
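To make the task interface concrete, here is a minimal sketch of a k-way MCQA predictor. The scoring rule (lexical overlap between question-plus-option and the option's context) is a hypothetical toy baseline for illustration, not the method of any paper listed below; real systems would score options with a language model.

```python
# Toy k-way MCQA baseline: pick the option whose supporting context
# overlaps most with the question plus that option.
# Illustrative sketch only; not any listed paper's method.

def predict(question: str, options: list[str], contexts: list[str]) -> int:
    """Return the index of the option best supported by its context."""
    assert len(options) == len(contexts), "k-way task: one context per option"

    def tokens(text: str) -> set[str]:
        return set(text.lower().split())

    scores = []
    for option, context in zip(options, contexts):
        # Fraction of (question + option) tokens found in the context.
        query = tokens(question) | tokens(option)
        scores.append(len(query & tokens(context)) / max(len(query), 1))
    return max(range(len(scores)), key=scores.__getitem__)


question = "What organ pumps blood through the body?"
options = ["the heart", "the liver", "the lungs"]
contexts = [
    "The heart pumps blood through the circulatory system of the body.",
    "The liver filters toxins and produces bile.",
    "The lungs exchange oxygen and carbon dioxide.",
]
print(predict(question, options, contexts))  # 0 -> "the heart"
```

Accuracy over a labeled set of such (question, options, contexts) triples is the usual evaluation metric for this task.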

Papers

Showing 1-25 of 65 papers

Title | Status | Hype
Llama 2: Open Foundation and Fine-Tuned Chat Models | Code | 8
Training Compute-Optimal Large Language Models | Code | 6
Galactica: A Large Language Model for Science | Code | 4
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Code | 4
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | Code | 2
PaLM: Scaling Language Modeling with Pathways | Code | 2
Counterfactual Variable Control for Robust and Interpretable Question Answering | Code | 1
Large Language Models Encode Clinical Knowledge | Code | 1
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages | Code | 1
QuALITY: Question Answering with Long Input Texts, Yes! | Code | 1
Variational Open-Domain Question Answering | Code | 1
Clues Before Answers: Generation-Enhanced Multiple-Choice QA | Code | 1
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations | Code | 1
Towards Expert-Level Medical Question Answering with Large Language Models | Code | 1
Can large language models reason about medical questions? | Code | 1
Leveraging Large Language Models for Multiple Choice Question Answering | Code | 1
AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts | Code | 1
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English | Code | 1
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models | Code | 1
CP-Router: An Uncertainty-Aware Router Between LLM and LRM | | 0
KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations | | 0
Context Modeling with Evidence Filter for Multiple Choice Question Answering | | 0
Context-guided Triple Matching for Multiple Choice Question Answering | | 0
Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions | | 0

No leaderboard results yet.