
Multiple Choice Question Answering (MCQA)

A multiple-choice question (MCQ) is composed of two parts: a stem, which states the question or problem, and a set of alternatives (possible answers) consisting of a key, the single best answer to the question, and a number of distractors, which are plausible but incorrect answers.
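As a minimal illustration of this structure (the field names and the example question are hypothetical, not taken from any specific dataset):

```python
# Illustrative representation of one MCQ: a stem plus alternatives,
# where the alternatives split into one key and several distractors.
mcq = {
    "stem": "Which vitamin deficiency causes scurvy?",
    "options": ["Vitamin A", "Vitamin B12", "Vitamin C", "Vitamin D"],
    "key": "Vitamin C",  # the single best answer
    "distractors": ["Vitamin A", "Vitamin B12", "Vitamin D"],  # plausible but incorrect
}
```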

In a k-way MCQA task, a model is provided with a question q, a set of candidate options O = {O1, ..., Ok}, and a supporting context for each option, C = {C1, ..., Ck}. The model must predict the answer option that is best supported by the given contexts.
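A minimal sketch of this setup, assuming a generic `score(question, context, option)` function (e.g. the log-likelihood a language model assigns to the option given the question and its context); the function name and signature are illustrative, not a specific model's API:

```python
from typing import Callable, Sequence

def predict_answer(
    question: str,
    options: Sequence[str],
    contexts: Sequence[str],
    score: Callable[[str, str, str], float],
) -> int:
    """Return the index of the option best supported by its paired context.

    `score` is a hypothetical, model-specific support measure for
    (question, context, option); higher means better supported.
    """
    assert len(options) == len(contexts), "k-way MCQA pairs each option with a context"
    scores = [score(question, c, o) for o, c in zip(options, contexts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

The predicted option is simply the argmax of the per-option support scores, which matches the task definition above regardless of how the scoring model is implemented.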

Papers

Showing 1–25 of 65 papers

Title | Status | Hype
Llama 2: Open Foundation and Fine-Tuned Chat Models | Code | 8
Training Compute-Optimal Large Language Models | Code | 6
Galactica: A Large Language Model for Science | Code | 4
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Code | 4
PaLM: Scaling Language Modeling with Pathways | Code | 2
MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | Code | 2
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages | Code | 1
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models | Code | 1
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations | Code | 1
Towards Expert-Level Medical Question Answering with Large Language Models | Code | 1
Leveraging Large Language Models for Multiple Choice Question Answering | Code | 1
Counterfactual Variable Control for Robust and Interpretable Question Answering | Code | 1
Large Language Models Encode Clinical Knowledge | Code | 1
Variational Open-Domain Question Answering | Code | 1
Can large language models reason about medical questions? | Code | 1
QuALITY: Question Answering with Long Input Texts, Yes! | Code | 1
AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts | Code | 1
Clues Before Answers: Generation-Enhanced Multiple-Choice QA | Code | 1
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English | Code | 1
BloombergGPT: A Large Language Model for Finance | Code | 0
FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain | Code | 0
BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine | Code | 0
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? | Code | 0
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning | Code | 0

No leaderboard results yet.