SOTAVerified

Multiple Choice Question Answering (MCQA)

A multiple-choice question (MCQ) has two parts: a stem, which states the question or problem, and a set of alternatives (candidate answers). The alternatives contain one key, which is the best answer to the question, and several distractors, which are plausible but incorrect answers.

In a k-way MCQA task, a model is given a question q, a set of candidate options O = {O1, ..., Ok}, and a set of supporting contexts C = {C1, ..., Ck}, one for each option. The model must predict the answer option that is best supported by the given contexts.
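The k-way setup above can be sketched in a few lines of Python. This is only an illustration of the task interface, not a method from any of the listed papers: the names `MCQAExample`, `predict`, and `overlap_score` are hypothetical, and the word-overlap scorer is a toy stand-in for whatever model would actually score each (question, option, context) triple.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MCQAExample:
    """One k-way MCQA instance (hypothetical container, for illustration)."""
    question: str        # q
    options: List[str]   # O = {O1, ..., Ok}
    contexts: List[str]  # C = {C1, ..., Ck}, one supporting context per option


def predict(example: MCQAExample,
            score: Callable[[str, str, str], float]) -> int:
    """Return the index of the option with the highest support score."""
    if len(example.options) != len(example.contexts):
        raise ValueError("each option needs exactly one supporting context")
    scores = [score(example.question, option, context)
              for option, context in zip(example.options, example.contexts)]
    return max(range(len(scores)), key=scores.__getitem__)


def overlap_score(question: str, option: str, context: str) -> float:
    """Toy scorer (assumption): share of question+option words found in the context."""
    query = set((question + " " + option).lower().split())
    ctx = set(context.lower().split())
    return len(query & ctx) / max(len(query), 1)


example = MCQAExample(
    question="Which planet is known as the red planet?",
    options=["Venus", "Mars", "Jupiter"],
    contexts=[
        "Venus is the hottest planet in the solar system.",
        "Mars is known as the red planet because of iron oxide.",
        "Jupiter is the largest planet in the solar system.",
    ],
)
predicted = predict(example, overlap_score)
print(example.options[predicted])
```

Keeping the scorer as a parameter separates the task definition (pick the best-supported option out of k) from the model, so a learned scorer could be swapped in without changing `predict`.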

Papers

Showing 1–25 of 65 papers

| Title | Status | Hype |
| --- | --- | --- |
| CP-Router: An Uncertainty-Aware Router Between LLM and LRM | | 0 |
| Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack | | 0 |
| Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information | | 0 |
| Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models | Code | 0 |
| Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework | | 0 |
| Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning | | 0 |
| Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores | Code | 0 |
| Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above | | 0 |
| Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning | Code | 0 |
| First Token Probability Guided RAG for Telecom Question Answering | | 0 |
| MedG-KRP: Medical Graph Knowledge Representation Probing | Code | 0 |
| LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering | | 0 |
| KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting | Code | 0 |
| SandboxAQ's submission to MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval | | 0 |
| Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models | | 0 |
| Differentiating Choices via Commonality for Multiple-Choice Question Answering | Code | 0 |
| Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions | | 0 |
| Long Story Short: Story-level Video Understanding from 20K Short Films | | 0 |
| EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | Code | 0 |
| AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts | Code | 1 |
| From Multiple-Choice to Extractive QA: A Case Study for English and Arabic | Code | 0 |
| Rethinking Generative Large Language Model Evaluation for Semantic Comprehension | | 0 |
| KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations | | 0 |
| Unsupervised multiple choices question answering via universal corpus | | 0 |
| Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? | Code | 0 |

No leaderboard results yet.