Multiple-choice

Papers

Showing 351–400 of 1107 papers

Title	Date	Tasks	Status	Score
MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models	Dec 10, 2024	Multiple-choice, Question Answering	Code Available	5
MedG-KRP: Medical Graph Knowledge Representation Probing	Dec 14, 2024	Multiple-choice, Multiple Choice Question Answering (MCQA)	Code Available	5
MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks	May 6, 2025	Benchmarking, Multiple-choice	Code Available	5
EMBRACE: Evaluation and Modifications for Boosting RACE	May 15, 2023	Machine Reading Comprehension, Multiple-choice	Code Available	5
MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback	Oct 17, 2024	Fact Verification, Hallucination	Code Available	5
Measuring Agreeableness Bias in Multimodal Models	Aug 17, 2024	Decision Making, Multiple-choice	Code Available	5
ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions	Apr 4, 2019	Multiple-choice, Reading Comprehension	Code Available	5
Eliciting Informative Text Evaluations with Large Language Models	May 23, 2024	Multiple-choice, Prediction	Code Available	5
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models	Dec 31, 2024	Multiple-choice, Question Answering	Code Available	5
Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models	Sep 19, 2024	Ethics, Multiple-choice	Code Available	5
A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation using GPT	Jan 13, 2024	Distractor Generation, Multiple-choice	Code Available	5
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models	Oct 13, 2024	Hallucination, Hallucination Evaluation	Code Available	5
Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think	Apr 12, 2024	Multiple-choice	Code Available	5
Beyond English-Only Reading Comprehension: Experiments in Zero-Shot Multilingual Transfer for Bulgarian	Aug 5, 2019	Multiple-choice, Philosophy	Code Available	5
EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants	Feb 27, 2025	Multiple-choice	Code Available	5
DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors	May 29, 2025	MMLU, Multiple-choice	Code Available	5
LLaVA-OneVision: Easy Visual Task Transfer	Aug 6, 2024	3D Question Answering (3D-QA)	Code Available	5
BERT-based distractor generation for Swedish reading comprehension questions using a small-scale dataset	Aug 9, 2021	Distractor Generation, Multiple-choice	Code Available	5
Limited Ability of LLMs to Simulate Human Psychological Behaviours: a Psychometric Analysis	May 12, 2024	Multiple-choice, Question Answering	Code Available	5
DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension	Feb 1, 2019	Dialogue Understanding, Multiple-choice	Code Available	5
BertaQA: How Much Do Language Models Know About Local Culture?	Jun 11, 2024	Multiple-choice, Transfer Learning	Code Available	5
LiveQA: A Question Answering Dataset over Sports Live	Oct 1, 2020	Multiple-choice, Question Answering	Code Available	5
LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs	Jun 7, 2024	Mathematical Reasoning, Multiple-choice	Code Available	5
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings	Jan 15, 2024	Knowledge Graph Embeddings, Knowledge Graphs	Code Available	5
HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models	Feb 9, 2025	Answer Generation, Language Modeling	Code Available	5
Leveraging large language models for nano synthesis mechanism explanation: solid foundations or mere conjectures?	Jul 12, 2024	Logical Reasoning, Multiple-choice	Code Available	5
LEAVS: An LLM-based Labeler for Abdominal CT Supervision	Mar 17, 2025	Anatomy, Large Language Model	Code Available	5
Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting	Mar 29, 2022	Multiple-choice	Code Available	5
Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers	Oct 15, 2024	Multiple-choice	Code Available	5
Length Optimization in Conformal Prediction	Jun 27, 2024	Conformal Prediction, Language Modeling	Code Available	5
Neural Natural Logic Inference for Interpretable Question Answering	Nov 1, 2021	Multiple-choice, Natural Language Inference	Code Available	5
Does Multiple Choice Have a Future in the Age of Generative AI? A Posttest-only RCT	Dec 13, 2024	Multiple-choice	Code Available	5
DMCL: Distillation Multiple Choice Learning for Multimodal Action Recognition	Dec 23, 2019	Action Recognition, Multiple-choice	Code Available	5
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models	Oct 2, 2024	Multiple-choice, parameter-efficient fine-tuning	Code Available	5
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions	Jun 27, 2024	Distractor Generation, Math	Code Available	5
An Information-Theoretic Approach to Analyze NLP Classification Tasks	Feb 1, 2024	Multiple-choice, Reading Comprehension	Code Available	5
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering	Aug 28, 2018	AI2 Reasoning Challenge, ARC	Code Available	5
Every Answer Matters: Evaluating Commonsense with Probabilistic Measures	Jun 6, 2024	Common Sense Reasoning, Language Modeling	Code Available	5
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models	Oct 15, 2023	Multiple-choice, Triplet	Code Available	5
Sentence Embeddings for Russian NLU	Oct 29, 2019	Multiple-choice, Paraphrase Identification	Code Available	5
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Apr 9, 2025	Multiple-choice	Code Available	5
Distractor generation for multiple-choice questions with predictive prompting and large language models	Jul 30, 2023	Distractor Generation, Multiple-choice	Code Available	5
Distractor Generation for Multiple Choice Questions Using Learning to Rank	Jun 1, 2018	BIG-bench Machine Learning, Distractor Generation	Code Available	5
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available	5
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	Code Available	5
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking	Sep 26, 2024	Distractor Generation, Multiple-choice	Code Available	5
Iterative Forward Tuning Boosts In-Context Learning in Language Models	May 22, 2023	Decision Making, In-Context Learning	Code Available	5
Joint Learning of Sentence Embeddings for Relevance and Entailment	May 16, 2016	Decision Making, Information Retrieval	Code Available	5
Language Models as Knowledge Bases for Visual Word Sense Disambiguation	Oct 3, 2023	Image Captioning, Multiple-choice	Code Available	5
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor	Dec 8, 2024	Misconceptions, Multiple-choice	Code Available	5

