Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 676–700 of 1107 papers

Title	Date	Tasks	Status
Improved Few-Shot Image Classification Through Multiple-Choice Questions	Jul 23, 2024	ArticlesFew-Shot Image Classification	—Unverified
Improvement/Extension of Modular Systems as Combinatorial Reengineering (Survey)	Apr 17, 2013	Combinatorial OptimizationMultiple-choice	—Unverified
Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank	Apr 19, 2024	Distractor GenerationMath	—Unverified
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack	May 21, 2025	Multiple-choiceMultiple Choice Question Answering (MCQA)	—Unverified
Analysing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets	Sep 29, 2021	Language ModellingMachine Reading Comprehension	—Unverified
Improving the Production Efficiency and Well-formedness of Automatically-Generated Multiple-Choice Cloze Vocabulary Questions	May 1, 2020	Multiple-choice	—Unverified
In Case You Missed It: ARC 'Challenge' Is Not That Challenging	Dec 23, 2024	ARCMultiple-choice	—Unverified
TVBench: Redesigning Video-Language Evaluation	Oct 10, 2024	Multiple-choiceOpen-Ended Question Answering	—Unverified
Indirect Identification of Psychosocial Risks from Natural Language	Apr 30, 2020	Multiple-choiceTopic Models	—Unverified
Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection	Jan 28, 2025	Multiple-choice	—Unverified
Two-Turn Debate Doesn't Help Humans Answer Hard Reading Comprehension Questions	Oct 19, 2022	Language ModelingLanguage Modelling	—Unverified
InnerThoughts: Disentangling Representations and Predictions in Large Language Models	Jan 29, 2025	Multiple-choicePosition	—Unverified
InstructionBench: An Instructional Video Understanding Benchmark	Apr 7, 2025	Common Sense ReasoningMultiple-choice	—Unverified
Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs	Jun 13, 2025	Medical Question AnsweringMedQA	—Unverified
Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh	Feb 19, 2025	Instruction FollowingMultiple-choice	—Unverified
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages	Dec 1, 2024	ARCMultiple-choice	—Unverified
Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering	Aug 6, 2020	Multiple-choiceQuestion Answering	—Unverified
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation	Jun 8, 2024	Abstractive Text SummarizationDialogue Generation	—Unverified
Investigating Data Contamination in Modern Benchmarks for Large Language Models	Nov 16, 2023	Common Sense ReasoningMMLU	—Unverified
Self-Assessment Tests are Unreliable Measures of LLM Personality	Sep 15, 2023	Multiple-choice	—Unverified
Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination	Jun 10, 2023	MathMathematical Reasoning	—Unverified
Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting	Oct 18, 2023	Multiple-choice	—Unverified
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts	Jun 18, 2025	document understandingMultiple-choice	—Unverified
ISAAQ -- Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention	Oct 1, 2020	Multiple-choiceQuestion Answering	—Unverified
ISAAQ - Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention	Nov 1, 2020	Multiple-choiceQuestion Answering	—Unverified

Show:10 25 50

← PrevPage 28 of 45Next →

No leaderboard results yet.