Multiple-choice

Papers

Showing 1081–1090 of 1107 papers

Title	Date	Tasks	Status	Hype	Score
Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs	Jun 13, 2025	Medical Question Answering, MedQA	Unverified	0	0
Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh	Feb 19, 2025	Instruction Following, Multiple-choice	Unverified	0	0
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages	Dec 1, 2024	ARC, Multiple-choice	Unverified	0	0
Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering	Aug 6, 2020	Multiple-choice, Question Answering	Unverified	0	0
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation	Jun 8, 2024	Abstractive Text Summarization, Dialogue Generation	Unverified	0	0
Investigating Data Contamination in Modern Benchmarks for Large Language Models	Nov 16, 2023	Common Sense Reasoning, MMLU	Unverified	0	0
Self-Assessment Tests are Unreliable Measures of LLM Personality	Sep 15, 2023	Multiple-choice	Unverified	0	0
Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination	Jun 10, 2023	Math, Mathematical Reasoning	Unverified	0	0
Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting	Oct 18, 2023	Multiple-choice	Unverified	0	0
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts	Jun 18, 2025	document understanding, Multiple-choice	Unverified	0	0

No leaderboard results yet.