Multiple-choice

Papers

Showing 976–1000 of 1107 papers

Title	Date	Tasks	Status
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	Code Available
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension	Apr 21, 2019	Data Augmentation, Language Modelling	Code Available
Joint Learning of Sentence Embeddings for Relevance and Entailment	May 16, 2016	Decision Making, Information Retrieval	Code Available
Enhancing textual textbook question answering with large language models and retrieval augmented generation	Feb 5, 2024	Multiple-choice, Question Answering	Code Available
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Apr 9, 2025	Multiple-choice	Code Available
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models	Oct 15, 2023	Multiple-choice, Triplet	Code Available
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval	Oct 3, 2023	Articles, Decision Making	Code Available
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare	Feb 22, 2025	Decision Making, Multiple-choice	Code Available
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles	Feb 19, 2024	Multiple-choice, Uncertainty Quantification	Code Available
Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings	Dec 9, 2024	Multiple-choice	Code Available
Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam	Jun 14, 2024	Fairness, Logical Reasoning	Code Available
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models	May 13, 2025	Form, Multiple-choice	Code Available
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach	Sep 9, 2024	Computational Efficiency, Continual Pretraining	Code Available
Evaluating Large Language Model Biases in Persona-Steered Generation	May 30, 2024	Language Modeling, Language Modelling	Code Available
SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM	Mar 12, 2025	Image Segmentation, Medical Image Segmentation	Code Available
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering	Jun 6, 2024	abstractive question answering, Clinical Knowledge	Code Available
Order-Independence Without Fine Tuning	Jun 4, 2024	Language Modelling, Multiple-choice	Code Available
Towards Diverse Perspective Learning with Selection over Multiple Temporal Poolings	Mar 14, 2024	Multiple-choice, Time Series	Code Available
PROST: Physical Reasoning of Objects through Space and Time	Jun 7, 2021	Multiple-choice	Code Available
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence	Apr 3, 2025	Multiple-choice	Code Available
Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting	Mar 29, 2022	Multiple-choice	Code Available
This Land is {Your, My} Land: Evaluating Geopolitical Biases in Language Models	May 24, 2023	Language Modelling, Large Language Model	Code Available
Evaluating the Instruction-following Abilities of Language Models using Knowledge Tasks	Oct 16, 2024	Instruction Following, Multiple-choice	Code Available
Multi-class Hierarchical Question Classification for Multiple Choice Science Exams	Aug 15, 2019	Classification, General Classification	Code Available
