Multiple-choice

Papers

Showing 951–1000 of 1107 papers

Title	Date	Tasks	Status
AILS-NTUA at SemEval-2024 Task 9: Cracking Brain Teasers: Transformer Models for Lateral Thinking Puzzles	Apr 1, 2024	Common Sense Reasoning, Multiple-choice	Code Available
DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors	May 29, 2025	MMLU, Multiple-choice	Code Available
EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants	Feb 27, 2025	Multiple-choice	Code Available
MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension	Oct 1, 2019	Logical Reasoning, Machine Reading Comprehension	Code Available
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions	May 6, 2024	Decision Making, Multiple-choice	Code Available
MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models	Dec 10, 2024	Multiple-choice, Question Answering	Code Available
Pragmatic Competence Evaluation of Large Language Models for the Korean Language	Mar 19, 2024	Few-Shot Learning, Multiple-choice	Code Available
Which is the Effective Way for Gaokao: Information Retrieval or Neural Networks?	Apr 1, 2017	Information Retrieval, Multiple-choice	Code Available
Edu-Values: Towards Evaluating the Chinese Education Values of Large Language Models	Sep 19, 2024	Ethics, Multiple-choice	Code Available
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning	Feb 8, 2025	Legal Reasoning, Multiple-choice	Code Available
Precise Task Formalization Matters in Winograd Schema Evaluations	Oct 8, 2020	Language Modeling, Language Modelling	Code Available
Towards a Unified Multimodal Reasoning Framework	Dec 22, 2023	Multimodal Reasoning, Multiple-choice	Code Available
IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models	Jun 18, 2024	Management, Multiple-choice	Code Available
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers	May 25, 2024	Common Sense Reasoning, Multiple-choice	Code Available
Eliciting Informative Text Evaluations with Large Language Models	May 23, 2024	Multiple-choice, Prediction	Code Available
ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions	Apr 4, 2019	Multiple-choice, Reading Comprehension	Code Available
Self-Recognition in Language Models	Jul 9, 2024	Multiple-choice	Code Available
EMBRACE: Evaluation and Modifications for Boosting RACE	May 15, 2023	Machine Reading Comprehension, Multiple-choice	Code Available
Can multiple-choice questions really be useful in detecting the abilities of LLMs?	Mar 26, 2024	Multiple-choice, Question Answering	Code Available
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment	Jul 20, 2024	Contrastive Learning, Multiple-choice	Code Available
Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy	May 24, 2023	In-Context Learning, Multiple-choice	Code Available
Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?	Jul 2, 2024	Graph Mining, Language Modeling	Code Available
Iterative Forward Tuning Boosts In-Context Learning in Language Models	May 22, 2023	Decision Making, In-Context Learning	Code Available
Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at Each Single-Hop?	Oct 1, 2022	Language Modeling, Language Modelling	Code Available
BnMMLU: Measuring Massive Multitask Language Understanding in Bengali	May 25, 2025	General Knowledge, MMLU	Code Available
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	Code Available
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension	Apr 21, 2019	Data Augmentation, Language Modelling	Code Available
Joint Learning of Sentence Embeddings for Relevance and Entailment	May 16, 2016	Decision Making, Information Retrieval	Code Available
Enhancing textual textbook question answering with large language models and retrieval augmented generation	Feb 5, 2024	Multiple-choice, Question Answering	Code Available
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Apr 9, 2025	Multiple-choice	Code Available
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models	Oct 15, 2023	Multiple-choice, Triplet	Code Available
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval	Oct 3, 2023	Articles, Decision Making	Code Available
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare	Feb 22, 2025	Decision Making, Multiple-choice	Code Available
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles	Feb 19, 2024	Multiple-choice, Uncertainty Quantification	Code Available
Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings	Dec 9, 2024	Multiple-choice	Code Available
Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam	Jun 14, 2024	Fairness, Logical Reasoning	Code Available
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models	May 13, 2025	Form, Multiple-choice	Code Available
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach	Sep 9, 2024	Computational Efficiency, Continual Pretraining	Code Available
Evaluating Large Language Model Biases in Persona-Steered Generation	May 30, 2024	Language Modeling, Language Modelling	Code Available
SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM	Mar 12, 2025	Image Segmentation, Medical Image Segmentation	Code Available
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering	Jun 6, 2024	abstractive question answering, Clinical Knowledge	Code Available
Order-Independence Without Fine Tuning	Jun 4, 2024	Language Modelling, Multiple-choice	Code Available
Towards Diverse Perspective Learning with Selection over Multiple Temporal Poolings	Mar 14, 2024	Multiple-choice, Time Series	Code Available
PROST: Physical Reasoning of Objects through Space and Time	Jun 7, 2021	Multiple-choice	Code Available
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence	Apr 3, 2025	Multiple-choice	Code Available
Evaluating Prompts Across Multiple Choice Tasks In a Zero-Shot Setting	Mar 29, 2022	Multiple-choice	Code Available
This Land is Your, My Land: Evaluating Geopolitical Biases in Language Models	May 24, 2023	Language Modelling, Large Language Model	Code Available
Evaluating the Instruction-following Abilities of Language Models using Knowledge Tasks	Oct 16, 2024	Instruction Following, Multiple-choice	Code Available
Multi-class Hierarchical Question Classification for Multiple Choice Science Exams	Aug 15, 2019	Classification, General Classification	Code Available
