Multiple-choice

Papers

Showing 401–450 of 1107 papers

Title	Date	Tasks	Status	Score
Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions	Oct 3, 2023	Misconceptions, Multiple-choice	Code Available	5
KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting	Dec 1, 2024	Multiple-choice, Multiple Choice Question Answering (MCQA)	Code Available	5
LLaVA-OneVision: Easy Visual Task Transfer	Aug 6, 2024	3D Question Answering (3D-QA)	Code Available	5
Leveraging large language models for nano synthesis mechanism explanation: solid foundations or mere conjectures?	Jul 12, 2024	Logical Reasoning, Multiple-choice	Code Available	5
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking	Sep 26, 2024	Distractor Generation, Multiple-choice	Code Available	5
Can Model Uncertainty Function as a Proxy for Multiple-Choice Question Item Difficulty?	Jul 7, 2024	Multiple-choice	Code Available	5
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings	Jan 15, 2024	Knowledge Graph Embeddings, Knowledge Graphs	Code Available	5
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models	Oct 13, 2024	Hallucination, Hallucination Evaluation	Code Available	5
Automating Turkish Educational Quiz Generation Using Large Language Models	Jun 5, 2024	Multiple-choice	Code Available	5
Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs	Oct 15, 2024	Image Description, Multiple-choice	Code Available	5
LEAVS: An LLM-based Labeler for Abdominal CT Supervision	Mar 17, 2025	Anatomy, Large Language Model	Code Available	5
Differentiating Choices via Commonality for Multiple-Choice Question Answering	Aug 21, 2024	Multiple-choice, Multiple Choice Question Answering (MCQA)	Code Available	5
A large language model-assisted education tool to provide feedback on open-ended responses	Jul 25, 2023	Language Modeling, Language Modelling	Code Available	5
Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers	Oct 15, 2024	Multiple-choice	Code Available	5
Length Optimization in Conformal Prediction	Jun 27, 2024	Conformal Prediction, Language Modeling	Code Available	5
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor	Dec 8, 2024	Misconceptions, Multiple-choice	Code Available	5
CASE: Commonsense-Augmented Score with an Expanded Answer Space	Nov 3, 2023	Multiple-choice	Code Available	5
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering	Aug 28, 2018	AI2 Reasoning Challenge, ARC	Code Available	5
Learning to Reuse Distractors to support Multiple Choice Question Generation in Education	Oct 25, 2022	Multiple-choice, Question Generation	Code Available	5
Language Models as Knowledge Bases for Visual Word Sense Disambiguation	Oct 3, 2023	Image Captioning, Multiple-choice	Code Available	5
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available	5
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models	Oct 15, 2023	Multiple-choice, Triplet	Code Available	5
Affordably Fine-tuned LLMs Provide Better Answers to Course-specific MCQs	Jan 10, 2025	Multiple-choice	Code Available	5
Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models	Apr 11, 2024	Multiple-choice, Reading Comprehension	Code Available	5
Joint Learning of Sentence Embeddings for Relevance and Entailment	May 16, 2016	Decision Making, Information Retrieval	Code Available	5
DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence?	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	5
Are Large Language Models Consistent over Value-laden Questions?	Jul 3, 2024	Multiple-choice	Code Available	5
Probabilities of Chat LLMs Are Miscalibrated but Still Predict Correctness on Multiple-Choice Q&A	Feb 20, 2024	Language Modelling, Large Language Model	Code Available	5
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	Code Available	5
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding	Jan 10, 2025	Automatic Speech Recognition, Classification	Code Available	5
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Apr 9, 2025	Multiple-choice	Code Available	5
HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models	Feb 9, 2025	Answer Generation, Language Modeling	Code Available	5
IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models	Jun 18, 2024	Management, Multiple-choice	Code Available	5
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning	Feb 8, 2025	Legal Reasoning, Multiple-choice	Code Available	5
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers	May 25, 2024	Common Sense Reasoning, Multiple-choice	Code Available	5
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation	Jun 13, 2024	Benchmarking, Hallucination	Code Available	5
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions	May 6, 2024	Decision Making, Multiple-choice	Code Available	5
StoryAnalogy: Deriving Story-level Analogies from Large Language Models to Unlock Analogical Understanding	Oct 19, 2023	Multiple-choice, Natural Language Understanding	Code Available	5
Introducing a framework to assess newly created questions with Natural Language Processing	Apr 28, 2020	Multiple-choice	Code Available	5
DE-COP: Detecting Copyrighted Content in Language Models Training Data	Feb 15, 2024	Language Modeling, Language Modelling	Code Available	5
An Automatic Question Usability Evaluation Toolkit	May 30, 2024	Multiple-choice, Word Embeddings	Code Available	5
Introducing Flexible Monotone Multiple Choice Item Response Theory Models and Bit Scales	Oct 2, 2024	Multiple-choice	Code Available	5
Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?	Jul 2, 2024	Graph Mining, Language Modeling	Code Available	5
A Profit-Maximizing Strategy for Advertising on the e-Commerce Platforms	Oct 31, 2022	Management, Multiple-choice	Code Available	5
Fusing Models with Complementary Expertise	Oct 2, 2023	Multiple-choice, text-classification	Code Available	5
TAXI: Evaluating Categorical Knowledge Editing for Language Models	Apr 23, 2024	knowledge editing, Multiple-choice	Code Available	5
Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions	May 30, 2024	Language Modelling, Large Language Model	Code Available	5
Chance-Constrained Multiple-Choice Knapsack Problem: Model, Algorithms, and Applications	Jun 26, 2023	Combinatorial Optimization, Multiple-choice	Code Available	5
Improving Question Answering with External Knowledge	Feb 3, 2019	ARC, Multiple-choice	Code Available	5
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine	Nov 14, 2024	Form, Hallucination	Code Available	5
