Multiple-choice

Papers


Showing 501–550 of 1107 papers

Title	Date	Tasks	Status
Humans and Large Language Models in Clinical Decision Support: A Study with Medical Calculators	Nov 8, 2024	Decision Making, Multiple-choice	Unverified
ACPBench Hard: Unrestrained Reasoning about Action, Change, and Planning	Mar 31, 2025	Multiple-choice	Unverified
DsMCL: Dual-Level Stochastic Multiple Choice Learning for Multi-Modal Trajectory Prediction	Mar 19, 2020	Multiple-choice, Prediction	Unverified
Identification of mental fatigue in language comprehension tasks based on EEG and deep learning	Apr 14, 2021	Classification, EEG	Unverified
Treatment Effects with Multidimensional Unobserved Heterogeneity: Identification of the Marginal Treatment Effect	Sep 23, 2022	Multiple-choice	Unverified
Identifying Multiple Personalities in Large Language Models with External Evaluation	Feb 22, 2024	Multiple-choice	Unverified
Contextual Response Interpretation for Automated Structured Interviews: A Case Study in Market Research	Apr 30, 2023	Marketing, Multiple-choice	Unverified
Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words	Mar 10, 2025	Multiple-choice	Unverified
IIE-NLP-Eyas at SemEval-2021 Task 4: Enhancing PLM for ReCAM with Special Tokens, Re-Ranking, Siamese Encoders and Back Translation	Feb 25, 2021	Multiple-choice, Question Answering	Unverified
IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE	Jul 2, 2020	Multiple-choice, Question Answering	Unverified
DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests	Jan 8, 2025	Multimodal Reasoning, Multiple-choice	Unverified
AGenT Zero: Zero-shot Automatic Multiple-Choice Question Generation for Skill Assessments	Nov 25, 2020	Multiple-choice, Question Generation	Unverified
DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension	Mar 1, 2019	Dialogue Understanding, Multiple-choice	Unverified
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset	Nov 23, 2024	Language Modeling, Language Modelling	Unverified
DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples	Oct 26, 2021	Multiple-choice, Semi-Supervised Image Classification	Unverified
Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts	Jun 8, 2024	Machine Translation, Multiple-choice	Unverified
Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change	Sep 19, 2023	Generative Question Answering, Information Retrieval	Unverified
Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns	Feb 21, 2025	Distractor Generation, Multiple-choice	Unverified
Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models	Jul 23, 2024	Language Modelling, Large Language Model	Unverified
Benchmarking Next-Generation Reasoning-Focused Large Language Models in Ophthalmology: A Head-to-Head Evaluation on 5,888 Items	Apr 15, 2025	Benchmarking, Multiple-choice	Unverified
Do LLMs Act as Repositories of Causal Knowledge?	Dec 14, 2024	Causal Inference, Multiple-choice	Unverified
Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales	Jun 4, 2025	Multiple-choice	Unverified
Do Fine-tuned Commonsense Language Models Really Generalize?	Nov 18, 2020	Multiple-choice, Question Answering	Unverified
An MRC Framework for Semantic Role Labeling	Jan 16, 2022	Computational Efficiency, Machine Reading Comprehension	Unverified
Linguistic Legal Concept Extraction in Portuguese	Oct 22, 2018	Ethics, Multiple-choice	Unverified
LMVE at SemEval-2020 Task 4: Commonsense Validation and Explanation using Pretraining Language Model	Jul 6, 2020	Common Sense Reasoning, Language Modeling	Unverified
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla	Jul 18, 2023	Multiple-choice, Question Answering	Unverified
Benchmarking Bias in Large Language Models during Role-Playing	Nov 1, 2024	Benchmarking, Fairness	Unverified
Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning	Oct 1, 2022	Data Augmentation, Machine Reading Comprehension	Unverified
DMind Benchmark: Toward a Holistic Assessment of LLM Capabilities across the Web3 Domain	Apr 18, 2025	Multiple-choice	Unverified
A Corpus of Text Data and Gaze Fixations from Autistic and Non-Autistic Adults	May 1, 2016	Multiple-choice, POS	Unverified
Large Language Models Still Exhibit Bias in Long Text	Oct 23, 2024	Fairness, Multiple-choice	Unverified
DiverseNet: When One Right Answer is not Enough	Aug 24, 2020	Multiple-choice, Structured Prediction	Unverified
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets	Apr 24, 2017	Multiple-choice, Question Answering	Unverified
Learning a Word-Level Language Model with Sentence-Level Noise Contrastive Estimation for Contextual Sentence Probability Estimation	Mar 14, 2021	Language Modeling, Language Modelling	Unverified
Distributional semantics beyond words: Supervised learning of analogy and paraphrase	Oct 18, 2013	Multiple-choice, Task 2	Unverified
Distractor Generation in Multiple-Choice Tasks: A Survey of Methods, Datasets, and Evaluation	Feb 2, 2024	Distractor Generation, Multiple-choice	Unverified
Bayesian Statistical Modeling with Predictors from LLMs	Jun 13, 2024	Multiple-choice	Unverified
A Weak Supervision Approach for Predicting Difficulty of Technical Interview Questions	Oct 1, 2022	Multiple-choice, Prediction	Unverified
Large Language Models (GPT) Struggle to Answer Multiple-Choice Questions about Code	Mar 9, 2023	Multiple-choice	Unverified
Large Language Models Often Know When They Are Being Evaluated	May 28, 2025	MMLU, Multiple-choice	Unverified
Distractor Analysis and Selection for Multiple-Choice Cloze Questions for Second-Language Learners	Jul 1, 2020	Multiple-choice	Unverified
DISTO: Evaluating Textual Distractors for Multi-Choice Questions using Negative Sampling based Approach	Apr 10, 2023	Distractor Generation, Machine Translation	Unverified
Auxiliary Class Based Multiple Choice Learning	Aug 6, 2021	Diversity, Ensemble Learning	Unverified
Disaggregating Hops: Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at each Hop?	Jan 16, 2022	Language Modeling, Language Modelling	Unverified
An Improved Traditional Chinese Evaluation Suite for Foundation Model	Mar 4, 2024	Multiple-choice, Question Answering	Unverified
A Foundational Multimodal Vision Language AI Assistant for Human Pathology	Dec 13, 2023	Decision Making, Diagnostic	Unverified
Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions	Aug 22, 2023	Multiple-choice, Sensitivity	Unverified
Learning Language-Visual Embedding for Movie Understanding with Natural-Language	Sep 26, 2016	Multiple-choice, Retrieval	Unverified
Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities	Feb 20, 2024	Multiple-choice, Text Simplification	Unverified


