Multiple-choice

Papers

Showing 1051–1075 of 1107 papers

Title	Date	Tasks	Status
How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering?	Jun 19, 2025	Multiple-choice, Question Answering	Unverified
How Many Workers to Ask? Adaptive Exploration for Collecting High Quality Labels	Nov 1, 2014	Multiple-choice	Unverified
How Susceptible are LLMs to Influence in Prompts?	Aug 17, 2024	Multiple-choice, Question Answering	Unverified
How well do LLMs reason over tabular data, really?	May 12, 2025	Missing Values, Multiple-choice	Unverified
HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method	Jun 1, 2022	Machine Reading Comprehension, Multiple-choice	Unverified
Humanity's Last Exam	Jan 24, 2025	Humanity's Last Exam, Language Modeling	Unverified
Humans and Large Language Models in Clinical Decision Support: A Study with Medical Calculators	Nov 8, 2024	Decision Making, Multiple-choice	Unverified
Hypothesis Testing for Quantifying LLM-Human Misalignment in Multiple Choice Settings	Jun 17, 2025	Decision Making, Language Modeling	Unverified
Identification of mental fatigue in language comprehension tasks based on EEG and deep learning	Apr 14, 2021	Classification, EEG	Unverified
Treatment Effects with Multidimensional Unobserved Heterogeneity: Identification of the Marginal Treatment Effect	Sep 23, 2022	Multiple-choice	Unverified
Identifying Multiple Personalities in Large Language Models with External Evaluation	Feb 22, 2024	Multiple-choice	Unverified
Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words	Mar 10, 2025	Multiple-choice	Unverified
IIE-NLP-Eyas at SemEval-2021 Task 4: Enhancing PLM for ReCAM with Special Tokens, Re-Ranking, Siamese Encoders and Back Translation	Feb 25, 2021	Multiple-choice, Question Answering	Unverified
IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE	Jul 2, 2020	Multiple-choice, Question Answering	Unverified
IllusionBench: A Large-scale and Comprehensive Benchmark for Visual Illusion Understanding in Vision-Language Models	Jan 1, 2025	Hallucination, Multiple-choice	Unverified
Image Aesthetic Reasoning: A New Benchmark for Medical Image Screening with MLLMs	May 29, 2025	Image Generation, Multiple-choice	Unverified
Imagery as Inquiry: Exploring A Multimodal Dataset for Conversational Recommendation	May 23, 2024	Conversational Recommendation, Multiple-choice	Unverified
Improved Few-Shot Image Classification Through Multiple-Choice Questions	Jul 23, 2024	Articles, Few-Shot Image Classification	Unverified
Improvement/Extension of Modular Systems as Combinatorial Reengineering (Survey)	Apr 17, 2013	Combinatorial Optimization, Multiple-choice	Unverified
Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank	Apr 19, 2024	Distractor Generation, Math	Unverified
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack	May 21, 2025	Multiple-choice, Multiple Choice Question Answering (MCQA)	Unverified
Analysing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets	Sep 29, 2021	Language Modelling, Machine Reading Comprehension	Unverified
Improving the Production Efficiency and Well-formedness of Automatically-Generated Multiple-Choice Cloze Vocabulary Questions	May 1, 2020	Multiple-choice	Unverified
In Case You Missed It: ARC 'Challenge' Is Not That Challenging	Dec 23, 2024	ARC, Multiple-choice	Unverified
TVBench: Redesigning Video-Language Evaluation	Oct 10, 2024	Multiple-choice, Open-Ended Question Answering	Unverified


No leaderboard results yet.