Multiple-choice

Papers

Showing 651–675 of 1107 papers

Title	Date	Tasks	Status
Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?	Jun 6, 2024	Multiple-choice, Question Answering	Unverified
Analyzing the Performance of ChatGPT in Cardiology and Vascular Pathologies	Apr 15, 2023	Language Modeling, Language Modelling	Unverified
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information	May 9, 2025	Benchmarking, Form	Unverified
HFL-RC System at SemEval-2018 Task 11: Hybrid Multi-Aspects Model for Commonsense Reading Comprehension	Mar 15, 2018	Multiple-choice, Reading Comprehension	Unverified
Hierarchical Divide-and-Conquer for Fine-Grained Alignment in LLM-Based Medical Evaluation	Jan 12, 2025	Attribute, Multiple-choice	Unverified
HindiLLM: Large Language Model for Hindi	Dec 29, 2024	Language Modeling, Language Modelling	Unverified
Analyzing Multiple-Choice Reading and Listening Comprehension Tests	Jul 3, 2023	Multiple-choice, Reading Comprehension	Unverified
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites	Apr 25, 2024	4k, Language Modeling	Unverified
How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering?	Jun 19, 2025	Multiple-choice, Question Answering	Unverified
How Many Workers to Ask? Adaptive Exploration for Collecting High Quality Labels	Nov 1, 2014	Multiple-choice	Unverified
How Susceptible are LLMs to Influence in Prompts?	Aug 17, 2024	Multiple-choice, Question Answering	Unverified
How well do LLMs reason over tabular data, really?	May 12, 2025	Missing Values, Multiple-choice	Unverified
HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method	Jun 1, 2022	Machine Reading Comprehension, Multiple-choice	Unverified
Humanity's Last Exam	Jan 24, 2025	Humanity's Last Exam, Language Modeling	Unverified
Humans and Large Language Models in Clinical Decision Support: A Study with Medical Calculators	Nov 8, 2024	Decision Making, Multiple-choice	Unverified
Hypothesis Testing for Quantifying LLM-Human Misalignment in Multiple Choice Settings	Jun 17, 2025	Decision Making, Language Modeling	Unverified
Identification of mental fatigue in language comprehension tasks based on EEG and deep learning	Apr 14, 2021	Classification, EEG	Unverified
Treatment Effects with Multidimensional Unobserved Heterogeneity: Identification of the Marginal Treatment Effect	Sep 23, 2022	Multiple-choice	Unverified
Identifying Multiple Personalities in Large Language Models with External Evaluation	Feb 22, 2024	Multiple-choice	Unverified
Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words	Mar 10, 2025	Multiple-choice	Unverified
IIE-NLP-Eyas at SemEval-2021 Task 4: Enhancing PLM for ReCAM with Special Tokens, Re-Ranking, Siamese Encoders and Back Translation	Feb 25, 2021	Multiple-choice, Question Answering	Unverified
IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE	Jul 2, 2020	Multiple-choice, Question Answering	Unverified
IllusionBench: A Large-scale and Comprehensive Benchmark for Visual Illusion Understanding in Vision-Language Models	Jan 1, 2025	Hallucination, Multiple-choice	Unverified
Image Aesthetic Reasoning: A New Benchmark for Medical Image Screening with MLLMs	May 29, 2025	Image Generation, Multiple-choice	Unverified
Imagery as Inquiry: Exploring A Multimodal Dataset for Conversational Recommendation	May 23, 2024	Conversational Recommendation, Multiple-choice	Unverified

