Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 251–275 of 1107 papers

Title	Date	Tasks	Status	Hype
Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models	Nov 10, 2023	GSM8KMemorization	CodeCode Available	1
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding	Aug 17, 2023	DiagnosticEgoSchema	CodeCode Available	1
Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation	Sep 19, 2023	Language Model EvaluationLanguage Modeling	CodeCode Available	1
Explicit Planning Helps Language Models in Logical Reasoning	Mar 28, 2023	Logical ReasoningMultiple-choice	CodeCode Available	1
InfiniBench: A Comprehensive Benchmark for Large Multimodal Models in Very Long Video Understanding	Jun 28, 2024	Multiple-choiceVideo Understanding	CodeCode Available	1
ORAN-Bench-13K: An Open Source Benchmark for Assessing LLMs in Open Radio Access Networks	Jul 8, 2024	Anomaly DetectionCode Generation	CodeCode Available	1
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models	Nov 27, 2024	BenchmarkingEarth Observation	CodeCode Available	1
Controlling Cloze-test Question Item Difficulty with PLM-based Surrogate Models for IRT Assessment	Mar 3, 2024	Cloze TestMultiple-choice	—Unverified	0
Contextual Response Interpretation for Automated Structured Interviews: A Case Study in Market Research	Apr 30, 2023	MarketingMultiple-choice	—Unverified	0
Analysing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets	Sep 29, 2021	Language ModellingMachine Reading Comprehension	—Unverified	0
Context Modeling with Evidence Filter for Multiple Choice Question Answering	Oct 6, 2020	Machine Reading ComprehensionMultiple-choice	—Unverified	0
Context-guided Triple Matching for Multiple Choice Question Answering	Jan 16, 2022	BenchmarkingMultiple-choice	—Unverified	0
AstroMLab 1: Who Wins Astronomy Jeopardy!?	Jul 15, 2024	AstronomyBenchmarking	—Unverified	0
E-Commerce Promotions Personalization via Online Multiple-Choice Knapsack with Uplift Modeling	Aug 11, 2021	Multiple-choice	—Unverified	0
Context-guided Triple Matching for Multiple Choice Question Answering	Sep 27, 2021	BenchmarkingMultiple-choice	—Unverified	0
A statistical model for aggregating judgments by incorporating peer predictions	Mar 14, 2017	counterfactualMultiple-choice	—Unverified	0
Advanced Financial Reasoning at Scale: A Comprehensive Evaluation of Large Language Models on CFA Level III	Jun 29, 2025	Model SelectionMultiple-choice	—Unverified	0
Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models	Oct 18, 2024	FairnessMultiple-choice	—Unverified	0
Edinburgh Clinical NLP at MEDIQA-CORR 2024: Guiding Large Language Models with Hints	May 28, 2024	Multiple-choiceSentence	—Unverified	0
Confidence-Aware Learning Assistant	Feb 15, 2021	Multiple-choice	—Unverified	0
Comparative Study of Learning Outcomes for Online Learning Platforms	Apr 15, 2021	Active LearningMultiple-choice	—Unverified	0
Assessing Large Language Models in Mechanical Engineering Education: A Study on Mechanics-Focused Conceptual Understanding	Jan 13, 2024	Multiple-choicePrompt Engineering	—Unverified	0
An Algorithm for Generating Gap-Fill Multiple Choice Questions of an Expert System	Sep 17, 2021	Multiple-choicesoftware testing	—Unverified	0
Combining Multiple Cues for Visual Madlibs Question Answering	Nov 1, 2016	AttributeGeneral Classification	—Unverified	0
Combinatorial framework for planning in geological exploration	Jan 22, 2018	AttributeMultiple-choice	—Unverified	0

Show:10 25 50

← PrevPage 11 of 45Next →

No leaderboard results yet.