Multiple-choice

Papers

Showing 451–500 of 1107 papers

Title	Date	Tasks	Status	Score
DE-COP: Detecting Copyrighted Content in Language Models Training Data	Feb 15, 2024	Language Modeling, Language Modelling	Code Available	5
An Automatic Question Usability Evaluation Toolkit	May 30, 2024	Multiple-choice, Word Embeddings	Code Available	5
Language Models as Knowledge Bases for Visual Word Sense Disambiguation	Oct 3, 2023	Image Captioning, Multiple-choice	Code Available	5
A Profit-Maximizing Strategy for Advertising on the e-Commerce Platforms	Oct 31, 2022	Management, Multiple-choice	Code Available	5
Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions	May 30, 2024	Language Modelling, Large Language Model	Code Available	5
Truth Knows No Language: Evaluating Truthfulness Beyond English	Feb 13, 2025	Informativeness, Machine Translation	Code Available	5
Joint Learning of Sentence Embeddings for Relevance and Entailment	May 16, 2016	Decision Making, Information Retrieval	Code Available	5
Chance-Constrained Multiple-Choice Knapsack Problem: Model, Algorithms, and Applications	Jun 26, 2023	Combinatorial Optimization, Multiple-choice	Code Available	5
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation	Apr 9, 2025	Multiple-choice	Code Available	5
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	Code Available	5
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine	Nov 14, 2024	Form, Hallucination	Code Available	5
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models	Oct 15, 2023	Multiple-choice, Triplet	Code Available	5
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering	Aug 28, 2018	AI2 Reasoning Challenge, ARC	Code Available	5
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers	May 25, 2024	Common Sense Reasoning, Multiple-choice	Code Available	5
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning	Feb 8, 2025	Legal Reasoning, Multiple-choice	Code Available	5
CSEPrompts: A Benchmark of Introductory Computer Science Prompts	Apr 3, 2024	Multiple-choice	Code Available	5
IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models	Jun 18, 2024	Management, Multiple-choice	Code Available	5
Utilizing Background Knowledge for Robust Reasoning over Traffic Situations	Dec 4, 2022	Knowledge Graphs, Multiple-choice	Code Available	5
Is Your Large Language Model Knowledgeable or a Choices-Only Cheater?	Jul 2, 2024	Graph Mining, Language Modeling	Code Available	5
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval	Oct 3, 2023	Articles, Decision Making	Code Available	5
Introducing a framework to assess newly created questions with Natural Language Processing	Apr 28, 2020	Multiple-choice	Code Available	5
QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling	Sep 21, 2024	Multiple-choice, Prompt Engineering	Code Available	5
Introducing Flexible Monotone Multiple Choice Item Response Theory Models and Bit Scales	Oct 2, 2024	Multiple-choice	Code Available	5
Iterative Forward Tuning Boosts In-Context Learning in Language Models	May 22, 2023	Decision Making, In-Context Learning	Code Available	5
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models	Jun 7, 2024	Multiple-choice, Philosophy	Code Available	5
Improving Question Answering with External Knowledge	Feb 3, 2019	ARC, Multiple-choice	Code Available	5
Video Prediction via Selective Sampling	Dec 1, 2018	Multiple-choice, Prediction	Code Available	5
VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models	Mar 10, 2025	Image Description, Multiple-choice	Code Available	5
Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy	May 24, 2023	In-Context Learning, Multiple-choice	Code Available	5
INCEPTNET: Precise And Early Disease Detection Application For Medical Images Analyses	Sep 5, 2023	Cell Detection, Lesion Segmentation	Code Available	5
A multimodal dataset for understanding the impact of mobile phones on remote online virtual education	Dec 13, 2024	EEG, Head Pose Estimation	Code Available	5
IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark for LLMs	Nov 12, 2024	coreference-resolution, Coreference Resolution	Code Available	5
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?	Jun 1, 2021	Multiple-choice, Natural Language Understanding	Code Available	5
What Makes Reading Comprehension Questions Easier?	Aug 28, 2018	Machine Reading Comprehension, Multiple-choice	Code Available	5
Improving Machine Reading Comprehension with General Reading Strategies	Oct 31, 2018	ARC, Language Modeling	Code Available	5
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor	Dec 8, 2024	Misconceptions, Multiple-choice	Code Available	5
Controlling Cloze-test Question Item Difficulty with PLM-based Surrogate Models for IRT Assessment	Mar 3, 2024	Cloze Test, Multiple-choice	Unverified	0
Contextual Response Interpretation for Automated Structured Interviews: A Case Study in Market Research	Apr 30, 2023	Marketing, Multiple-choice	Unverified	0
Context Modeling with Evidence Filter for Multiple Choice Question Answering	Oct 6, 2020	Machine Reading Comprehension, Multiple-choice	Unverified	0
Context-guided Triple Matching for Multiple Choice Question Answering	Jan 16, 2022	Benchmarking, Multiple-choice	Unverified	0
AstroMLab 1: Who Wins Astronomy Jeopardy!?	Jul 15, 2024	Astronomy, Benchmarking	Unverified	0
Analysing the Effect of Masking Length Distribution of MLM: An Evaluation Framework and Case Study on Chinese MRC Datasets	Sep 29, 2021	Language Modelling, Machine Reading Comprehension	Unverified	0
HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method	Jun 1, 2022	Machine Reading Comprehension, Multiple-choice	Unverified	0
Context-guided Triple Matching for Multiple Choice Question Answering	Sep 27, 2021	Benchmarking, Multiple-choice	Unverified	0
How well do LLMs reason over tabular data, really?	May 12, 2025	Missing Values, Multiple-choice	Unverified	0
How Susceptible are LLMs to Influence in Prompts?	Aug 17, 2024	Multiple-choice, Question Answering	Unverified	0
How Many Workers to Ask? Adaptive Exploration for Collecting High Quality Labels	Nov 1, 2014	Multiple-choice	Unverified	0
A statistical model for aggregating judgments by incorporating peer predictions	Mar 14, 2017	counterfactual, Multiple-choice	Unverified	0
Advanced Financial Reasoning at Scale: A Comprehensive Evaluation of Large Language Models on CFA Level III	Jun 29, 2025	Model Selection, Multiple-choice	Unverified	0
How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering?	Jun 19, 2025	Multiple-choice, Question Answering	Unverified	0
