Multiple-choice

Papers

Showing 701–750 of 1107 papers

Title	Date	Tasks	Status
SecBench: A Comprehensive Multi-Dimensional Benchmarking Dataset for LLMs in Cybersecurity	Dec 30, 2024	Benchmarking, Code Generation	Unverified
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models	Feb 25, 2025	Continual Learning, GSM8K	Unverified
Advanced Financial Reasoning at Scale: A Comprehensive Evaluation of Large Language Models on CFA Level III	Jun 29, 2025	Model Selection, Multiple-choice	Unverified
Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models	Oct 18, 2024	Fairness, Multiple-choice	Unverified
From Human Days to Machine Seconds: Automatically Answering and Generating Machine Learning Final Exams	Jun 11, 2022	BIG-bench Machine Learning, Few-Shot Learning	Unverified
A Data-Driven Study of Commonsense Knowledge using the ConceptNet Knowledge Base	Nov 28, 2020	Clustering, Graph Representation Learning	Unverified
Seeing the Forest and the Trees: Solving Visual Graph and Tree Based Data Structure Problems using Large Multimodal Models	Dec 15, 2024	Multiple-choice	Unverified
Selective Particle Attention: Visual Feature-Based Attention in Deep Reinforcement Learning	Aug 26, 2020	Deep Reinforcement Learning, Multiple-choice	Unverified
Self-Evaluation Improves Selective Generation in Large Language Models	Dec 14, 2023	Multiple-choice, TruthfulQA	Unverified
Adaptive Wizard for Removing Cross-Tier Misconfigurations in Active Directory	May 2, 2025	Multiple-choice	Unverified
Self-supervised pre-training and contrastive representation learning for multiple-choice video QA	Sep 17, 2020	Auxiliary Learning, Contrastive Learning	Unverified
Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question-Answering Data	Feb 1, 2021	Machine Reading Comprehension, Multiple-choice	Unverified
Semi-automatic Generation of Multiple-Choice Tests from Mentions of Semantic Relations	Jul 1, 2015	Multiple-choice, Reading Comprehension	Unverified
Separation of Powers: On Segregating Knowledge from Observation in LLM-enabled Knowledge-based Visual Question Answering	Jan 1, 2025	Multiple-choice, Question Answering	Unverified
Set-LLM: A Permutation-Invariant LLM	May 21, 2025	Multiple-choice, Question Answering	Unverified
Setting Standards in Turkish NLP: TR-MMLU for Large Language Model Evaluation	Dec 31, 2024	Language Model Evaluation, Language Modeling	Unverified
Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions	Apr 11, 2022	Multiple-choice, Reading Comprehension	Unverified
Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations	Mar 10, 2025	Form, Multiple-choice	Unverified
Social IQa: Commonsense Reasoning about Social Interactions	Nov 1, 2019	Multiple-choice, Question Answering	Unverified
Solving Visual Madlibs with Multiple Cues	Aug 11, 2016	Activity Prediction, Attribute	Unverified
SOSBENCH: Benchmarking Safety Alignment on Scientific Knowledge	May 27, 2025	Benchmarking, Multiple-choice	Unverified
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers	Nov 28, 2024	Image Captioning, image-classification	Unverified
Spending Money Wisely: Online Electronic Coupon Allocation based on Real-Time User Intent Detection	Aug 23, 2020	Intent Detection, Multiple-choice	Unverified
VUDG: A Dataset for Video Understanding Domain Generalization	May 30, 2025	Domain Generalization, Multiple-choice	Unverified
SPRITE: A Response Model For Multiple Choice Testing	Jan 12, 2015	model, Multiple-choice	Unverified
Weighted Global Normalization for Multiple Choice Reading Comprehension over Long Documents	Dec 5, 2018	Answer Selection, Multiple-choice	Unverified
Recent Advances in Multi-Choice Machine Reading Comprehension: A Survey on Methods and Datasets	Aug 4, 2024	Few-Shot Learning, Machine Reading Comprehension	Unverified
Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework	Mar 7, 2025	Conformal Prediction, Medical Question Answering	Unverified
Statistically Profiling Biases in Natural Language Reasoning Datasets and Models	Feb 9, 2021	Multiple-choice, Natural Language Understanding	Unverified
Adaptive Crowdsourcing Algorithms for the Bandit Survey Problem	Feb 13, 2013	Information Retrieval, Multiple-choice	Unverified
Stick to your Role! Stability of Personal Values Expressed in Large Language Models	Feb 19, 2024	Multiple-choice	Unverified
Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles	Jun 24, 2016	Multiple-choice	Unverified
Adapting Vision-Language Models for Evaluating World Models	Jun 22, 2025	Action Recognition, Multimodal Reasoning	Unverified
Strategyproof Mean Estimation from Multiple-Choice Questions	Jan 1, 2020	Multiple-choice	Unverified
Structured Outputs Enable General-Purpose LLMs to be Medical Experts	Mar 5, 2025	Clinical Knowledge, Medical Question Answering	Unverified
What does BERT Learn from Multiple-Choice Reading Comprehension Datasets?	Oct 28, 2019	Multiple-choice, Reading Comprehension	Unverified
Superhuman performance of a large language model on the reasoning tasks of a physician	Dec 14, 2024	Diagnostic, Language Modeling	Unverified
What do we expect from Multiple-choice QA Systems?	Nov 20, 2020	Multiple-choice, Multiple Choice Question Answering (MCQA)	Unverified
What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets	Jul 7, 2020	Multiple-choice, Question Answering	Unverified
Susu Box or Piggy Bank: Assessing Cultural Commonsense Knowledge between Ghana and the U.S	Oct 21, 2024	Multiple-choice	Unverified
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference	Aug 16, 2018	Common Sense Reasoning, Multiple-choice	Unverified
SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages	Jun 20, 2024	Language Modelling, Large Language Model	Unverified
TabMCQ: A Dataset of General Knowledge Tables and Multiple-choice Questions	Feb 12, 2016	General Knowledge, Multiple-choice	Unverified
TA-MAMC at SemEval-2021 Task 4: Task-adaptive Pretraining and Multi-head Attention for Abstract Meaning Reading Comprehension	Aug 1, 2021	Contrastive Learning, Multiple-choice	Unverified
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling	Sep 30, 2024	Language Modeling, Language Modelling	Unverified
TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine	May 29, 2025	Diagnostic, Multiple-choice	Unverified
Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students' (Mis)Understanding Is Hinted	May 9, 2025	Language Modeling, Language Modelling	Unverified
Empowering Sentence Encoders with Prompting and Label Retrieval for Zero-shot Text Classification	Dec 20, 2022	Classification, Descriptive	Unverified
Testing Uncertainty of Large Language Models for Physics Knowledge and Reasoning	Nov 18, 2024	Logical Reasoning, Multiple-choice	Unverified
Answering Chinese Elementary School Social Studies Multiple Choice Questions	Dec 1, 2021	Multiple-choice	Unverified

No leaderboard results yet.