SOTAVerified

Multiple-choice

Papers

Showing 851–900 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| On the application of Transformers for estimating the difficulty of Multiple-Choice Questions from text | | 0 |
| On the Performance of Multimodal Language Models | | 0 |
| On the Principles behind Opinion Dynamics in Multi-Agent Systems of Large Language Models | | 0 |
| On the Reasoning Capacity of AI Models and How to Quantify It | | 0 |
| AGenT Zero: Zero-shot Automatic Multiple-Choice Question Generation for Skill Assessments | | 0 |
| VideoMCC: a New Benchmark for Video Comprehension | | 0 |
| Optimal Weighting for Exam Composition | | 0 |
| Option Comparison Network for Multiple-choice Reading Comprehension | | 0 |
| Options-Aware Dense Retrieval for Multiple-Choice query Answering | | 0 |
| Video Question Answering via Attribute-Augmented Attention Network Learning | | 0 |
| ViLLM-Eval: A Comprehensive Evaluation Suite for Vietnamese Large Language Models | | 0 |
| Order Independence With Finetuning | | 0 |
| PADDLe: a Platform to Identify Complex Words for Learners of French as a Foreign Language (FFL) | | 0 |
| Paragraph Similarity Matches for Generating Multiple-choice Test Items | | 0 |
| VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models | | 0 |
| AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset | | 0 |
| The AI Penalization Effect: People Reduce Compensation for Workers Who Use AI | | 0 |
| Perception Test 2023: A Summary of the First Challenge And Outcome | | 0 |
| Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark | | 0 |
| A Foundational Multimodal Vision Language AI Assistant for Human Pathology | | 0 |
| PerCul: A Story-Driven Cultural Evaluation of LLMs in Persian | | 0 |
| Performance of ChatGPT-3.5 and GPT-4 on the United States Medical Licensing Examination With and Without Distractions | | 0 |
| Performance of leading large language models in May 2025 in Membership of the Royal College of General Practitioners-style examination questions: a cross-sectional analysis | | 0 |
| PersianMedQA: Language-Centric Evaluation of LLMs in the Persian Medical Domain | | 0 |
| Personalised Feedback Framework for Online Education Programmes Using Generative AI | | 0 |
| PhysUniBench: An Undergraduate-Level Physics Reasoning Benchmark for Multimodal Models | | 0 |
| Vision-Language Models Do Not Understand Negation | | 0 |
| Predicting Item Survival for Multiple Choice Questions in a High-Stakes Medical Exam | | 0 |
| When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards | Code | 0 |
| Video Prediction via Selective Sampling | Code | 0 |
| MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback | Code | 0 |
| CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models | Code | 0 |
| Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning | Code | 0 |
| Automating Turkish Educational Quiz Generation Using Large Language Models | Code | 0 |
| How Can We Diagnose and Treat Bias in Large Language Models for Clinical Decision-Making? | Code | 0 |
| Measuring Agreeableness Bias in Multimodal Models | Code | 0 |
| CSEPrompts: A Benchmark of Introductory Computer Science Prompts | Code | 0 |
| MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks | Code | 0 |
| MedG-KRP: Medical Graph Knowledge Representation Probing | Code | 0 |
| How much do LLMs learn from negative examples? | Code | 0 |
| CNN for Text-Based Multiple Choice Question Answering | Code | 0 |
| DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine | Code | 0 |
| Confident Multiple Choice Learning | Code | 0 |
| VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models | Code | 0 |
| A Simple Method for Commonsense Reasoning | Code | 0 |
| Chance-Constrained Multiple-Choice Knapsack Problem: Model, Algorithms, and Applications | Code | 0 |
| Biomedical Entity Linking as Multiple Choice Question Answering | Code | 0 |
| (WhyPHI) Fine-Tuning PHI-3 for Multiple-Choice Question Answering: Methodology, Results, and Challenges | Code | 0 |
| DE-COP: Detecting Copyrighted Content in Language Models Training Data | Code | 0 |
| Patent Figure Classification using Large Vision-language Models | Code | 0 |
Page 18 of 23

No leaderboard results yet.