SOTAVerified

Multiple-choice

Papers

Showing 301–325 of 1107 papers

Title | Status | Hype
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models | Code | 0
Can Model Uncertainty Function as a Proxy for Multiple-Choice Question Item Difficulty? | Code | 0
A quantitative study of NLP approaches to question difficulty estimation | Code | 0
Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions | Code | 0
A Joint Sequence Fusion Model for Video Question Answering and Retrieval | Code | 0
LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs | Code | 0
From Multiple-Choice to Extractive QA: A Case Study for English and Arabic | Code | 0
AILS-NTUA at SemEval-2024 Task 9: Cracking Brain Teasers: Transformer Models for Lateral Thinking Puzzles | Code | 0
Sentence Embeddings for Russian NLU | Code | 0
BUCA: A Binary Classification Approach to Unsupervised Commonsense Question Answering | Code | 0
LiveQA: A Question Answering Dataset over Sports Live | Code | 0
Limited Ability of LLMs to Simulate Human Psychological Behaviours: a Psychometric Analysis | Code | 0
LLaVA-OneVision: Easy Visual Task Transfer | Code | 0
Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think | Code | 0
Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions | Code | 0
Answer-level Calibration for Free-form Multiple Choice Question Answering | Code | 0
HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models | Code | 0
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings | Code | 0
BnMMLU: Measuring Massive Multitask Language Understanding in Bengali | Code | 0
Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers | Code | 0
LEAVS: An LLM-based Labeler for Abdominal CT Supervision | Code | 0
Learning to Reuse Distractors to support Multiple Choice Question Generation in Education | Code | 0
Length Optimization in Conformal Prediction | Code | 0
Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering | Code | 0
Biomedical Entity Linking as Multiple Choice Question Answering | Code | 0

No leaderboard results yet.