SOTAVerified

Multiple-choice

Papers

Showing 276–300 of 1107 papers

Title | Status | Hype
A Multiple Choices Reading Comprehension Corpus for Vietnamese Language Education | Code | 0
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? | Code | 0
Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think | Code | 0
CLOMO: Counterfactual Logical Modification with Large Language Models | Code | 0
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning | Code | 0
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models | Code | 0
LLaVA-OneVision: Easy Visual Task Transfer | Code | 0
Limited Ability of LLMs to Simulate Human Psychological Behaviours: a Psychometric Analysis | Code | 0
LiveQA: A Question Answering Dataset over Sports Live | Code | 0
LLMs Are Not Intelligent Thinkers: Introducing Mathematical Topic Tree Benchmark for Comprehensive Evaluation of LLMs | Code | 0
ChatGPT for GTFS: Benchmarking LLMs on GTFS Understanding and Retrieval | Code | 0
Are Vision LLMs Road-Ready? A Comprehensive Benchmark for Safety-Critical Driving Video Understanding | Code | 0
Leveraging large language models for nano synthesis mechanism explanation: solid foundations or mere conjectures? | Code | 0
Chain-of-Exemplar: Enhancing Distractor Generation for Multimodal Educational Question Generation | Code | 0
Are Large Language Models Consistent over Value-laden Questions? | Code | 0
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings | Code | 0
HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models | Code | 0
LEAVS: An LLM-based Labeler for Abdominal CT Supervision | Code | 0
Length Optimization in Conformal Prediction | Code | 0
CASE: Commonsense-Augmented Score with an Expanded Answer Space | Code | 0
Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models | Code | 0
Abductive Commonsense Reasoning | Code | 0
A large language model-assisted education tool to provide feedback on open-ended responses | Code | 0
Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at Each Single-Hop? | Code | 0
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor | Code | 0

No leaderboard results yet.