SOTAVerified

Multiple-choice

Papers

Showing 701–725 of 1107 papers

| Title | Status | Hype |
|---|---|---|
| Uncertainty quantification in fine-tuned LLMs using LoRA ensembles | Code | 0 |
| KMMLU: Measuring Massive Multitask Language Understanding in Korean | | 0 |
| Question-Instructed Visual Descriptions for Zero-Shot Video Question Answering | Code | 0 |
| DE-COP: Detecting Copyrighted Content in Language Models Training Data | Code | 0 |
| Prompting Implicit Discourse Relation Annotation | | 0 |
| SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark | | 0 |
| Are Machines Better at Complex Reasoning? Unveiling Human-Machine Inference Gaps in Entailment Verification | | 0 |
| Enhancing textual textbook question answering with large language models and retrieval augmented generation | Code | 0 |
| LLMs May Perform MCQA by Selecting the Least Incorrect Option | | 0 |
| Distractor Generation in Multiple-Choice Tasks: A Survey of Methods, Datasets, and Evaluation | | 0 |
| When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards | Code | 0 |
| An Information-Theoretic Approach to Analyze NLP Classification Tasks | Code | 0 |
| Evaluating LLM -- Generated Multimodal Diagnosis from Medical Images and Symptom Analysis | | 0 |
| Towards Collective Superintelligence: Amplifying Group IQ using Conversational Swarms | | 0 |
| Instruction Fine-Tuning: Does Prompt Loss Matter? | | 0 |
| What Large Language Models Know and What People Think They Know | | 0 |
| Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings | Code | 0 |
| A Study on Large Language Models' Limitations in Multiple-Choice Question Answering | Code | 0 |
| Assessing Large Language Models in Mechanical Engineering Education: A Study on Mechanics-Focused Conceptual Understanding | | 0 |
| Automated Answer Validation using Text Similarity | | 0 |
| A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation using GPT | Code | 0 |
| PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities | | 0 |
| A Joint-Reasoning based Disease Q&A System | | 0 |
| The Earth is Flat? Unveiling Factual Errors in Large Language Models | | 0 |
| FusionMind -- Improving question and answering with external context fusion | | 0 |

No leaderboard results yet.