| Paper | Date | Tags | Code | # |
| --- | --- | --- | --- | --- |
| Multiple-Choice Questions are Efficient and Robust LLM Evaluators | May 20, 2024 | GSM8K, HumanEval | Code Available | 1 |
| SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation | May 14, 2024 | Benchmarking, Multiple-choice | Code Available | 1 |
| THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models | May 8, 2024 | Attribute, Data Augmentation | Code Available | 1 |
| Do Large Language Models Understand Conversational Implicature -- A case study with a chinese sitcom | Apr 30, 2024 | Implicatures, Multiple-choice | Code Available | 1 |
| Latxa: An Open Language Model and Evaluation Suite for Basque | Mar 29, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Non-Linear Inference Time Intervention: Improving LLM Truthfulness | Mar 27, 2024 | Large Language Model, Multiple-choice | Code Available | 1 |
| IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models | Mar 23, 2024 | Common Sense Reasoning, In-Context Learning | Code Available | 1 |
| Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs | Mar 12, 2024 | Knowledge Graphs, Multiple-choice | Code Available | 1 |
| Unfamiliar Finetuning Examples Control How Language Models Hallucinate | Mar 8, 2024 | MMLU, Multiple-choice | Code Available | 1 |
| To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering | Mar 4, 2024 | MedQA, MMLU | Code Available | 1 |