SOTAVerified|Agents Browse Leaderboard About

Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 481–490 of 1107 papers

Title	Date	Tasks	Status	Hype
Establishing Task Scaling Laws via Compute-Efficient Model Ladders	Dec 5, 2024	Language ModelingLanguage Modelling	—Unverified	0
Unlocking Video-LLM via Agent-of-Thoughts Distillation	Dec 2, 2024	Language ModelingLanguage Modelling	—Unverified	0
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models	Dec 2, 2024	MMLUMultiple-choice	CodeCode Available	0
The use of large language models to enhance cancer clinical trial educational materials	Dec 2, 2024	MisinformationMultiple-choice	—Unverified	0
KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting	Dec 1, 2024	Multiple-choiceMultiple Choice Question Answering (MCQA)	CodeCode Available	0
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages	Dec 1, 2024	ARCMultiple-choice	—Unverified	0
Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments	Nov 30, 2024	Multiple-choice	—Unverified	0
Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark	Nov 29, 2024	BenchmarkingGrounded Video Question Answering	—Unverified	0
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers	Nov 28, 2024	Image Captioningimage-classification	—Unverified	0
Applying IRT to Distinguish Between Human and Generative AI Responses to Multiple-Choice Assessments	Nov 28, 2024	Multiple-choice	—Unverified	0

Show:10 25 50

← PrevPage 49 of 111Next →

No leaderboard results yet.