SOTAVerified

Multiple-choice

Papers

Showing 321–330 of 1107 papers

Title	Date	Tasks	Status	Hype
Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers	Oct 15, 2024	Multiple-choice	Code Available	0
Not All Options Are Created Equal: Textual Option Weighting for Token-Efficient LLM-Based Knowledge Tracing	Oct 14, 2024	All, Binary Classification	Unverified	0
Personalised Feedback Framework for Online Education Programmes Using Generative AI	Oct 14, 2024	Benchmarking, Management	Unverified	0
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models	Oct 14, 2024	Multiple-choice	Code Available	1
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models	Oct 13, 2024	Multiple-choice	Unverified	0
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models	Oct 13, 2024	Hallucination, Hallucination Evaluation	Code Available	0
Taming Overconfidence in LLMs: Reward Calibration in RLHF	Oct 13, 2024	Multiple-choice	Code Available	1
The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models	Oct 12, 2024	Misconceptions, Multiple-choice	Unverified	0
SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models	Oct 11, 2024	Few-Shot Learning, Multiple-choice	Code Available	1
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models	Oct 11, 2024	Multiple-choice, TruthfulQA	Code Available	0

No leaderboard results yet.