SOTAVerified

Multiple-choice

Papers


Showing 161–170 of 1107 papers

Title	Date	Tasks	Status	Hype
Fundamental Limitations in Defending LLM Finetuning APIs	Feb 20, 2025	Multiple-choice	Unverified	0
Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh	Feb 19, 2025	Instruction Following, Multiple-choice	Unverified	0
Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above	Feb 19, 2025	All, Multiple-choice	Unverified	0
Is This Collection Worth My LLM's Time? Automatically Measuring Information Potential in Text Corpora	Feb 19, 2025	Articles, Multiple-choice	Unverified	0
VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare	Feb 19, 2025	Benchmarking, Diversity	Unverified	0
Towards Geo-Culturally Grounded LLM Generations	Feb 19, 2025	Multiple-choice, Retrieval-augmented Generation	Unverified	0
OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities	Feb 18, 2025	Large Language Model, Multiple-choice	Unverified	0
None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks	Feb 18, 2025	Math, Memorization	Unverified	0
Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs	Feb 18, 2025	Generative Question Answering, Multiple-choice	Unverified	0
Multi-Modal Retrieval Augmentation for Open-Ended and Knowledge-Intensive Video Question Answering	Feb 17, 2025	Multiple-choice, Question Answering	Unverified	0

