Multiple-choice

Papers

Showing 591–600 of 1107 papers

Title	Date	Tasks	Status	Hype
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards	Feb 1, 2024	Answer Selection, Language Modeling	Code Available	0
An Information-Theoretic Approach to Analyze NLP Classification Tasks	Feb 1, 2024	Multiple-choice, Reading Comprehension	Code Available	0
I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench	Jan 31, 2024	Benchmarking, Multiple-choice	Code Available	4
E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models	Jan 29, 2024	Ethics, Multiple-choice	Code Available	1
Evaluating LLM -- Generated Multimodal Diagnosis from Medical Images and Symptom Analysis	Jan 28, 2024	Knowledge Graphs, Medical Diagnosis	Unverified	0
Improving Medical Reasoning through Retrieval and Self-Reflection with Retrieval-Augmented Large Language Models	Jan 27, 2024	Medical Question Answering, Multiple-choice	Code Available	2
Towards Collective Superintelligence: Amplifying Group IQ using Conversational Swarms	Jan 25, 2024	Decision Making, Multiple-choice	Unverified	0
LongHealth: A Question Answering Benchmark with Long Clinical Documents	Jan 25, 2024	Information Retrieval, Multiple-choice	Code Available	1
CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning	Jan 25, 2024	Multiple-choice, Position	Code Available	1
What Large Language Models Know and What People Think They Know	Jan 24, 2024	Articles, Decision Making	Unverified	0

No leaderboard results yet.