SOTAVerified
Multiple-choice
Papers
Showing 1101–1107 of 1107 papers
| Title | Date | Tasks | Status | Hype | Score |
| --- | --- | --- | --- | --- | --- |
| KoBALT: Korean Benchmark For Advanced Linguistic Tasks | May 22, 2025 | Multiple-choice | — Unverified | 0 | 0 |
| KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations | Mar 3, 2024 | MedQA, MMLU | — Unverified | 0 | 0 |
| KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge | Feb 21, 2024 | 4k, Multiple-choice | — Unverified | 0 | 0 |
| KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning | May 14, 2025 | Benchmarking, MMLU | — Unverified | 0 | 0 |
| LAB-Bench: Measuring Capabilities of Language Models for Biology Research | Jul 14, 2024 | Language Modelling, Multiple-choice | — Unverified | 0 | 0 |
| LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs | Oct 18, 2024 | Benchmarking, Fairness | — Unverified | 0 | 0 |
| Language Enhanced Model for Eye (LEME): An Open-Source Ophthalmology-Specific Large Language Model | Oct 1, 2024 | All, Language Modeling | — Unverified | 0 | 0 |
Page 23 of 23
No leaderboard results yet.