SOTAVerified

Multiple-choice

Papers

Showing 491-500 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Multiple Choice Learning for Efficient Speech Separation with Many Speakers | | 0 |
| NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects? | | 0 |
| SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text | | 0 |
| GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis | | 0 |
| AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset | | 0 |
| VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation | | 0 |
| Testing Uncertainty of Large Language Models for Physics Knowledge and Reasoning | | 0 |
| A Benchmark for Long-Form Medical Question Answering | Code | 0 |
| DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine | Code | 0 |
| TRACE: Transformer-based Risk Assessment for Clinical Evaluation | Code | 0 |

No leaderboard results yet.