SOTAVerified

Multiple-choice

Papers

Showing 291-300 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification | Code | 2 |
| Probabilistic Consensus through Ensemble Validation: A Framework for LLM Reliability | | 0 |
| Humans and Large Language Models in Clinical Decision Support: A Study with Medical Calculators | | 0 |
| Quantitative Assessment of Intersectional Empathetic Bias and Understanding | Code | 0 |
| ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding | | 0 |
| HourVideo: 1-Hour Video-Language Understanding | Code | 2 |
| MEG: Medical Knowledge-Augmented Large Language Models for Question Answering | Code | 1 |
| MILU: A Multi-task Indic Language Understanding Benchmark | Code | 1 |
| FactTest: Factuality Testing in Large Language Models with Finite-Sample and Distribution-Free Guarantees | | 0 |
| PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Code | 2 |
Page 30 of 111

No leaderboard results yet.