SOTAVerified

Multiple-choice

Papers

Showing 981–990 of 1107 papers

Title | Status | Hype
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models | Code | 0
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval | Code | 0
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare | Code | 0
Uncertainty quantification in fine-tuned LLMs using LoRA ensembles | Code | 0
Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings | Code | 0
Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam | Code | 0
VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models | Code | 0
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach | Code | 0
Evaluating Large Language Model Biases in Persona-Steered Generation | Code | 0
SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM | Code | 0

No leaderboard results yet.