SOTAVerified

Multiple-choice

Papers

Showing 401-410 of 1107 papers

Title | Status | Hype
Adversarial Databases Improve Success in Retrieval-based Large Language Models | - | 0
TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish | Code | 1
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models | - | 0
Fine-tuning Multimodal Large Language Models for Product Bundling | Code | 1
Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models | Code | 1
AstroMLab 1: Who Wins Astronomy Jeopardy!? | - | 0
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models | - | 0
LAB-Bench: Measuring Capabilities of Language Models for Biology Research | - | 0
Leveraging large language models for nano synthesis mechanism explanation: solid foundations or mere conjectures? | Code | 0
Evaluating Nuanced Bias in Large Language Model Free Response Answers | - | 0

No leaderboard results yet.