SOTAVerified

Multiple-choice

Papers

Showing 611-620 of 1107 papers

Title | Status | Hype
The Earth is Flat? Unveiling Factual Errors in Large Language Models | | 0
FusionMind -- Improving question and answering with external context fusion | | 0
SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security | Code | 0
RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models | Code | 1
HyKGE: A Hypothesis Knowledge Graph Enhanced Framework for Accurate and Reliable Medical LLMs Responses | Code | 1
Towards a Unified Multimodal Reasoning Framework | Code | 0
Perception Test 2023: A Summary of the First Challenge And Outcome | | 0
BloomVQA: Assessing Hierarchical Multi-modal Comprehension | | 0
Multiple Hypothesis Dropout: Estimating the Parameters of Multi-Modal Output Distributions | Code | 0
An In-depth Look at Gemini's Language Abilities | Code | 1

No leaderboard results yet.