SOTAVerified

Multiple-choice

Papers

Showing 691-700 of 1107 papers

Title | Status | Hype
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages | | 0
Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering | | 0
Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation | | 0
Investigating Data Contamination in Modern Benchmarks for Large Language Models | | 0
Self-Assessment Tests are Unreliable Measures of LLM Personality | | 0
Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination | | 0
Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting | | 0
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts | | 0
ISAAQ -- Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | | 0
ISAAQ - Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | | 0

No leaderboard results yet.