SOTAVerified

Multiple-choice

Papers

Showing 1076-1100 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Indirect Identification of Psychosocial Risks from Natural Language | | 0 |
| Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection | | 0 |
| Two-Turn Debate Doesn't Help Humans Answer Hard Reading Comprehension Questions | | 0 |
| InnerThoughts: Disentangling Representations and Predictions in Large Language Models | | 0 |
| InstructionBench: An Instructional Video Understanding Benchmark | | 0 |
| Instruction Tuning and CoT Prompting for Contextual Medical QA with LLMs | | 0 |
| Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh | | 0 |
| Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages | | 0 |
| Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering | | 0 |
| Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation | | 0 |
| Investigating Data Contamination in Modern Benchmarks for Large Language Models | | 0 |
| Self-Assessment Tests are Unreliable Measures of LLM Personality | | 0 |
| Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination | | 0 |
| Investigating Uncertainty Calibration of Aligned Language Models under the Multiple-Choice Setting | | 0 |
| WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts | | 0 |
| ISAAQ -- Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | | 0 |
| ISAAQ - Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | | 0 |
| Is This Collection Worth My LLM's Time? Automatically Measuring Information Potential in Text Corpora | | 0 |
| An Algorithm for Generating Gap-Fill Multiple Choice Questions of an Expert System | | 0 |
| It is Too Many Options: Pitfalls of Multiple-Choice Questions in Generative AI and Medical Education | | 0 |
| Winning Amazon KDD Cup'24 | | 0 |
| KMMLU: Measuring Massive Multitask Language Understanding in Korean | | 0 |
| Knowledge-Driven Distractor Generation for Cloze-style Multiple Choice Questions | | 0 |
| Knowledge Questions from Knowledge Graphs | | 0 |
| Knowledge Retrieval Based on Generative AI | | 0 |
Page 44 of 45

No leaderboard results yet.