SOTAVerified

Multiple-choice

Papers

Showing 731–740 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Self-Evaluation Improves Selective Generation in Large Language Models | | 0 |
| A Foundational Multimodal Vision Language AI Assistant for Human Pathology | | 0 |
| A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education | | 0 |
| Unleashing the Potential of Large Language Model: Zero-shot VQA for Flood Disaster Scenario | | 0 |
| Explanatory Argument Extraction of Correct Answers in Resident Medical Exams | Code | 0 |
| Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension | | 0 |
| CLOMO: Counterfactual Logical Modification with Large Language Models | Code | 0 |
| ConceptPsy:A Benchmark Suite with Conceptual Comprehensiveness in Psychology | | 0 |
| Investigating Data Contamination in Modern Benchmarks for Large Language Models | | 0 |
| Downstream Trade-offs of a Family of Text Watermarks | Code | 0 |

No leaderboard results yet.