SOTAVerified

Multiple-choice

Papers

Showing 726–750 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security | Code | 0 |
| Towards a Unified Multimodal Reasoning Framework | Code | 0 |
| Perception Test 2023: A Summary of the First Challenge And Outcome | | 0 |
| BloomVQA: Assessing Hierarchical Multi-modal Comprehension | | 0 |
| Multiple Hypothesis Dropout: Estimating the Parameters of Multi-Modal Output Distributions | Code | 0 |
| Self-Evaluation Improves Selective Generation in Large Language Models | | 0 |
| A Foundational Multimodal Vision Language AI Assistant for Human Pathology | | 0 |
| A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education | | 0 |
| Unleashing the Potential of Large Language Model: Zero-shot VQA for Flood Disaster Scenario | | 0 |
| Explanatory Argument Extraction of Correct Answers in Resident Medical Exams | Code | 0 |
| Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension | | 0 |
| CLOMO: Counterfactual Logical Modification with Large Language Models | Code | 0 |
| ConceptPsy: A Benchmark Suite with Conceptual Comprehensiveness in Psychology | | 0 |
| Investigating Data Contamination in Modern Benchmarks for Large Language Models | | 0 |
| Downstream Trade-offs of a Family of Text Watermarks | Code | 0 |
| Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction using Cogtale dataset | | 0 |
| It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning | Code | 0 |
| Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks | | 0 |
| Assessing Distractors in Multiple-Choice Tests | | 0 |
| Evaluating multiple large language models in pediatric ophthalmology | | 0 |
| Evaluating the Potential of Leading Large Language Models in Reasoning Biology Questions | | 0 |
| More Robots are Coming: Large Multimodal Models (ChatGPT) can Solve Visually Diverse Images of Parsons Problems | | 0 |
| CASE: Commonsense-Augmented Score with an Expanded Answer Space | Code | 0 |
| DeSIQ: Towards an Unbiased, Challenging Benchmark for Social Intelligence Understanding | | 0 |
| POE: Process of Elimination for Multiple Choice Reasoning | Code | 0 |

No leaderboard results yet.