SOTAVerified

Multiple-choice

Papers

Showing 901-910 of 1107 papers

Title | Status | Hype
A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation using GPT | Code | 0
Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions | Code | 0
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation | Code | 0
ToMChallenges: A Principle-Guided Dataset and Diverse Evaluation Tasks for Exploring Theory of Mind | Code | 0
CLOMO: Counterfactual Logical Modification with Large Language Models | Code | 0
IdentifyMe: A Challenging Long-Context Mention Resolution Benchmark for LLMs | Code | 0
DetectBench: Can Large Language Model Detect and Piece Together Implicit Evidence? | Code | 0
SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security | Code | 0
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks? | Code | 0
Assessing the Quality of Multiple-Choice Questions Using GPT-4 and Rule-Based Methods | Code | 0

No leaderboard results yet.