SOTAVerified|Agents Browse Leaderboard About Blog

Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 141–150 of 161 papers

Title	Date	Tasks	Status
Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning	Aug 7, 2023	In-Context LearningMath	CodeCode Available
Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models	Apr 2, 2024	Distractor GenerationIn-Context Learning	CodeCode Available
From Solution Synthesis to Student Attempt Synthesis for Block-Based Visual Programming Tasks	May 3, 2022	MisconceptionsProgram Synthesis	CodeCode Available
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors	May 2, 2025	High School PhysicsMisconceptions	CodeCode Available
Hindsight and Sequential Rationality of Correlated Play	Dec 10, 2020	counterfactualDecision Making	CodeCode Available
How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions	Feb 1, 2019	Bilingual Lexicon InductionCross-Lingual Transfer	CodeCode Available
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation	Mar 12, 2025	counterfactualMisconceptions	CodeCode Available
Large Language Models for In-Context Student Modeling: Synthesizing Student's Behavior in Visual Programming	Oct 15, 2023	Misconceptions	CodeCode Available
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor	Dec 8, 2024	MisconceptionsMultiple-choice	CodeCode Available
MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education	Jul 1, 2024	counterfactualCounterfactual Reasoning	CodeCode Available

Show:10 25 50

← PrevPage 15 of 17Next →

No leaderboard results yet.