Misconceptions

Measures whether a model can distinguish popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
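The example format above (one `input` line, its `choice` lines, then an `answer` line) can be parsed and scored with a few lines of Python. The following is a minimal sketch, not BIG-bench's own loader or metric: the names `Example`, `parse_examples`, and `accuracy` are hypothetical, and the scoring is assumed to be plain exact-match accuracy over the T/F labels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Example:
    """One true/false item (hypothetical container, not a BIG-bench type)."""
    statement: str
    choices: List[str] = field(default_factory=list)
    answer: str = ""


def parse_examples(text: str) -> List[Example]:
    """Parse 'input: / choice: / answer:' lines into Example records."""
    examples: List[Example] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so statements may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "input":
            current = Example(statement=value)
            examples.append(current)
        elif current is None:
            continue  # ignore stray lines before the first input
        elif key == "choice":
            current.choices.append(value)
        elif key == "answer":
            current.answer = value
    return examples


def accuracy(examples: List[Example], predictions: List[str]) -> float:
    """Fraction of predictions that exactly match the labeled answer."""
    if len(examples) != len(predictions):
        raise ValueError("need exactly one prediction per example")
    if not examples:
        return 0.0
    correct = sum(1 for ex, p in zip(examples, predictions) if p == ex.answer)
    return correct / len(examples)


SAMPLE = """
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""

examples = parse_examples(SAMPLE)
# A model that always answers "F" gets one of the two items right.
score = accuracy(examples, ["F", "F"])
```

With the two sample items, a constant "F" predictor is right on the first and wrong on the second, which is why a constant-answer baseline is a useful sanity check for any reported score on a binary task like this.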

Papers

Showing 21–30 of 161 papers

Title	Date	Tasks	Status	Score
A Variational Inequality Perspective on Generative Adversarial Networks	Feb 28, 2018	Misconceptions	Code Available	5
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation	Mar 12, 2025	counterfactual, Misconceptions	Code Available	5
Clarify: Improving Model Robustness With Natural Language Corrections	Feb 6, 2024	Misconceptions, model	Code Available	5
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors	May 2, 2025	High School Physics, Misconceptions	Code Available	5
Hindsight and Sequential Rationality of Correlated Play	Dec 10, 2020	counterfactual, Decision Making	Code Available	5
Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions	Oct 3, 2023	Misconceptions, Multiple-choice	Code Available	5
From Solution Synthesis to Student Attempt Synthesis for Block-Based Visual Programming Tasks	May 3, 2022	Misconceptions, Program Synthesis	Code Available	5
How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions	Feb 1, 2019	Bilingual Lexicon Induction, Cross-Lingual Transfer	Code Available	5
Large Language Models for In-Context Student Modeling: Synthesizing Student's Behavior in Visual Programming	Oct 15, 2023	Misconceptions	Code Available	5
End-to-End Annotator Bias Approximation on Crowdsourced Single-Label Sentiment Analysis	Nov 3, 2021	Misconceptions, Sentiment Analysis	Code Available	5

No leaderboard results yet.