
Misconceptions

Measures whether a model can distinguish popular misconceptions from true statements.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
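
The example items above can be scored with a few lines of Python. What follows is a minimal sketch, not the BIG-bench scoring harness: it assumes items arrive as plain text in the `input:` / `choice:` / `answer:` layout shown above, separated by blank lines, and the names `parse_items`, `accuracy`, and `always_false` are hypothetical helpers introduced only for illustration.

```python
from typing import Callable, Dict, List

# The two example items from this page, in the same field layout.
EXAMPLE = """\
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""


def parse_items(text: str) -> List[Dict]:
    """Split blank-line-separated blocks into dicts with input, choices, answer."""
    items = []
    for block in text.strip().split("\n\n"):
        item = {"input": "", "choices": [], "answer": ""}
        for line in block.splitlines():
            # Split on the first ": " only, so colons inside a statement survive.
            key, _, value = line.strip().partition(": ")
            if key == "input":
                item["input"] = value
            elif key == "choice":
                item["choices"].append(value)
            elif key == "answer":
                item["answer"] = value
        items.append(item)
    return items


def accuracy(items: List[Dict], predict: Callable[[str, List[str]], str]) -> float:
    """Fraction of items where the predicted choice equals the gold answer."""
    if not items:
        return 0.0
    correct = sum(predict(it["input"], it["choices"]) == it["answer"] for it in items)
    return correct / len(items)


def always_false(statement: str, choices: List[str]) -> str:
    """Placeholder predictor (an assumption): labels every statement a misconception."""
    return "F"


items = parse_items(EXAMPLE)
score = accuracy(items, always_false)
print(f"{len(items)} items, accuracy = {score:.2f}")
```

Replacing `always_false` with a function that queries a model and returns `"T"` or `"F"` gives the model's accuracy on the same items. A constant predictor is a useful baseline because it shows the score a model would get without reading the statement at all.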

Papers


Showing 26–50 of 161 papers

Title	Date	Tasks	Status	Score
Design Challenges and Misconceptions in Neural Sequence Labeling	Jun 12, 2018	Chunking, Misconceptions	Code Available	5
Scalable and Equitable Math Problem Solving Strategy Prediction in Big Educational Data	Aug 7, 2023	Math, Misconceptions	Code Available	5
Pay attention to your loss: understanding misconceptions about 1-Lipschitz neural networks	Apr 11, 2021	General Classification, Misconceptions	Code Available	5
Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses	Nov 10, 2019	All, Bayesian Inference	Code Available	5
Large Language Models for In-Context Student Modeling: Synthesizing Student's Behavior in Visual Programming	Oct 15, 2023	Misconceptions	Code Available	5
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor	Dec 8, 2024	Misconceptions, Multiple-choice	Code Available	5
How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions	Feb 1, 2019	Bilingual Lexicon Induction, Cross-Lingual Transfer	Code Available	5
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors	May 2, 2025	High School Physics, Misconceptions	Code Available	5
A Structured Unplugged Approach for Foundational AI Literacy in Primary Education	May 27, 2025	Logical Reasoning, Misconceptions	Code Available	5
Hindsight and Sequential Rationality of Correlated Play	Dec 10, 2020	counterfactual, Decision Making	Code Available	5
Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models	Apr 2, 2024	Distractor Generation, In-Context Learning	Code Available	5
Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning	Aug 7, 2023	In-Context Learning, Math	Code Available	5
End-to-End Annotator Bias Approximation on Crowdsourced Single-Label Sentiment Analysis	Nov 3, 2021	Misconceptions, Sentiment Analysis	Code Available	5
Community detection in networks: A user guide	Jul 30, 2016	Community Detection, Misconceptions	Code Available	5
A Closer Look at Classification Evaluation Metrics and a Critical Reflection of Common Evaluation Practice	Apr 25, 2024	Misconceptions	Code Available	5
EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning	Sep 16, 2023	Date Understanding, GSM8K	Code Available	5
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation	Mar 12, 2025	counterfactual, Misconceptions	Code Available	5
A Weakly-Supervised Iterative Graph-Based Approach to Retrieve COVID-19 Misinformation Topics	May 19, 2022	Misconceptions, Misinformation	Code Available	5
Collecting the Public Perception of AI and Robot Rights	Aug 4, 2020	Misconceptions	Code Available	5
MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education	Jul 1, 2024	counterfactual, Counterfactual Reasoning	Code Available	5
Deep Curvature Suite	Dec 20, 2019	Misconceptions	Code Available	5
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions	Jun 27, 2024	Distractor Generation, Math	Code Available	5
From Solution Synthesis to Student Attempt Synthesis for Block-Based Visual Programming Tasks	May 3, 2022	Misconceptions, Program Synthesis	Code Available	5
Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording	Jun 9, 2023	Misconceptions	Code Available	5
Paths and Ambient Spaces in Neural Loss Landscapes	Mar 5, 2025	Misconceptions	Code Available	5


No leaderboard results yet.