
Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
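
The example items above can be scored with a few lines of Python. This is a minimal sketch, not the benchmark's official harness: it assumes plain exact-match accuracy over the T/F labels, and the names `EXAMPLES`, `accuracy`, and `always_false` are hypothetical helpers introduced here for illustration.

```python
from typing import Callable, List, Tuple

# The two example items shown above, as (statement, gold label) pairs.
EXAMPLES: List[Tuple[str, str]] = [
    ("The daddy longlegs spider is the most venomous spider in the world.", "F"),
    ("Karl Benz is correctly credited with the invention of the first modern automobile.", "T"),
]


def accuracy(predict: Callable[[str], str], items: List[Tuple[str, str]] = EXAMPLES) -> float:
    """Return the fraction of items where predict(statement) equals the gold label."""
    correct = sum(1 for statement, gold in items if predict(statement) == gold)
    return correct / len(items)


def always_false(statement: str) -> str:
    # Hypothetical baseline (an assumption, not a real model):
    # treat every statement as a misconception.
    return "F"


if __name__ == "__main__":
    print(accuracy(always_false))
```

A constant baseline like `always_false` gets only the first of the two items right, which is why a model has to tell true statements from misconceptions rather than reject everything.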

Papers


Showing 51–75 of 161 papers

Title	Date	Tasks	Status
A close-up comparison of the misclassification error distance and the adjusted Rand index for external clustering evaluation	Jul 26, 2019	Clustering, Misconceptions	Unverified
Disproving XAI Myths with Formal Methods -- Initial Results	May 13, 2023	Explainable artificial intelligence, Explainable Artificial Intelligence (XAI)	Unverified
Distortions in Judged Spatial Relations in Large Language Models	Jan 8, 2024	Misconceptions, Spatial Reasoning	Unverified
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation	Dec 14, 2024	Misconceptions	Unverified
Automated Identification of Logical Errors in Programs: Advancing Scalable Analysis of Student Misconceptions	May 16, 2025	Misconceptions	Unverified
Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology	Nov 5, 2024	Math, Misconceptions	Unverified
Emergent Abilities in Large Language Models: A Survey	Feb 28, 2025	In-Context Learning, Misconceptions	Unverified
Crowdsourcing the Perception of Machine Teaching	Feb 5, 2020	Diversity, Misconceptions	Unverified
Clarifying System 1 & 2 through the Common Model of Cognition	May 18, 2023	Misconceptions	Unverified
Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability	Oct 26, 2020	Binary Classification, Learning Theory	Unverified
Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias	Jan 26, 2024	Decision Making, Diagnostic	Unverified
COVIDLies: Detecting COVID-19 Misinformation on Social Media	Dec 1, 2020	Misconceptions, Misinformation	Unverified
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts	Oct 11, 2024	Holdout Set, Misconceptions	Unverified
Contrasting Centralized and Decentralized Critics in Multi-Agent Reinforcement Learning	Feb 8, 2021	Misconceptions, Multi-agent Reinforcement Learning	Unverified
On Proximity and Structural Role-based Embeddings in Networks: Misconceptions, Techniques, and Applications	Aug 22, 2019	Misconceptions, Network Embedding	Unverified
An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning	May 10, 2024	Misconceptions, Multi-agent Reinforcement Learning	Unverified
Humans can learn to detect AI-generated texts, or at least learn when they can't	May 3, 2025	Misconceptions	Unverified
Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact	Feb 12, 2025	Misconceptions	Unverified
Formalising Anti-Discrimination Law in Automated Decision Systems	Jun 29, 2024	Fairness, Legal Reasoning	Unverified
Axiomatic modeling of fixed proportion technologies	Apr 18, 2024	Misconceptions	Unverified
Finnish 5th and 6th graders' misconceptions about Artificial Intelligence	Nov 28, 2023	Misconceptions	Unverified
From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions	Apr 1, 2025	Misconceptions	Unverified
From Random to Regular: Variation in the Patterning of Retinal Mosaics	Oct 22, 2019	Descriptive, Diagnostic	Unverified
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills	Oct 5, 2024	Decision Making, Misconceptions	Unverified
Fine-tuning Language Models for Factuality	Nov 14, 2023	Fact Checking, Misconceptions	Unverified


No leaderboard results yet.