SOTAVerified|Agents Browse Leaderboard About Blog

Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–25 of 161 papers

Title	Date	Tasks	Status	Hype
The Oversmoothing Fallacy: A Misguided Narrative in GNN Research	Jun 5, 2025	Misconceptions	—Unverified	0
A Structured Unplugged Approach for Foundational AI Literacy in Primary Education	May 27, 2025	Logical ReasoningMisconceptions	CodeCode Available	0
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research	May 17, 2025	Misconceptionsscientific discovery	CodeCode Available	0
Automated Identification of Logical Errors in Programs: Advancing Scalable Analysis of Student Misconceptions	May 16, 2025	Misconceptions	—Unverified	0
Humans can learn to detect AI-generated texts, or at least learn when they can't	May 3, 2025	Misconceptions	—Unverified	0
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors	May 2, 2025	High School PhysicsMisconceptions	CodeCode Available	0
Unveiling Contrastive Learning's Capability of Neighborhood Aggregation for Collaborative Filtering	Apr 14, 2025	Collaborative FilteringContrastive Learning	CodeCode Available	1
LLM Library Learning Fails: A LEGO-Prover Case Study	Apr 3, 2025	Mathematical ReasoningMisconceptions	—Unverified	0
What is AI, what is it not, how we use it in physics and how it impacts... you	Apr 2, 2025	Anomaly DetectionMisconceptions	—Unverified	0
From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions	Apr 1, 2025	Misconceptions	—Unverified	0
Clarifying Misconceptions in COVID-19 Vaccine Sentiment and Stance Analysis and Their Implications for Vaccine Hesitancy Mitigation: A Systematic Review	Mar 23, 2025	MisconceptionsSentiment Analysis	—Unverified	0
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation	Mar 12, 2025	counterfactualMisconceptions	CodeCode Available	0
Paths and Ambient Spaces in Neural Loss Landscapes	Mar 5, 2025	Misconceptions	CodeCode Available	0
Emergent Abilities in Large Language Models: A Survey	Feb 28, 2025	In-Context LearningMisconceptions	—Unverified	0
Analyzing Factors Influencing Driver Willingness to Accept Advanced Driver Assistance Systems	Feb 23, 2025	Misconceptions	—Unverified	0
The Imitation Game for Educational AI	Feb 21, 2025	Distractor GenerationMisconceptions	—Unverified	0
Retrieval-augmented systems can be dangerous medical communicators	Feb 18, 2025	MisconceptionsRetrieval	—Unverified	0
Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact	Feb 12, 2025	Misconceptions	—Unverified	0
Knowledge Tracing in Programming Education Integrating Students' Questions	Jan 22, 2025	Knowledge TracingMisconceptions	—Unverified	0
Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction	Jan 21, 2025	Distractor GenerationMisconceptions	—Unverified	0
Generative AI in Education: From Foundational Insights to the Socratic Playground for Learning	Jan 12, 2025	Misconceptions	—Unverified	0
Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension	Jan 2, 2025	Misconceptions	—Unverified	0
A Graphical Approach to State Variable Selection in Off-policy Learning	Jan 1, 2025	Causal InferenceDimensionality Reduction	—Unverified	0
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation	Dec 14, 2024	Misconceptions	—Unverified	0
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor	Dec 8, 2024	MisconceptionsMultiple-choice	CodeCode Available	0

Show:10 25 50

← PrevPage 1 of 7Next →

No leaderboard results yet.