
Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
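The example format above can be read into records and scored mechanically. The following is a minimal sketch, not the BIG-bench implementation: the `Item` class, the `parse_items` and `accuracy` helpers, and the use of plain accuracy as the metric are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Item:
    """One true/false statement (hypothetical container, not a BIG-bench type)."""
    statement: str
    choices: list[str] = field(default_factory=list)
    answer: str = ""


def parse_items(text: str) -> list[Item]:
    """Parse 'input: / choice: / answer:' lines into Item records."""
    items: list[Item] = []
    current: Item | None = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so colons inside a statement survive.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "input":
            current = Item(statement=value)
            items.append(current)
        elif current is None:
            continue  # ignore stray lines before the first 'input:'
        elif key == "choice":
            current.choices.append(value)
        elif key == "answer":
            current.answer = value
    return items


def accuracy(items: list[Item], predict: Callable[[str], str]) -> float:
    """Fraction of items where predict(statement) equals the gold answer."""
    if not items:
        return 0.0
    correct = sum(predict(item.statement) == item.answer for item in items)
    return correct / len(items)


EXAMPLE = """
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""

items = parse_items(EXAMPLE)
# A trivial baseline that labels every statement a misconception.
print(accuracy(items, lambda statement: "F"))  # 0.5
```

Here `predict` stands in for a model call. The always-"F" baseline gets one of the two example statements right, which shows why a balanced mix of true and false statements matters for this kind of task.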

Papers


Showing 76–100 of 161 papers

Title	Date	Tasks	Status
Knowledge Tracing in Programming Education Integrating Students' Questions	Jan 22, 2025	Knowledge Tracing, Misconceptions	Unverified
Laplace Redux - Effortless Bayesian Deep Learning	May 21, 2021	Deep Learning, Misconceptions	Unverified
Biometric recognition: why not massively adopted yet?	Feb 23, 2022	Misconceptions	Unverified
Learnable: Theory vs Applications	Jul 27, 2018	BIG-bench Machine Learning, Misconceptions	Unverified
Breaking Boundaries: A Chronology with Future Directions of Women in Exercise Physiology Research, Centred on Pregnancy	Apr 12, 2024	Misconceptions	Unverified
Limitations of Deep Neural Networks: a discussion of G. Marcus' critical appraisal of deep learning	Dec 22, 2020	Autonomous Vehicles, Deep Learning	Unverified
Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation	Oct 8, 2024	Dialogue Generation, Hallucination	Unverified
LLM Library Learning Fails: A LEGO-Prover Case Study	Apr 3, 2025	Mathematical Reasoning, Misconceptions	Unverified
Machine Learning Students Overfit to Overfitting	Sep 7, 2022	Misconceptions	Unverified
Can a Hallucinating Model help in Reducing Human "Hallucination"?	May 1, 2024	Hallucination, Logical Fallacies	Unverified
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration	May 1, 2024	Language Modeling, Language Modelling	Unverified
Metagenomic Analysis using Phylogenetic Placement -- A Review of the First Decade	Feb 7, 2022	Misconceptions	Unverified
A Graphical Approach to State Variable Selection in Off-policy Learning	Jan 1, 2025	Causal Inference, Dimensionality Reduction	Unverified
Neural topology optimization: the good, the bad, and the ugly	Jul 19, 2024	GPU, Misconceptions	Unverified
Challenges and Trends in User Trust Discourse in AI	May 5, 2023	Misconceptions	Unverified
Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions	Oct 3, 2023	Math, Mathematical Reasoning	Unverified
On the lifting and reconstruction of nonlinear systems with multiple invariant sets	Apr 24, 2023	Misconceptions	Unverified
Characterizing Information Seeking Events in Health-Related Social Discourse	Aug 17, 2023	Misconceptions	Unverified
Problems in AI, their roots in philosophy, and implications for science and society	Jul 22, 2024	Misconceptions, Philosophy	Unverified
Prompting the E-Brushes: Users as Authors in Generative AI	Mar 25, 2024	Misconceptions	Unverified
Quantum Technology for Economists	Dec 8, 2020	Misconceptions	Unverified
Rectified Max-Value Entropy Search for Bayesian Optimization	Feb 28, 2022	Bayesian Optimization, Misconceptions	Unverified
Refining Skewed Perceptions in Vision-Language Models through Visual Representations	May 22, 2024	Misconceptions	Unverified
Student Data Paradox and Curious Case of Single Student-Tutor Model: Regressive Side Effects of Training LLMs for Personalized Learning	Apr 23, 2024	ARC, Common Sense Reasoning	Unverified
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation	Dec 14, 2024	Misconceptions	Unverified


No leaderboard results yet.