Misconceptions

Measures whether a model can distinguish popular misconceptions from the truth. Each item is a single statement that the model must label as true (T) or false (F).

Examples:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
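
The item layout shown in the examples above (one `input`, two `choice` lines, one `answer`) can be read and scored with a few lines of Python. The following is only a minimal sketch, not the BIG-bench evaluation code: the names `Item`, `parse_items`, and `accuracy` are hypothetical, and plain accuracy is assumed as the metric because the page does not state one.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Item:
    """One true/false statement (hypothetical container, not a BIG-bench type)."""
    statement: str
    choices: list[str]
    answer: str


def parse_items(text: str) -> list[Item]:
    """Parse 'input / choice / answer' records in the layout shown on this page."""
    items: list[Item] = []
    current: Item | None = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so statements may contain colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "input":
            current = Item(statement=value, choices=[], answer="")
            items.append(current)
        elif current is not None and key == "choice":
            current.choices.append(value)
        elif current is not None and key == "answer":
            current.answer = value
    return items


def accuracy(items: list[Item], predictions: list[str]) -> float:
    """Fraction of predictions that match the gold answer (assumed metric)."""
    if len(items) != len(predictions):
        raise ValueError("need exactly one prediction per item")
    if not items:
        return 0.0
    correct = sum(pred == item.answer for item, pred in zip(items, predictions))
    return correct / len(items)


EXAMPLES = """
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""

items = parse_items(EXAMPLES)
# A model that answers "F" to everything gets one of the two items right.
score = accuracy(items, ["F", "F"])
```

Because each item has only two choices, a constant guess can reach a substantial score on a balanced set, so any accuracy computed this way is best read against a majority-class baseline.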

Papers

Showing 51–100 of 161 papers

Title	Date	Tasks	Status
Emergent Abilities in Large Language Models: A Survey	Feb 28, 2025	In-Context Learning, Misconceptions	Unverified
Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability	Oct 26, 2020	Binary Classification, Learning Theory	Unverified
Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias	Jan 26, 2024	Decision Making, Diagnostic	Unverified
Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology	Nov 5, 2024	Math, Misconceptions	Unverified
Knowledge, beliefs, attitudes and perceived risk about COVID-19 vaccine and determinants of COVID-19 vaccine acceptance in Bangladesh	Mar 28, 2021	Misconceptions, regression	Unverified
Fine-tuning Language Models for Factuality	Nov 14, 2023	Fact Checking, Misconceptions	Unverified
Finnish 5th and 6th graders' misconceptions about Artificial Intelligence	Nov 28, 2023	Misconceptions	Unverified
Formalising Anti-Discrimination Law in Automated Decision Systems	Jun 29, 2024	Fairness, Legal Reasoning	Unverified
Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact	Feb 12, 2025	Misconceptions	Unverified
On Proximity and Structural Role-based Embeddings in Networks: Misconceptions, Techniques, and Applications	Aug 22, 2019	Misconceptions, Network Embedding	Unverified
From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions	Apr 1, 2025	Misconceptions	Unverified
From Random to Regular: Variation in the Patterning of Retinal Mosaics	Oct 22, 2019	Descriptive, Diagnostic	Unverified
A close-up comparison of the misclassification error distance and the adjusted Rand index for external clustering evaluation	Jul 26, 2019	Clustering, Misconceptions	Unverified
Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction	Jan 21, 2025	Distractor Generation, Misconceptions	Unverified
Generative AI in Education: From Foundational Insights to the Socratic Playground for Learning	Jan 12, 2025	Misconceptions	Unverified
Axiomatic modeling of fixed proportion technologies	Apr 18, 2024	Misconceptions	Unverified
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts	Oct 11, 2024	Holdout Set, Misconceptions	Unverified
Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing	Apr 20, 2021	Ethics, Misconceptions	Unverified
How Useful are Gradients for OOD Detection Really?	May 20, 2022	Computational Efficiency, Misconceptions	Unverified
Human-centered trust framework: An HCI perspective	May 5, 2023	Misconceptions	Unverified
Humans can learn to detect AI-generated texts, or at least learn when they can't	May 3, 2025	Misconceptions	Unverified
Identifying science concepts and student misconceptions in an interactive essay writing tutor	Jun 1, 2012	Information Retrieval, Misconceptions	Unverified
Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank	Apr 19, 2024	Distractor Generation, Math	Unverified
Improving Unsupervised Video Object Segmentation with Motion-Appearance Synergy	Dec 17, 2022	Misconceptions, Object	Unverified
Justices for Information Bottleneck Theory	May 19, 2023	Misconceptions	Unverified
Knowledge Tracing in Programming Education Integrating Students' Questions	Jan 22, 2025	Knowledge Tracing, Misconceptions	Unverified
Laplace Redux - Effortless Bayesian Deep Learning	May 21, 2021	Deep Learning, Misconceptions	Unverified
Biometric recognition: why not massively adopted yet?	Feb 23, 2022	Misconceptions	Unverified
Learnable: Theory vs Applications	Jul 27, 2018	BIG-bench Machine Learning, Misconceptions	Unverified
Breaking Boundaries: A Chronology with Future Directions of Women in Exercise Physiology Research, Centred on Pregnancy	Apr 12, 2024	Misconceptions	Unverified
Limitations of Deep Neural Networks: a discussion of G. Marcus' critical appraisal of deep learning	Dec 22, 2020	Autonomous Vehicles, Deep Learning	Unverified
Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation	Oct 8, 2024	Dialogue Generation, Hallucination	Unverified
LLM Library Learning Fails: A LEGO-Prover Case Study	Apr 3, 2025	Mathematical Reasoning, Misconceptions	Unverified
Machine Learning Students Overfit to Overfitting	Sep 7, 2022	Misconceptions	Unverified
Can a Hallucinating Model help in Reducing Human "Hallucination"?	May 1, 2024	Hallucination, Logical Fallacies	Unverified
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration	May 1, 2024	Language Modeling, Language Modelling	Unverified
Metagenomic Analysis using Phylogenetic Placement -- A Review of the First Decade	Feb 7, 2022	Misconceptions	Unverified
A Graphical Approach to State Variable Selection in Off-policy Learning	Jan 1, 2025	Causal Inference, Dimensionality Reduction	Unverified
Neural topology optimization: the good, the bad, and the ugly	Jul 19, 2024	GPU, Misconceptions	Unverified
Challenges and Trends in User Trust Discourse in AI	May 5, 2023	Misconceptions	Unverified
Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions	Oct 3, 2023	Math, Mathematical Reasoning	Unverified
On the lifting and reconstruction of nonlinear systems with multiple invariant sets	Apr 24, 2023	Misconceptions	Unverified
Characterizing Information Seeking Events in Health-Related Social Discourse	Aug 17, 2023	Misconceptions	Unverified
Problems in AI, their roots in philosophy, and implications for science and society	Jul 22, 2024	Misconceptions, Philosophy	Unverified
Prompting the E-Brushes: Users as Authors in Generative AI	Mar 25, 2024	Misconceptions	Unverified
Quantum Technology for Economists	Dec 8, 2020	Misconceptions	Unverified
Rectified Max-Value Entropy Search for Bayesian Optimization	Feb 28, 2022	Bayesian Optimization, Misconceptions	Unverified
Refining Skewed Perceptions in Vision-Language Models through Visual Representations	May 22, 2024	Misconceptions	Unverified
Student Data Paradox and Curious Case of Single Student-Tutor Model: Regressive Side Effects of Training LLMs for Personalized Learning	Apr 23, 2024	ARC, Common Sense Reasoning	Unverified
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation	Dec 14, 2024	Misconceptions	Unverified

