SOTAVerified

Misconceptions

Measures whether a model can distinguish popular misconceptions from the truth. Each item is a statement that the model must label as true (T) or false (F).

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
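As a rough illustration of how the items above could be scored, here is a minimal sketch. It assumes exact-match accuracy over the T/F label, which the page does not state as the official metric. The names `ITEMS`, `accuracy`, and `always_true` are hypothetical and are not part of BIG-bench.

```python
from typing import Callable, List, Tuple

# The two example items from this page, as (statement, gold label) pairs.
ITEMS: List[Tuple[str, str]] = [
    ("The daddy longlegs spider is the most venomous spider in the world.", "F"),
    ("Karl Benz is correctly credited with the invention of the first modern automobile.", "T"),
]


def accuracy(predict: Callable[[str], str], items: List[Tuple[str, str]]) -> float:
    """Return the fraction of items where predict(statement) equals the gold label.

    Assumption: exact-match accuracy; the page does not name a metric.
    """
    if not items:
        raise ValueError("no items to score")
    correct = sum(1 for statement, gold in items if predict(statement) == gold)
    return correct / len(items)


def always_true(statement: str) -> str:
    """Hypothetical baseline that accepts every statement as true."""
    return "T"


print(accuracy(always_true, ITEMS))  # 0.5
```

A baseline that always answers T gets one of the two example items right. A useful model has to beat such constant predictors on a larger, balanced set of statements.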

Papers

Showing 1-25 of 161 papers

Title | Status | Hype
The Oversmoothing Fallacy: A Misguided Narrative in GNN Research | - | 0
A Structured Unplugged Approach for Foundational AI Literacy in Primary Education | Code | 0
When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research | Code | 0
Automated Identification of Logical Errors in Programs: Advancing Scalable Analysis of Student Misconceptions | - | 0
Humans can learn to detect AI-generated texts, or at least learn when they can't | - | 0
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors | Code | 0
Unveiling Contrastive Learning's Capability of Neighborhood Aggregation for Collaborative Filtering | Code | 1
LLM Library Learning Fails: A LEGO-Prover Case Study | - | 0
What is AI, what is it not, how we use it in physics and how it impacts... you | - | 0
From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions | - | 0
Clarifying Misconceptions in COVID-19 Vaccine Sentiment and Stance Analysis and Their Implications for Vaccine Hesitancy Mitigation: A Systematic Review | - | 0
How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation | Code | 0
Paths and Ambient Spaces in Neural Loss Landscapes | Code | 0
Emergent Abilities in Large Language Models: A Survey | - | 0
Analyzing Factors Influencing Driver Willingness to Accept Advanced Driver Assistance Systems | - | 0
The Imitation Game for Educational AI | - | 0
Retrieval-augmented systems can be dangerous medical communicators | - | 0
Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact | - | 0
Knowledge Tracing in Programming Education Integrating Students' Questions | - | 0
Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction | - | 0
Generative AI in Education: From Foundational Insights to the Socratic Playground for Learning | - | 0
Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension | - | 0
A Graphical Approach to State Variable Selection in Off-policy Learning | - | 0
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation | - | 0
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor | Code | 0

No leaderboard results yet.