SOTAVerified

Misconceptions

Measures whether a model can distinguish popular misconceptions from true statements.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
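The example items above can be scored with a few lines of code. The sketch below is only an illustration: the `Example` class, the `accuracy` function, and the `always_true` baseline are hypothetical names, and plain accuracy is an assumed metric, since this page does not state how the task is scored.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Example:
    """One true/false item in the format shown above (hypothetical container)."""
    statement: str
    answer: str  # gold label, "T" or "F"


# The two items quoted in the task description.
EXAMPLES = [
    Example("The daddy longlegs spider is the most venomous spider in the world.", "F"),
    Example("Karl Benz is correctly credited with the invention of the first modern automobile.", "T"),
]


def accuracy(examples: Sequence[Example], predict: Callable[[str], str]) -> float:
    """Return the fraction of items where predict(statement) matches the gold label.

    Assumption: plain accuracy is used as the metric; the page does not specify one.
    """
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(1 for ex in examples if predict(ex.statement) == ex.answer)
    return correct / len(examples)


def always_true(statement: str) -> str:
    """Hypothetical baseline standing in for a model: answers "T" to everything."""
    return "T"


if __name__ == "__main__":
    print(accuracy(EXAMPLES, always_true))
```

A real evaluation would replace `always_true` with a function that queries the model under test and maps its output to `"T"` or `"F"`.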

Papers

Showing 51–75 of 161 papers

Title | Status | Hype
A close-up comparison of the misclassification error distance and the adjusted Rand index for external clustering evaluation |  | 0
Disproving XAI Myths with Formal Methods -- Initial Results |  | 0
Distortions in Judged Spatial Relations in Large Language Models |  | 0
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation |  | 0
Automated Identification of Logical Errors in Programs: Advancing Scalable Analysis of Student Misconceptions |  | 0
Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology |  | 0
Emergent Abilities in Large Language Models: A Survey |  | 0
Crowdsourcing the Perception of Machine Teaching |  | 0
Clarifying System 1 & 2 through the Common Model of Cognition |  | 0
Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability |  | 0
Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias |  | 0
COVIDLies: Detecting COVID-19 Misinformation on Social Media |  | 0
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts |  | 0
Contrasting Centralized and Decentralized Critics in Multi-Agent Reinforcement Learning |  | 0
On Proximity and Structural Role-based Embeddings in Networks: Misconceptions, Techniques, and Applications |  | 0
An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning |  | 0
Humans can learn to detect AI-generated texts, or at least learn when they can't |  | 0
Foundation Models in Computational Pathology: A Review of Challenges, Opportunities, and Impact |  | 0
Formalising Anti-Discrimination Law in Automated Decision Systems |  | 0
Axiomatic modeling of fixed proportion technologies |  | 0
Finnish 5th and 6th graders' misconceptions about Artificial Intelligence |  | 0
From Intuition to Understanding: Using AI Peers to Overcome Physics Misconceptions |  | 0
From Random to Regular: Variation in the Patterning of Retinal Mosaics |  | 0
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills |  | 0
Fine-tuning Language Models for Factuality |  | 0

No leaderboard results yet.