
Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
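
The record format above can be read and scored with a few lines of Python. This is a minimal sketch, not the BIG-bench evaluation harness: the helper names (`parse_examples`, `accuracy`, `always_false`) are hypothetical, and plain exact-match accuracy is assumed as the metric.

```python
from typing import Callable

# The two example records from this page, in the same "key: value" layout.
EXAMPLES = """\
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""


def parse_examples(text: str) -> list[dict]:
    """Split blank-line-separated records into dicts (hypothetical helper)."""
    records = []
    for block in text.strip().split("\n\n"):
        record = {"choices": []}
        for line in block.splitlines():
            # Split on the first ": " only, so the statement text stays intact.
            key, _, value = line.strip().partition(": ")
            if key == "choice":
                record["choices"].append(value)
            elif key in ("input", "answer"):
                record[key] = value
        records.append(record)
    return records


def accuracy(records: list[dict], predict: Callable[[str, list[str]], str]) -> float:
    """Fraction of records where the prediction equals the gold answer.

    Exact-match accuracy is an assumption here, not the official metric.
    """
    correct = sum(predict(r["input"], r["choices"]) == r["answer"] for r in records)
    return correct / len(records)


def always_false(statement: str, choices: list[str]) -> str:
    """Trivial baseline standing in for a real model: call everything a misconception."""
    return "F"


records = parse_examples(EXAMPLES)
score = accuracy(records, always_false)
print(f"{len(records)} records, baseline accuracy {score:.2f}")
```

Because one of the two examples is true and the other false, a constant baseline gets exactly half of them right, which is why a useful model has to beat the 0.5 mark on a balanced T/F set.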

Papers

Showing 76–100 of 161 papers

Title	Date	Tasks	Status
Prompting the E-Brushes: Users as Authors in Generative AI	Mar 25, 2024	Misconceptions	Unverified
WatChat: Explaining perplexing programs by debugging mental models	Mar 8, 2024	counterfactual, Language Modelling	Code Available
Clarify: Improving Model Robustness With Natural Language Corrections	Feb 6, 2024	Misconceptions, model	Code Available
The Essential Role of Causality in Foundation World Models for Embodied AI	Feb 6, 2024	Misconceptions	Unverified
Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias	Jan 26, 2024	Decision Making, Diagnostic	Unverified
Distortions in Judged Spatial Relations in Large Language Models	Jan 8, 2024	Misconceptions, Spatial Reasoning	Unverified
Finnish 5th and 6th graders' misconceptions about Artificial Intelligence	Nov 28, 2023	Misconceptions	Unverified
Uncertainty Quantification in Machine Learning for Biosignal Applications -- A Review	Nov 15, 2023	Diagnostic, EEG	Unverified
Fine-tuning Language Models for Factuality	Nov 14, 2023	Fact Checking, Misconceptions	Unverified
Large Language Models for In-Context Student Modeling: Synthesizing Student's Behavior in Visual Programming	Oct 15, 2023	Misconceptions	Code Available
Unraveling the Single Tangent Space Fallacy: An Analysis and Clarification for Applying Riemannian Geometry in Robot Learning	Oct 11, 2023	Misconceptions	Unverified
Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions	Oct 3, 2023	Math, Mathematical Reasoning	Unverified
Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions	Oct 3, 2023	Misconceptions, Multiple-choice	Code Available
EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning	Sep 16, 2023	Date Understanding, GSM8K	Code Available
Towards a Rigorous Analysis of Mutual Information in Contrastive Learning	Aug 30, 2023	Contrastive Learning, Misconceptions	Unverified
Using language models in the implicit automated assessment of mathematical short answer items	Aug 21, 2023	Misconceptions	Unverified
Characterizing Information Seeking Events in Health-Related Social Discourse	Aug 17, 2023	Misconceptions	Unverified
Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning	Aug 7, 2023	In-Context Learning, Math	Code Available
Scalable and Equitable Math Problem Solving Strategy Prediction in Big Educational Data	Aug 7, 2023	Math, Misconceptions	Code Available
Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording	Jun 9, 2023	Misconceptions	Code Available
Dear XAI Community, We Need to Talk! Fundamental Misconceptions in Current XAI Research	Jun 7, 2023	Explainable Artificial Intelligence (XAI), Misconceptions	Unverified
Justices for Information Bottleneck Theory	May 19, 2023	Misconceptions	Unverified
Clarifying System 1 & 2 through the Common Model of Cognition	May 18, 2023	Misconceptions	Unverified
Disproving XAI Myths with Formal Methods -- Initial Results	May 13, 2023	Explainable artificial intelligence, Explainable Artificial Intelligence (XAI)	Unverified
Human-centered trust framework: An HCI perspective	May 5, 2023	Misconceptions	Unverified

No leaderboard results yet.