Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
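
A minimal sketch of how examples in the format above could be scored, assuming each example is separated by a blank line and that accuracy is the fraction of statements labeled correctly. The names `parse_examples`, `accuracy`, and `always_false` are hypothetical and are not part of BIG-bench; `always_false` is a stand-in baseline where a real model call would go.

```python
from typing import Callable

# The two examples shown above, in the same input/choice/answer layout.
EXAMPLES = """\
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""


def parse_examples(text: str) -> list[dict]:
    """Split blank-line-separated blocks into dicts with input, choices, answer."""
    records = []
    for block in text.strip().split("\n\n"):
        record: dict = {"choices": []}
        for line in block.splitlines():
            key, _, value = line.strip().partition(": ")
            if key == "choice":
                record["choices"].append(value)
            else:
                record[key] = value
        records.append(record)
    return records


def accuracy(records: list[dict], predict: Callable[[str, list[str]], str]) -> float:
    """Fraction of records where the predicted choice equals the answer."""
    correct = sum(predict(r["input"], r["choices"]) == r["answer"] for r in records)
    return correct / len(records)


def always_false(statement: str, choices: list[str]) -> str:
    """Hypothetical baseline: treats every statement as a misconception."""
    return "F"


records = parse_examples(EXAMPLES)
score = accuracy(records, always_false)
print(f"accuracy: {score:.2f}")
```

Because one of the two sample statements is true, a constant baseline like this gets only half of them right, which is why a balanced mix of true and false statements is useful for this kind of task.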

Papers

Showing 51–75 of 161 papers

Title	Date	Tasks	Status
Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation	Oct 8, 2024	Dialogue Generation, Hallucination	Unverified
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills	Oct 5, 2024	Decision Making, Misconceptions	Unverified
A Thematic Framework for Analyzing Large-scale Self-reported Social Media Data on Opioid Use Disorder Treatment Using Buprenorphine Product	Oct 2, 2024	Misconceptions	Unverified
Classifier-Free Guidance is a Predictor-Corrector	Aug 16, 2024	Denoising, Misconceptions	Unverified
Problems in AI, their roots in philosophy, and implications for science and society	Jul 22, 2024	Misconceptions, Philosophy	Unverified
Neural topology optimization: the good, the bad, and the ugly	Jul 19, 2024	GPU, Misconceptions	Unverified
When big data actually are low-rank, or entrywise approximation of certain function-generated matrices	Jul 3, 2024	Misconceptions	Code Available
MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education	Jul 1, 2024	counterfactual, Counterfactual Reasoning	Code Available
Formalising Anti-Discrimination Law in Automated Decision Systems	Jun 29, 2024	Fairness, Legal Reasoning	Unverified
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions	Jun 27, 2024	Distractor Generation, Math	Code Available
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning	May 30, 2024	Misconceptions, Multiple-choice	Code Available
Refining Skewed Perceptions in Vision-Language Models through Visual Representations	May 22, 2024	Misconceptions	Unverified
An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning	May 10, 2024	Misconceptions, Multi-agent Reinforcement Learning	Unverified
Toward In-Context Teaching: Adapting Examples to Students' Misconceptions	May 7, 2024	Misconceptions	Unverified
Common pitfalls to avoid while using multiobjective optimization in machine learning	May 2, 2024	Evolutionary Algorithms, Misconceptions	Unverified
Can a Hallucinating Model help in Reducing Human "Hallucination"?	May 1, 2024	Hallucination, Logical Fallacies	Unverified
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration	May 1, 2024	Language Modeling, Language Modelling	Unverified
A Closer Look at Classification Evaluation Metrics and a Critical Reflection of Common Evaluation Practice	Apr 25, 2024	Misconceptions	Code Available
Student Data Paradox and Curious Case of Single Student-Tutor Model: Regressive Side Effects of Training LLMs for Personalized Learning	Apr 23, 2024	ARC, Common Sense Reasoning	Unverified
Differential contributions of machine learning and statistical analysis to language and cognitive sciences	Apr 22, 2024	Misconceptions	Unverified
Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank	Apr 19, 2024	Distractor Generation, Math	Unverified
Axiomatic modeling of fixed proportion technologies	Apr 18, 2024	Misconceptions	Unverified
Breaking Boundaries: A Chronology with Future Directions of Women in Exercise Physiology Research, Centred on Pregnancy	Apr 12, 2024	Misconceptions	Unverified
SoK: On Gradient Leakage in Federated Learning	Apr 8, 2024	Federated Learning, Misconceptions	Unverified
Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models	Apr 2, 2024	Distractor Generation, In-Context Learning	Code Available


No leaderboard results yet.