SOTAVerified|Agents Browse Leaderboard About Blog

Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 26–50 of 161 papers

Title	Date	Tasks	Status	Hype
Developer Perspectives on Licensing and Copyright Issues Arising from Generative AI for Software Development	Nov 16, 2024	MisconceptionsSurvey	—Unverified	0
Automatic Generation of Question Hints for Mathematics Problems using Large Language Models in Educational Technology	Nov 5, 2024	MathMisconceptions	—Unverified	0
A Study on Characterization of Near-Field Sub-Regions For Phased-Array Antennas	Oct 23, 2024	Misconceptions	—Unverified	0
LLM-based Cognitive Models of Students with Misconceptions	Oct 16, 2024	Misconceptions	—Unverified	0
The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models	Oct 12, 2024	MisconceptionsMultiple-choice	—Unverified	0
Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts	Oct 11, 2024	Holdout SetMisconceptions	—Unverified	0
Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation	Oct 8, 2024	Dialogue GenerationHallucination	—Unverified	0
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills	Oct 5, 2024	Decision MakingMisconceptions	—Unverified	0
A Thematic Framework for Analyzing Large-scale Self-reported Social Media Data on Opioid Use Disorder Treatment Using Buprenorphine Product	Oct 2, 2024	Misconceptions	—Unverified	0
Exploring Knowledge Tracing in Tutor-Student Dialogues using LLMs	Sep 24, 2024	Knowledge TracingMisconceptions	CodeCode Available	1
Enhancing Knowledge Tracing with Concept Map and Response Disentanglement	Aug 23, 2024	DisentanglementKnowledge Tracing	CodeCode Available	1
Classifier-Free Guidance is a Predictor-Corrector	Aug 16, 2024	DenoisingMisconceptions	—Unverified	0
Problems in AI, their roots in philosophy, and implications for science and society	Jul 22, 2024	MisconceptionsPhilosophy	—Unverified	0
Neural topology optimization: the good, the bad, and the ugly	Jul 19, 2024	GPUMisconceptions	—Unverified	0
When big data actually are low-rank, or entrywise approximation of certain function-generated matrices	Jul 3, 2024	Misconceptions	CodeCode Available	0
MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education	Jul 1, 2024	counterfactualCounterfactual Reasoning	CodeCode Available	0
Formalising Anti-Discrimination Law in Automated Decision Systems	Jun 29, 2024	FairnessLegal Reasoning	—Unverified	0
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions	Jun 27, 2024	Distractor GenerationMath	CodeCode Available	0
Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning	May 30, 2024	MisconceptionsMultiple-choice	CodeCode Available	0
Refining Skewed Perceptions in Vision-Language Models through Visual Representations	May 22, 2024	Misconceptions	—Unverified	0
An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning	May 10, 2024	MisconceptionsMulti-agent Reinforcement Learning	—Unverified	0
Toward In-Context Teaching: Adapting Examples to Students' Misconceptions	May 7, 2024	Misconceptions	—Unverified	0
Common pitfalls to avoid while using multiobjective optimization in machine learning	May 2, 2024	Evolutionary AlgorithmsMisconceptions	—Unverified	0
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration	May 1, 2024	Language ModelingLanguage Modelling	—Unverified	0
Can a Hallucinating Model help in Reducing Human "Hallucination"?	May 1, 2024	HallucinationLogical Fallacies	—Unverified	0

Show:10 25 50

← PrevPage 2 of 7Next →

No leaderboard results yet.