SOTAVerified

Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
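
A minimal sketch of how items in the format above could be scored, assuming each item is an `input:` line followed by `choice:` lines and one `answer:` line. The names `parse_items`, `accuracy`, and `always_false` are hypothetical helpers for illustration, not part of BIG-bench or its evaluation harness.

```python
from typing import Callable, Dict, List

# The two example items shown above, in the same line-oriented layout.
EXAMPLE = """
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""


def parse_items(text: str) -> List[Dict]:
    """Parse 'input:/choice:/answer:' lines into a list of item dicts."""
    items: List[Dict] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first colon only, so colons inside a statement survive.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "input":
            current = {"input": value, "choices": [], "answer": None}
            items.append(current)
        elif current is not None and key == "choice":
            current["choices"].append(value)
        elif current is not None and key == "answer":
            current["answer"] = value
    return items


def accuracy(items: List[Dict], predict: Callable[[str, List[str]], str]) -> float:
    """Fraction of items where the predicted choice equals the gold answer."""
    if not items:
        return 0.0
    correct = sum(1 for it in items if predict(it["input"], it["choices"]) == it["answer"])
    return correct / len(items)


def always_false(statement: str, choices: List[str]) -> str:
    """Hypothetical baseline standing in for a model: always answers 'F'."""
    return "F"


items = parse_items(EXAMPLE)
print(accuracy(items, always_false))  # 0.5 on the two items above
```

A real model would replace `always_false` with a function that returns one of the listed choices for each statement. The constant baseline is included only to show that a predictor ignoring the input gets one of the two example items right.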

Papers

Showing 91-100 of 161 papers

| Title | Status | Hype |
| --- | --- | --- |
| Knowledge Tracing in Programming Education Integrating Students' Questions | | 0 |
| Benchmark Inflation: Revealing LLM Performance Gaps Using Retro-Holdouts | | 0 |
| Laplace Redux - Effortless Bayesian Deep Learning | | 0 |
| A close-up comparison of the misclassification error distance and the adjusted Rand index for external clustering evaluation | | 0 |
| Learnable: Theory vs Applications | | 0 |
| A clarification of misconceptions, myths and desired status of artificial intelligence | | 0 |
| Limitations of Deep Neural Networks: a discussion of G. Marcus' critical appraisal of deep learning | | 0 |
| Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation | | 0 |
| LLM Library Learning Fails: A LEGO-Prover Case Study | | 0 |
| Machine Learning Students Overfit to Overfitting | | 0 |

No leaderboard results yet.