
Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T
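The two examples above can be scored with a simple accuracy check. The following is a minimal sketch, not BIG-bench's own evaluation code: the `Example` class, the `accuracy` function, and the `always_true` baseline are hypothetical names introduced for illustration, and a real model call would replace the baseline.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Example:
    """One true/false item: a statement and its gold label ("T" or "F")."""
    statement: str
    answer: str


# The two items shown in the example above.
EXAMPLES = [
    Example(
        "The daddy longlegs spider is the most venomous spider in the world.",
        "F",
    ),
    Example(
        "Karl Benz is correctly credited with the invention of the first "
        "modern automobile.",
        "T",
    ),
]


def accuracy(examples: Sequence[Example], predict: Callable[[str], str]) -> float:
    """Return the fraction of examples whose predicted label matches the gold label."""
    if not examples:
        raise ValueError("no examples to score")
    correct = sum(
        1 for ex in examples
        if predict(ex.statement).strip().upper() == ex.answer
    )
    return correct / len(examples)


def always_true(statement: str) -> str:
    """Hypothetical baseline standing in for a model: labels every statement true."""
    return "T"


if __name__ == "__main__":
    print(accuracy(EXAMPLES, always_true))  # 0.5
```

A baseline that always answers "T" gets one of the two items right, which is why a constant-answer baseline is a useful floor to report alongside any model score on a binary task like this one.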

Source: BIG-bench

Papers

Showing 76-100 of 161 papers

Title | Status | Hype
Knowledge Tracing in Programming Education Integrating Students' Questions | - | 0
Laplace Redux - Effortless Bayesian Deep Learning | - | 0
Biometric recognition: why not massively adopted yet? | - | 0
Learnable: Theory vs Applications | - | 0
Breaking Boundaries: A Chronology with Future Directions of Women in Exercise Physiology Research, Centred on Pregnancy | - | 0
Limitations of Deep Neural Networks: a discussion of G. Marcus' critical appraisal of deep learning | - | 0
Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation | - | 0
LLM Library Learning Fails: A LEGO-Prover Case Study | - | 0
Machine Learning Students Overfit to Overfitting | - | 0
Can a Hallucinating Model help in Reducing Human "Hallucination"? | - | 0
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration | - | 0
Metagenomic Analysis using Phylogenetic Placement -- A Review of the First Decade | - | 0
A Graphical Approach to State Variable Selection in Off-policy Learning | - | 0
Neural topology optimization: the good, the bad, and the ugly | - | 0
Challenges and Trends in User Trust Discourse in AI | - | 0
Novice Learner and Expert Tutor: Evaluating Math Reasoning Abilities of Large Language Models with Misconceptions | - | 0
On the lifting and reconstruction of nonlinear systems with multiple invariant sets | - | 0
Characterizing Information Seeking Events in Health-Related Social Discourse | - | 0
Problems in AI, their roots in philosophy, and implications for science and society | - | 0
Prompting the E-Brushes: Users as Authors in Generative AI | - | 0
Quantum Technology for Economists | - | 0
Rectified Max-Value Entropy Search for Bayesian Optimization | - | 0
Refining Skewed Perceptions in Vision-Language Models through Visual Representations | - | 0
Student Data Paradox and Curious Case of Single Student-Tutor Model: Regressive Side Effects of Training LLMs for Personalized Learning | - | 0
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation | - | 0

No leaderboard results yet.