SOTAVerified

Misconceptions

Measures whether a model can discern popular misconceptions from the truth.

Example:

        input: The daddy longlegs spider is the most venomous spider in the world.
        choice: T
        choice: F
        answer: F

        input: Karl Benz is correctly credited with the invention of the first modern automobile.
        choice: T
        choice: F
        answer: T

Source: BIG-bench
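
As a rough illustration of how examples in the format above could be scored, the following sketch parses `input:` / `choice:` / `answer:` lines and computes accuracy for a predictor. The names `Example`, `parse_examples`, `accuracy`, and `always_true` are hypothetical helpers introduced here, not part of BIG-bench, and the trivial predictor only stands in for a real model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Example:
    """One true/false item (hypothetical container, not a BIG-bench type)."""
    statement: str
    choices: List[str] = field(default_factory=list)
    answer: str = ""


def parse_examples(text: str) -> List[Example]:
    """Parse 'input:', 'choice:' and 'answer:' lines into Example objects."""
    examples: List[Example] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split only at the first colon so statements may contain colons.
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "input":
            examples.append(Example(statement=value))
        elif key == "choice" and examples:
            examples[-1].choices.append(value)
        elif key == "answer" and examples:
            examples[-1].answer = value
    return examples


def accuracy(examples: List[Example],
             predict: Callable[[str, List[str]], str]) -> float:
    """Fraction of examples where the predicted choice equals the answer."""
    if not examples:
        return 0.0
    correct = sum(predict(ex.statement, ex.choices) == ex.answer
                  for ex in examples)
    return correct / len(examples)


SAMPLE = """
input: The daddy longlegs spider is the most venomous spider in the world.
choice: T
choice: F
answer: F

input: Karl Benz is correctly credited with the invention of the first modern automobile.
choice: T
choice: F
answer: T
"""


def always_true(statement: str, choices: List[str]) -> str:
    """Placeholder predictor (assumption): answers 'T' for every statement."""
    return "T"


examples = parse_examples(SAMPLE)
score = accuracy(examples, always_true)
print(f"{len(examples)} examples, accuracy = {score:.2f}")
```

A predictor that always answers `T` gets one of the two sample items right, which is why a real evaluation would compare model accuracy against such constant baselines.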

Papers

Showing 101–150 of 161 papers

| Title | Status | Hype |
|---|---|---|
| Factuality Enhanced Language Models for Open-Ended Text Generation | Code | 5 |
| How Useful are Gradients for OOD Detection Really? |  | 0 |
| A Weakly-Supervised Iterative Graph-Based Approach to Retrieve COVID-19 Misinformation Topics | Code | 0 |
| From Solution Synthesis to Student Attempt Synthesis for Block-Based Visual Programming Tasks | Code | 0 |
| Towards Process-Oriented, Modular, and Versatile Question Generation that Meets Educational Needs | Code | 1 |
| Training Compute-Optimal Large Language Models | Code | 6 |
| Rectified Max-Value Entropy Search for Bayesian Optimization |  | 0 |
| Biometric recognition: why not massively adopted yet? |  | 0 |
| Metagenomic Analysis using Phylogenetic Placement -- A Review of the First Decade |  | 0 |
| Resolving conceptual issues in Modern Coexistence Theory | Code | 0 |
| Big Data is not the New Oil: Common Misconceptions about Population Data |  | 0 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2 |
| Demystifying Ten Big Ideas and Rules Every Fire Scientist & Engineer Should Know About Blackbox, Whitebox & Causal Artificial Intelligence |  | 0 |
| End-to-End Annotator Bias Approximation on Crowdsourced Single-Label Sentiment Analysis | Code | 0 |
| TruthfulQA: Measuring How Models Mimic Human Falsehoods | Code | 1 |
| Back to the Drawing Board: A Critical Evaluation of Poisoning Attacks on Production Federated Learning | Code | 1 |
| Laplace Redux -- Effortless Bayesian Deep Learning | Code | 1 |
| The kernel perspective on dynamic mode decomposition |  | 0 |
| Laplace Redux - Effortless Bayesian Deep Learning |  | 0 |
| Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing |  | 0 |
| Pay attention to your loss: understanding misconceptions about 1-Lipschitz neural networks | Code | 0 |
| Knowledge, beliefs, attitudes and perceived risk about COVID-19 vaccine and determinants of COVID-19 vaccine acceptance in Bangladesh |  | 0 |
| Deep Discourse Analysis for Generating Personalized Feedback in Intelligent Tutor Systems |  | 0 |
| Toward Semi-Automatic Misconception Discovery Using Code Embeddings |  | 0 |
| Contrasting Centralized and Decentralized Critics in Multi-Agent Reinforcement Learning |  | 0 |
| Emergent Communication under Competition | Code | 1 |
| Limitations of Deep Neural Networks: a discussion of G. Marcus' critical appraisal of deep learning |  | 0 |
| Hindsight and Sequential Rationality of Correlated Play | Code | 0 |
| Quantum Technology for Economists |  | 0 |
| COVIDLies: Detecting COVID-19 Misinformation on Social Media |  | 0 |
| Depression Status Estimation by Deep Learning based Hybrid Multi-Modal Fusion Model |  | 0 |
| Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability |  | 0 |
| Collecting the Public Perception of AI and Robot Rights | Code | 0 |
| A clarification of misconceptions, myths and desired status of artificial intelligence |  | 0 |
| Instructions and Guide for Diagnostic Questions: The NeurIPS 2020 Education Challenge |  | 0 |
| A Tutorial on VAEs: From Bayes' Rule to Lossless Compression | Code | 1 |
| On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines | Code | 1 |
| The Bussgang Decomposition of Non-Linear Systems: Basic Theory and MIMO Extensions |  | 0 |
| The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe |  | 0 |
| Crowdsourcing the Perception of Machine Teaching |  | 0 |
| Re-Examining Linear Embeddings for High-Dimensional Bayesian Optimization | Code | 1 |
| Deep Curvature Suite | Code | 0 |
| Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses | Code | 0 |
| From Random to Regular: Variation in the Patterning of Retinal Mosaics |  | 0 |
| Discounted Reinforcement Learning Is Not an Optimization Problem |  | 0 |
| On Proximity and Structural Role-based Embeddings in Networks: Misconceptions, Techniques, and Applications |  | 0 |
| A close-up comparison of the misclassification error distance and the adjusted Rand index for external clustering evaluation |  | 0 |
| How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions | Code | 0 |
| Zero Shot Learning for Code Education: Rubric Sampling with Deep Learning Inference | Code | 0 |
| Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering |  | 0 |

No leaderboard results yet.