SOTAVerified

General Knowledge

This task evaluates a model's ability to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 326–350 of 399 papers

Title | Status | Hype
One to Many: Adaptive Instrument Segmentation via Meta Learning and Dynamic Online Adaptation in Robotic Surgical Video | | 0
Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do | Code | 1
Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation | | 0
Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching | Code | 1
Towards Knowledge-Augmented Visual Question Answering | Code | 0
Transfer learning of chaotic systems | | 0
Tencent AI Lab Machine Translation Systems for WMT20 Chat Translation Task | | 0
Learning Physical Common Sense as Knowledge Graph Completion via BERT Data Augmentation and Constrained Tucker Factorization | | 0
Learning to Learn Variational Semantic Memory | Code | 0
Unsupervised Neural Machine Translation for Low-Resource Domains via Meta-Learning | | 0
KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation | Code | 1
Patching as Translation: the Data and the Metaphor | Code | 0
An Energy Ontology for Global City Indicators (ISO 37120) | | 0
Domain Specific, Semi-Supervised Transfer Learning for Medical Imaging | | 0
Transformers as Soft Reasoners over Language | Code | 1
What's a Good Prediction? Challenges in evaluating an agent's knowledge | | 0
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge | Code | 0
Acquiring Knowledge from Pre-trained Model to Neural Machine Translation | | 0
Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks | Code | 1
Joint Embedding Learning of Educational Knowledge Graphs | | 0
Improving Multi-label Emotion Classification by Integrating both General and Domain-specific Knowledge | | 0
Domain Generalization via Model-Agnostic Learning of Semantic Features | Code | 0
Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System | | 0
Spoken Conversational Search for General Knowledge | | 0
A Human-Centered Data-Driven Planner-Actor-Critic Architecture via Logic Programming | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 94.3 | | Unverified
2 | Gopher-280B (few-shot, k=5) | Accuracy | 93.9 | | Unverified
3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 85.7 | | Unverified
4 | Gopher-280B (few-shot, k=5) | Accuracy | 84.8 | | Unverified
5 | Gopher-280B (few-shot, k=5) | Accuracy | 84.2 | | Unverified
6 | Gopher-280B (few-shot, k=5) | Accuracy | 84.1 | | Unverified
7 | Gopher-280B (few-shot, k=5) | Accuracy | 83.9 | | Unverified
8 | Gopher-280B (few-shot, k=5) | Accuracy | 83.3 | | Unverified
9 | Gopher-280B (few-shot, k=5) | Accuracy | 81.8 | | Unverified
10 | Gopher-280B (few-shot, k=5) | Accuracy | 81 | | Unverified