SOTAVerified

General Knowledge

This task evaluates a model's ability to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 351–375 of 399 papers

Title | Status | Hype
Test-Time Self-Adaptive Small Language Models for Question Answering | Code | 0
Connecting a French Dictionary from the Beginning of the 20th Century to Wikidata | Code | 0
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge | Code | 0
Pruning neural network models for gene regulatory dynamics using data and domain knowledge | Code | 0
Effective Skill Unlearning through Intervention and Abstention | Code | 0
Learning to Learn Variational Semantic Memory | Code | 0
Domain Generalization via Model-Agnostic Learning of Semantic Features | Code | 0
Dive into the Resolution Augmentations and Metrics in Low Resolution Face Recognition: A Plain yet Effective New Baseline | Code | 0
Comprehensive Fair Meta-learned Recommender System | Code | 0
A Comparison of Prompt Engineering Techniques for Task Planning and Execution in Service Robotics | Code | 0
Knowledge graphs for empirical concept retrieval | Code | 0
Should We Really Edit Language Models? On the Evaluation of Edited Language Models | Code | 0
Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling | Code | 0
Joey NMT: A Minimalist NMT Toolkit for Novices | Code | 0
Distribution-aware Noisy-label Crack Segmentation | Code | 0
Distilling Stereo Networks for Performant and Efficient Leaner Networks | Code | 0
Patching as Translation: the Data and the Metaphor | Code | 0
PELMS: Pre-training for Effective Low-Shot Multi-Document Summarization | Code | 0
What Makes Cryptic Crosswords Challenging for LLMs? | Code | 0
Are Large Language Models a Good Replacement of Taxonomies? | Code | 0
Planning Safety Trajectories with Dual-Phase, Physics-Informed, and Transportation Knowledge-Driven Large Language Models | Code | 0
Integrating Semantic Knowledge to Tackle Zero-shot Text Classification | Code | 0
Commonsense Knowledge in Word Associations and ConceptNet | Code | 0
Improving Personalized Search with Regularized Low-Rank Parameter Updates | Code | 0
Towards Difficulty-Agnostic Efficient Transfer Learning for Vision-Language Models | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 94.3 | | Unverified
2 | Gopher-280B (few-shot, k=5) | Accuracy | 93.9 | | Unverified
3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 85.7 | | Unverified
4 | Gopher-280B (few-shot, k=5) | Accuracy | 84.8 | | Unverified
5 | Gopher-280B (few-shot, k=5) | Accuracy | 84.2 | | Unverified
6 | Gopher-280B (few-shot, k=5) | Accuracy | 84.1 | | Unverified
7 | Gopher-280B (few-shot, k=5) | Accuracy | 83.9 | | Unverified
8 | Gopher-280B (few-shot, k=5) | Accuracy | 83.3 | | Unverified
9 | Gopher-280B (few-shot, k=5) | Accuracy | 81.8 | | Unverified
10 | Gopher-280B (few-shot, k=5) | Accuracy | 81 | | Unverified