SOTAVerified

General Knowledge

This task aims to evaluate the ability of a model to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 321330 of 399 papers

TitleStatusHype
MoST: Multi-modality Scene Tokenization for Motion Prediction0
Motif-Based Prompt Learning for Universal Cross-Domain Recommendation0
Multilingual Tourist Assistance using ChatGPT: Comparing Capabilities in Hindi, Telugu, and Kannada0
Multi-task Federated Learning with Encoder-Decoder Structure: Enabling Collaborative Learning Across Different Tasks0
Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation0
Neural Discourse Relation Recognition with Semantic Memory0
Neural Regularized Domain Adaptation for Chinese Word Segmentation0
Shifted Autoencoders for Point Annotation Restoration in Object Counting0
Nudging: Inference-time Alignment of LLMs via Guided Decoding0
One to Many: Adaptive Instrument Segmentation via Meta Learning and Dynamic Online Adaptation in Robotic Surgical Video0
Show:102550
← PrevPage 33 of 40Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1Chinchilla-70B (few-shot, k=5)Accuracy94.3Unverified
2Gopher-280B (few-shot, k=5)Accuracy93.9Unverified
3Chinchilla-70B (few-shot, k=5)Accuracy 85.7Unverified
4Gopher-280B (few-shot, k=5)Accuracy 84.8Unverified
5Gopher-280B (few-shot, k=5)Accuracy84.2Unverified
6Gopher-280B (few-shot, k=5)Accuracy 84.1Unverified
7Gopher-280B (few-shot, k=5)Accuracy 83.9Unverified
8Gopher-280B (few-shot, k=5)Accuracy83.3Unverified
9Gopher-280B (few-shot, k=5)Accuracy 81.8Unverified
10Gopher-280B (few-shot, k=5)Accuracy 81Unverified