SOTAVerified

General Knowledge

This task aims to evaluate the ability of a model to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 171-180 of 399 papers

Title | Status | Hype
Extending TWIG: Zero-Shot Predictive Hyperparameter Selection for KGEs based on Graph Structure | - | 0
Are Longer Prompts Always Better? Prompt Selection in Large Language Models for Recommendation Systems | - | 0
What Makes Cryptic Crosswords Challenging for LLMs? | Code | 0
MoSLD: An Extremely Parameter-Efficient Mixture-of-Shared LoRAs for Multi-Task Learning | - | 0
TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation | - | 0
Adapter-based Approaches to Knowledge-enhanced Language Models -- A Survey | - | 0
GOT4Rec: Graph of Thoughts for Sequential Recommendation | - | 0
GRL-Prompt: Towards Knowledge Graph based Prompt Optimization via Reinforcement Learning | - | 0
Efficient Transfer Learning for Video-language Foundation Models | Code | 0
MM-Eval: A Hierarchical Benchmark for Modern Mongolian Evaluation in LLMs | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 94.3 | - | Unverified
2 | Gopher-280B (few-shot, k=5) | Accuracy | 93.9 | - | Unverified
3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 85.7 | - | Unverified
4 | Gopher-280B (few-shot, k=5) | Accuracy | 84.8 | - | Unverified
5 | Gopher-280B (few-shot, k=5) | Accuracy | 84.2 | - | Unverified
6 | Gopher-280B (few-shot, k=5) | Accuracy | 84.1 | - | Unverified
7 | Gopher-280B (few-shot, k=5) | Accuracy | 83.9 | - | Unverified
8 | Gopher-280B (few-shot, k=5) | Accuracy | 83.3 | - | Unverified
9 | Gopher-280B (few-shot, k=5) | Accuracy | 81.8 | - | Unverified
10 | Gopher-280B (few-shot, k=5) | Accuracy | 81 | - | Unverified