SOTAVerified|Agents Browse Leaderboard About

General Knowledge

This task aims to evaluate the ability of a model to answer general-knowledge questions.

Source: BIG-bench

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 181–190 of 399 papers

Title	Date	Tasks	Status	Hype
K-Link: Knowledge-Link Graph from LLMs for Enhanced Representation Learning in Multivariate Time-Series Data	Mar 6, 2024	General Knowledgegraph construction	—Unverified	0
MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models	Mar 6, 2024	EthicsGeneral Knowledge	CodeCode Available	1
Pruning neural network models for gene regulatory dynamics using data and domain knowledge	Mar 5, 2024	General KnowledgeNetwork Pruning	CodeCode Available	0
Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation	Mar 4, 2024	Age And Gender ClassificationAge and Gender Estimation	CodeCode Available	3
Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese	Feb 27, 2024	General KnowledgeQuestion Answering	CodeCode Available	1
Bootstrapping Cognitive Agents with a Large Language Model	Feb 25, 2024	General KnowledgeLanguage Modeling	—Unverified	0
OMGEval: An Open Multilingual Generative Evaluation Benchmark for Large Language Models	Feb 21, 2024	General KnowledgeLogical Reasoning	CodeCode Available	1
Inductive Graph Alignment Prompt: Bridging the Gap between Graph Pre-training and Inductive Fine-tuning From Spectral Perspective	Feb 21, 2024	General KnowledgeGraph Classification	—Unverified	0
CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge	Feb 12, 2024	General KnowledgeMultiple-choice	CodeCode Available	2
Pre-training and Diagnosing Knowledge Base Completion Models	Jan 27, 2024	General KnowledgeKnowledge Base Completion	CodeCode Available	1

Show:10 25 50

← PrevPage 19 of 40Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Chinchilla-70B (few-shot, k=5)	Accuracy	94.3	—	Unverified
2	Gopher-280B (few-shot, k=5)	Accuracy	93.9	—	Unverified
3	Chinchilla-70B (few-shot, k=5)	Accuracy	85.7	—	Unverified
4	Gopher-280B (few-shot, k=5)	Accuracy	84.8	—	Unverified
5	Gopher-280B (few-shot, k=5)	Accuracy	84.2	—	Unverified
6	Gopher-280B (few-shot, k=5)	Accuracy	84.1	—	Unverified
7	Gopher-280B (few-shot, k=5)	Accuracy	83.9	—	Unverified
8	Gopher-280B (few-shot, k=5)	Accuracy	83.3	—	Unverified
9	Gopher-280B (few-shot, k=5)	Accuracy	81.8	—	Unverified
10	Gopher-280B (few-shot, k=5)	Accuracy	81	—	Unverified