SOTAVerified

General Knowledge

This task aims to evaluate the ability of a model to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 111120 of 399 papers

TitleStatusHype
Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud LearningCode3
Composite Learning Units: Generalized Learning Beyond Parameter Updates to Transform LLMs into Adaptive Reasoners0
Selective Aggregation for Low-Rank Adaptation in Federated LearningCode2
Cascade Prompt Learning for Vision-Language Model AdaptationCode3
What Would You Ask When You First Saw a^2+b^2=c^2? Evaluating LLM on Curiosity-Driven Questioning0
Bilingual Evaluation of Language Models on General Knowledge in University Entrance Exams with Minimal Contamination0
E2Map: Experience-and-Emotion Map for Self-Reflective Robot Navigation with Language ModelsCode1
Can LVLMs Obtain a Driver's License? A Benchmark Towards Reliable AGI for Autonomous Driving0
Aligning Medical Images with General Knowledge from Large Language ModelsCode1
How Well Do LLMs Handle Cantonese? Benchmarking Cantonese Capabilities of Large Language ModelsCode1
Show:102550
← PrevPage 12 of 40Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1Chinchilla-70B (few-shot, k=5)Accuracy94.3Unverified
2Gopher-280B (few-shot, k=5)Accuracy93.9Unverified
3Chinchilla-70B (few-shot, k=5)Accuracy 85.7Unverified
4Gopher-280B (few-shot, k=5)Accuracy 84.8Unverified
5Gopher-280B (few-shot, k=5)Accuracy84.2Unverified
6Gopher-280B (few-shot, k=5)Accuracy 84.1Unverified
7Gopher-280B (few-shot, k=5)Accuracy 83.9Unverified
8Gopher-280B (few-shot, k=5)Accuracy83.3Unverified
9Gopher-280B (few-shot, k=5)Accuracy 81.8Unverified
10Gopher-280B (few-shot, k=5)Accuracy 81Unverified