SOTAVerified

General Knowledge

This task aims to evaluate the ability of a model to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 261-270 of 399 papers

Title | Status | Hype
Test-Time Self-Adaptive Small Language Models for Question Answering | Code | 0
Motif-Based Prompt Learning for Universal Cross-Domain Recommendation | | 0
Learning to Adapt SAM for Segmenting Cross-domain Point Clouds | | 0
Dobby: A Conversational Service Robot Driven by GPT-4 | | 0
Profit: Benchmarking Personalization and Robustness Trade-off in Federated Prompt Tuning | | 0
Key Factors Affecting European Reactions to AI in European Full and Flawed Democracies | | 0
Assessing Look-Ahead Bias in Stock Return Predictions Generated By GPT Sentiment Analysis | | 0
Leveraging Large Language Models for Automated Dialogue Analysis | Code | 0
Learning to Model the World with Language | | 0
Multilingual Tourist Assistance using ChatGPT: Comparing Capabilities in Hindi, Telugu, and Kannada | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 94.3 | | Unverified
2 | Gopher-280B (few-shot, k=5) | Accuracy | 93.9 | | Unverified
3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 85.7 | | Unverified
4 | Gopher-280B (few-shot, k=5) | Accuracy | 84.8 | | Unverified
5 | Gopher-280B (few-shot, k=5) | Accuracy | 84.2 | | Unverified
6 | Gopher-280B (few-shot, k=5) | Accuracy | 84.1 | | Unverified
7 | Gopher-280B (few-shot, k=5) | Accuracy | 83.9 | | Unverified
8 | Gopher-280B (few-shot, k=5) | Accuracy | 83.3 | | Unverified
9 | Gopher-280B (few-shot, k=5) | Accuracy | 81.8 | | Unverified
10 | Gopher-280B (few-shot, k=5) | Accuracy | 81 | | Unverified