SOTAVerified

General Knowledge

This task aims to evaluate the ability of a model to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 376–399 of 399 papers

| Title | Status | Hype |
| --- | --- | --- |
| HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models | Code | 0 |
| Can ChatGPT Enable ITS? The Case of Mixed Traffic Control via Reinforcement Learning | Code | 0 |
| How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities | Code | 0 |
| Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Relational Knowledge for Commonsense Machine Comprehension | Code | 0 |
| WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models | Code | 0 |
| PROL: Rehearsal Free Continual Learning in Streaming Data via Prompt Online Learning | Code | 0 |
| G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks | Code | 0 |
| Visual Question Answering: A Survey of Methods and Datasets | Code | 0 |
| From Knowledge to Reasoning: Evaluating LLMs for Ionic Liquids Research in Chemical and Biological Engineering | Code | 0 |
| Quantized Prompt for Efficient Generalization of Vision-Language Models | Code | 0 |
| World Knowledge in Multiple Choice Reading Comprehension | Code | 0 |
| Foundation X: Integrating Classification, Localization, and Segmentation through Lock-Release Pretraining Strategy for Chest X-ray Analysis | Code | 0 |
| Fed-CO2: Cooperation of Online and Offline Models for Severe Data Heterogeneity in Federated Learning | Code | 0 |
| Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model | Code | 0 |
| Exploring Recommendation Capabilities of GPT-4V(ision): A Preliminary Case Study | Code | 0 |
| REFinD: Relation Extraction Financial Dataset | Code | 0 |
| Disentangling Fine-Tuning from Pre-Training in Visual Captioning with Hybrid Markov Logic | Code | 0 |
| Exploiting Adapters for Cross-lingual Low-resource Speech Recognition | Code | 0 |
| RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content | Code | 0 |
| Towards Knowledge-Augmented Visual Question Answering | Code | 0 |
| ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination | Code | 0 |
| Evaluating Prompt-based Question Answering for Object Prediction in the Open Research Knowledge Graph | Code | 0 |
| DAGPrompT: Pushing the Limits of Graph Prompting with a Distribution-aware Graph Prompt Tuning Approach | Code | 0 |
| Survey on Abstractive Text Summarization: Dataset, Models, and Metrics | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 94.3 | — | Unverified |
| 2 | Gopher-280B (few-shot, k=5) | Accuracy | 93.9 | — | Unverified |
| 3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 85.7 | — | Unverified |
| 4 | Gopher-280B (few-shot, k=5) | Accuracy | 84.8 | — | Unverified |
| 5 | Gopher-280B (few-shot, k=5) | Accuracy | 84.2 | — | Unverified |
| 6 | Gopher-280B (few-shot, k=5) | Accuracy | 84.1 | — | Unverified |
| 7 | Gopher-280B (few-shot, k=5) | Accuracy | 83.9 | — | Unverified |
| 8 | Gopher-280B (few-shot, k=5) | Accuracy | 83.3 | — | Unverified |
| 9 | Gopher-280B (few-shot, k=5) | Accuracy | 81.8 | — | Unverified |
| 10 | Gopher-280B (few-shot, k=5) | Accuracy | 81 | — | Unverified |