SOTAVerified

General Knowledge

This task evaluates a model's ability to answer general-knowledge questions.

Source: BIG-bench

Papers

Showing 201-250 of 399 papers

Title | Status | Hype
Dobby: A Conversational Service Robot Driven by GPT-4 | | 0
Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models | | 0
Domain Specific, Semi-Supervised Transfer Learning for Medical Imaging | | 0
Dominance-based Rough Set Approach, basic ideas and main trends | | 0
Efficient illumination angle self-calibration in Fourier ptychography | | 0
Enabling Autonomic Microservice Management through Self-Learning Agents | | 0
Enhance Graph Alignment for Large Language Models | | 0
Enhancing Action Recognition from Low-Quality Skeleton Data via Part-Level Knowledge Distillation | | 0
Enhancing Target-unspecific Tasks through a Features Matrix | | 0
Evaluating Company-specific Biases in Financial Sentiment Analysis using Large Language Models | | 0
Evaluating Consistency and Reasoning Capabilities of Large Language Models | | 0
Evaluating Polish linguistic and cultural competency in large language models | | 0
Evident: a Development Methodology and a Knowledge Base Topology for Data Mining, Machine Learning and General Knowledge Management | | 0
Explainable Hierarchical Imitation Learning for Robotic Drink Pouring | | 0
Exploit CAM by itself: Complementary Learning System for Weakly Supervised Semantic Segmentation | | 0
Explicit Utilization of General Knowledge in Machine Reading Comprehension | | 0
Exploring Safety-Utility Trade-Offs in Personalized Language Models | | 0
Exploring Zero-Shot Anomaly Detection with CLIP in Medical Imaging: Are We There Yet? | | 0
Extending TWIG: Zero-Shot Predictive Hyperparameter Selection for KGEs based on Graph Structure | | 0
Extracting Unlearned Information from LLMs with Activation Steering | | 0
Fast constrained sampling in pre-trained diffusion models | | 0
Few Exemplar-Based General Medical Image Segmentation via Domain-Aware Selective Adaptation | | 0
FlexiCrackNet: A Flexible Pipeline for Enhanced Crack Segmentation with General Features Transfered from SAM | | 0
SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing Agents | | 0
GALA: Generating Animatable Layered Assets from a Single Scan | | 0
Generating Diverse Q&A Benchmarks for RAG Evaluation with DataMorgana | | 0
Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers | | 0
Generative Meta-Learning for Zero-Shot Relation Triplet Extraction | | 0
Generative Retrieval and Alignment Model: A New Paradigm for E-commerce Retrieval | | 0
GeoEdit: Geometric Knowledge Editing for Large Language Models | | 0
GFDC: Graph Function Dependence for Logically Consistent Dialogue Response Beyond Persona Data | | 0
GOT4Rec: Graph of Thoughts for Sequential Recommendation | | 0
GRL-Prompt: Towards Knowledge Graph based Prompt Optimization via Reinforcement Learning | | 0
Hierarchical Inductive Transfer for Continual Dialogue Learning | | 0
Hierarchical Inductive Transfer for Continual Dialogue Learning | | 0
How to Complete Domain Tuning while Keeping General Ability in LLM: Adaptive Layer-wise and Element-wise Regularization | | 0
Igea: a Decoder-Only Language Model for Biomedical Text Generation in Italian | | 0
Image Captioning and Visual Question Answering Based on Attributes and External Knowledge | | 0
Improving Multi-label Emotion Classification by Integrating both General and Domain-specific Knowledge | | 0
INCPrompt: Task-Aware incremental Prompting for Rehearsal-Free Class-incremental Learning | | 0
Inductive Graph Alignment Prompt: Bridging the Gap between Graph Pre-training and Inductive Fine-tuning From Spectral Perspective | | 0
A new algorithm for Subgroup Set Discovery based on Information Gain | | 0
Insect-Foundation: A Foundation Model and Large Multimodal Dataset for Vision-Language Insect Understanding | | 0
Integration of Imitation Learning using GAIL and Reinforcement Learning using Task-achievement Rewards via Probabilistic Graphical Model | | 0
Intelligent Conversational Bot for Massive Online Open Courses (MOOCs) | | 0
Intelligent Design 4.0: Paradigm Evolution Toward the Agentic AI Era | | 0
Investigating Forgetting in Pre-Trained Representations Through Continual Learning | | 0
Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI | | 0
Joint Embedding Learning of Educational Knowledge Graphs | | 0
Juru: Legal Brazilian Large Language Model from Reputable Sources | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 94.3 | | Unverified
2 | Gopher-280B (few-shot, k=5) | Accuracy | 93.9 | | Unverified
3 | Chinchilla-70B (few-shot, k=5) | Accuracy | 85.7 | | Unverified
4 | Gopher-280B (few-shot, k=5) | Accuracy | 84.8 | | Unverified
5 | Gopher-280B (few-shot, k=5) | Accuracy | 84.2 | | Unverified
6 | Gopher-280B (few-shot, k=5) | Accuracy | 84.1 | | Unverified
7 | Gopher-280B (few-shot, k=5) | Accuracy | 83.9 | | Unverified
8 | Gopher-280B (few-shot, k=5) | Accuracy | 83.3 | | Unverified
9 | Gopher-280B (few-shot, k=5) | Accuracy | 81.8 | | Unverified
10 | Gopher-280B (few-shot, k=5) | Accuracy | 81 | | Unverified