SOTAVerified

Common Sense Reasoning

Common sense reasoning tasks are designed to require more than pattern recognition: the model must draw on "common sense" or world knowledge to make inferences.

Papers

Showing 1-10 of 939 papers

Title | Status | Hype
Comparing Apples to Oranges: A Dataset & Analysis of LLM Humour Understanding from Traditional Puns to Topical Jokes | - | 0
LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization | Code | 0
CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation | - | 0
EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits | - | 0
Prime the search: Using large language models for guiding geometric task and motion planning by warm-starting tree search | Code | 0
AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment | Code | 0
ATLAS: Learning to Optimally Memorize the Context at Test Time | - | 0
Spatial Knowledge Graph-Guided Multimodal Synthesis | - | 0
CaseEdit: Enhancing Localized Commonsense Reasoning via Null-Space Constrained Knowledge Editing in Small Parameter Language Models | - | 0
Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ST-MoE-32B 269B (fine-tuned) | Accuracy | 95.2 | - | Unverified
2 | LLaMA 3 8B+MoSLoRA (fine-tuned) | Accuracy | 90.5 | - | Unverified
3 | PaLM 2-L (1-shot) | Accuracy | 89.7 | - | Unverified
4 | PaLM 2-M (1-shot) | Accuracy | 88 | - | Unverified
5 | LLaMA-3 8B + MixLoRA | Accuracy | 86.5 | - | Unverified
6 | Camelidae-8×34B | Accuracy | 86.2 | - | Unverified
7 | PaLM 2-S (1-shot) | Accuracy | 85.6 | - | Unverified
8 | LLaMA 65B + CFG (0-shot) | Accuracy | 84.2 | - | Unverified
9 | GAL 120B (0-shot) | Accuracy | 83.8 | - | Unverified
10 | LLaMA-2 13B + MixLoRA | Accuracy | 83.5 | - | Unverified