Common Sense Reasoning
Common sense reasoning tasks are designed so that a model cannot solve them by pattern recognition alone. To make the required inferences, the model must draw on "common sense" or world knowledge.
Datasets: WinoGrande, arc_challenge, arc_easy, ReCoRD, CommonsenseQA, PARus, RuCoS, RWSD, BIG-bench (Causal Judgment), BIG-bench (Date Understanding), BIG-bench (Disambiguation QA), BIG-bench (Sports Understanding)
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PaLM 2 (few-shot, k=3, Direct) | Accuracy | 78.8 | — | Unverified |
| 2 | PaLM 2 (few-shot, k=3, CoT) | Accuracy | 77.6 | — | Unverified |
| 3 | PaLM 540B (few-shot, k=3) | Accuracy | 60.8 | — | Unverified |
| 4 | Chinchilla-70B (few-shot, k=5) | Accuracy | 54.7 | — | Unverified |
| 5 | Gopher-280B (few-shot, k=5) | Accuracy | 45.5 | — | Unverified |
| 6 | GPT-NeoX 20B (few-shot, k=3) | Accuracy | 40.8 | — | Unverified |
| 7 | OPT 66B (few-shot, k=3) | Accuracy | 40.4 | — | Unverified |
| 8 | BLOOM 176B (few-shot, k=3) | Accuracy | 40.4 | — | Unverified |
| 9 | BloombergGPT 50B (few-shot, k=3) | Accuracy | 34.0 | — | Unverified |
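
The Accuracy column above is a percentage. This page does not say how each entry was scored, so the sketch below is only an assumption: it treats accuracy as the exact-match rate between predicted and gold answers. The function name `accuracy` and the sample answers are hypothetical and do not come from any listed paper.

```python
def accuracy(predictions, gold):
    """Return the percentage of predictions that exactly match the gold labels.

    Assumption: exact-match scoring; individual benchmarks may normalize
    answers or score by log-likelihood instead.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical model outputs for five multiple-choice items.
preds = ["A", "C", "B", "B", "D"]
gold = ["A", "C", "D", "B", "A"]
score = accuracy(preds, gold)
print(f"Accuracy: {score:.1f}")
```

In a few-shot setting (the k=3 or k=5 in the model names), the k worked examples only appear in the prompt. They are not scored, so they would not be included in `predictions` or `gold`.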