| Language Models are Few-Shot Learners | May 28, 2020 | answerability predictionArticles | CodeCode Available | 3 | 5 |
| PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto DebuggingCode Generation | CodeCode Available | 2 | 5 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Jan 5, 2024 | Arithmetic ReasoningCode Generation | CodeCode Available | 2 | 5 |
| DeBERTa: Decoding-enhanced BERT with Disentangled Attention | Jun 5, 2020 | Common Sense ReasoningCoreference Resolution | CodeCode Available | 2 | 5 |
| Crosslingual Generalization through Multitask Finetuning | Nov 3, 2022 | Coreference ResolutionCross-Lingual Transfer | CodeCode Available | 2 | 5 |
| Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning | Oct 10, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 2 | 5 |
| LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | Apr 27, 2023 | Common Sense ReasoningCoreference Resolution | CodeCode Available | 2 | 5 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract AlgebraAnachronisms | CodeCode Available | 2 | 5 |
| The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | May 23, 2023 | Common Sense ReasoningCommon Sense Reasoning (Zero-Shot) | CodeCode Available | 2 | 5 |
| HONEST: Measuring Hurtful Sentence Completion in Language Models | Jun 1, 2021 | Hate Speech DetectionHurtful Sentence Completion | CodeCode Available | 1 | 5 |