| Paper | Date | Tasks | Code | # |
| --- | --- | --- | --- | --- |
| LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Mar 26, 2024 | GPU, GSM8K | Code Available | 9 |
| Who's Harry Potter? Approximate Unlearning in LLMs | Oct 3, 2023 | ARC, GPU | Unverified | 0 |
| Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations | Nov 14, 2022 | Winogrande | Code Available | 0 |
| On Curriculum Learning for Commonsense Reasoning | Jul 1, 2022 | HellaSwag, Learning-To-Rank | Code Available | 0 |
| A Warm Start and a Clean Crawled Corpus - A Recipe for Good Language Models | Jun 1, 2022 | Constituency Parsing, Grammatical Error Detection | Unverified | 0 |
| ST-MoE: Designing Stable and Transferable Sparse Expert Models | Feb 17, 2022 | ARC, Common Sense Reasoning | Code Available | 3 |
| An Application of Pseudo-Log-Likelihoods to Natural Language Scoring | Jan 23, 2022 | Common Sense Reasoning, GPU | Unverified | 0 |
| A Warm Start and a Clean Crawled Corpus -- A Recipe for Good Language Models | Jan 14, 2022 | Constituency Parsing, Grammatical Error Detection | Unverified | 0 |
| Few-Shot Out-of-Domain Transfer Learning of Natural Language Explanations in a Label-Abundant Setup | Dec 12, 2021 | Natural Language Inference, Transfer Learning | Code Available | 0 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms | Code Available | 2 |