| LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Mar 26, 2024 | GPU, GSM8K | Code Available | 9 | 5 |
| LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding | Apr 25, 2024 | GSM8K, HellaSwag | Code Available | 3 | 5 |
| ST-MoE: Designing Stable and Transferable Sparse Expert Models | Feb 17, 2022 | ARC, Common Sense Reasoning | Code Available | 3 | 5 |
| Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra, Anachronisms | Code Available | 2 | 5 |
| Bridging the Gap: Enhancing LLM Performance for Low-Resource African Languages with New Benchmarks, Fine-Tuning, and Cultural Adjustments | Dec 16, 2024 | Clinical Knowledge, College Medicine | Code Available | 1 | 5 |
| UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark | Mar 24, 2021 | Common Sense Reasoning, HellaSwag | Code Available | 1 | 5 |
| WinoGrande: An Adversarial Winograd Schema Challenge at Scale | Jul 24, 2019 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 | 5 |
| Generative Data Augmentation for Commonsense Reasoning | Apr 24, 2020 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 | 5 |
| On Curriculum Learning for Commonsense Reasoning | Jul 1, 2022 | HellaSwag, Learning-To-Rank | Code Available | 0 | 5 |
| Few-Shot Out-of-Domain Transfer Learning of Natural Language Explanations in a Label-Abundant Setup | Dec 12, 2021 | Natural Language Inference, Transfer Learning | Code Available | 0 | 5 |