| Parameterized Synthetic Text Generation with SimpleStories | Apr 12, 2025 | Diversity, Language Modeling | Code Available | 1 |
| Gated Linear Attention Transformers with Hardware-Efficient Training | Dec 11, 2023 | 2k, Language Modeling | Code Available | 1 |
| Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels | Mar 23, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Gemstones: A Model Suite for Multi-Faceted Scaling Laws | Feb 7, 2025 | Experimental Design, Language Modeling | Code Available | 1 |
| Exploring Stochastic Autoregressive Image Modeling for Visual Representation | Dec 3, 2022 | Decoder, Language Modeling | Code Available | 1 |
| Generalization through Memorization: Nearest Neighbor Language Models | Nov 1, 2019 | Domain Adaptation, Language Modeling | Code Available | 1 |
| Generated Knowledge Prompting for Commonsense Reasoning | Oct 15, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Exploring the Limits of Language Modeling | Feb 7, 2016 | Language Modeling, Language Modelling | Code Available | 1 |
| Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations | Jul 10, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Exploring Quantization for Efficient Pre-Training of Transformer Language Models | Jul 16, 2024 | Language Modeling, Language Modelling | Code Available | 1 |