| Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet | Jan 28, 2021 | Image Classification | Code Available | 2 |
| The Pile: An 800GB Dataset of Diverse Text for Language Modeling | Dec 31, 2020 | Diversity, Language Modeling | Code Available | 2 |
| Automatically Identifying Words That Can Serve as Labels for Few-Shot Text Classification | Oct 26, 2020 | Few-Shot Text Classification, General Classification | Code Available | 2 |
| AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients | Oct 15, 2020 | Image Classification | Code Available | 2 |
| Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity | Jul 29, 2020 | Language Modeling, Language Modelling | Code Available | 2 |
| Simplifying Paragraph-level Question Generation via Transformer Language Models | May 3, 2020 | Language Modeling, Language Modelling | Code Available | 2 |
| MPNet: Masked and Permuted Pre-training for Language Understanding | Apr 20, 2020 | Language Modeling, Language Modelling | Code Available | 2 |
| BAE: BERT-based Adversarial Examples for Text Classification | Apr 4, 2020 | Adversarial Attack, Adversarial Text | Code Available | 2 |
| Self-Supervised Log Parsing | Mar 17, 2020 | Anomaly Detection, Fault Detection | Code Available | 2 |
| CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model | Mar 3, 2020 | 8k, Language Modeling | Code Available | 2 |