| Structured Pruning of Large Language Models | Oct 10, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Oct 2, 2019 | Hate Speech Detection, Knowledge Distillation | Code Available | 1 |
| UNITER: UNiversal Image-TExt Representation Learning | Sep 25, 2019 | Image-text matching, Image-text Retrieval | Code Available | 1 |
| Reducing Transformer Depth on Demand with Structured Dropout | Sep 25, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| A Critical Analysis of Biased Parsers in Unsupervised Parsing | Sep 20, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | Sep 18, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Ouroboros: On Accelerating Training of Transformer-Based Language Models | Sep 14, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| CTRL: A Conditional Transformer Language Model for Controllable Generation | Sep 11, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| MultiFiT: Efficient Multi-lingual Language Model Fine-tuning | Sep 10, 2019 | Cross-Lingual Document Classification, Document Classification | Code Available | 1 |
| Improved Hierarchical Patient Classification with Language Model Pretraining over Clinical Notes | Sep 6, 2019 | General Classification, Language Modeling | Code Available | 1 |