| Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzer | Jan 13, 2020 | Language Modeling | Code Available | 1 |
| Improving Transformer Optimization Through Better Initialization | Jan 1, 2020 | Decoder, Language Modeling | Code Available | 1 |
| BERTje: A Dutch BERT Model | Dec 19, 2019 | Language Modeling | Code Available | 1 |
| Open Domain Web Keyphrase Extraction Beyond Language Modeling | Nov 6, 2019 | Keyphrase Extraction, Language Modeling | Code Available | 1 |
| Unsupervised Cross-lingual Representation Learning at Scale | Nov 5, 2019 | Cross-Lingual Transfer, Language Modeling | Code Available | 1 |
| Generalization through Memorization: Nearest Neighbor Language Models | Nov 1, 2019 | Domain Adaptation, Language Modeling | Code Available | 1 |
| Masked Language Model Scoring | Oct 31, 2019 | Attribute, Domain Adaptation | Code Available | 1 |
| Multi-Stage Document Ranking with BERT | Oct 31, 2019 | Document Ranking, Language Modeling | Code Available | 1 |
| Stabilizing Transformers for Reinforcement Learning | Oct 13, 2019 | General Reinforcement Learning, Language Modeling | Code Available | 1 |
| Structured Pruning of Large Language Models | Oct 10, 2019 | Language Modeling | Code Available | 1 |
| DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Oct 2, 2019 | Hate Speech Detection, Knowledge Distillation | Code Available | 1 |
| UNITER: UNiversal Image-TExt Representation Learning | Sep 25, 2019 | Image-text matching, Image-text Retrieval | Code Available | 1 |
| Reducing Transformer Depth on Demand with Structured Dropout | Sep 25, 2019 | Language Modeling | Code Available | 1 |
| A Critical Analysis of Biased Parsers in Unsupervised Parsing | Sep 20, 2019 | Language Modeling | Code Available | 1 |
| Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | Sep 18, 2019 | Automatic Speech Recognition (ASR) | Code Available | 1 |
| Ouroboros: On Accelerating Training of Transformer-Based Language Models | Sep 14, 2019 | Language Modeling | Code Available | 1 |
| CTRL: A Conditional Transformer Language Model for Controllable Generation | Sep 11, 2019 | Language Modeling | Code Available | 1 |
| MultiFiT: Efficient Multi-lingual Language Model Fine-tuning | Sep 10, 2019 | Cross-Lingual Document Classification, Document Classification | Code Available | 1 |
| Improved Hierarchical Patient Classification with Language Model Pretraining over Clinical Notes | Sep 6, 2019 | General Classification, Language Modeling | Code Available | 1 |
| The Woman Worked as a Babysitter: On Biases in Language Generation | Sep 3, 2019 | Language Modeling | Code Available | 1 |
| Deep Equilibrium Models | Sep 3, 2019 | Language Modeling | Code Available | 1 |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Aug 20, 2019 | Language Modeling | Code Available | 1 |
| VisualBERT: A Simple and Performant Baseline for Vision and Language | Aug 9, 2019 | Language Modeling | Code Available | 1 |
| On the Variance of the Adaptive Learning Rate and Beyond | Aug 8, 2019 | Image Classification | Code Available | 1 |