| Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzer | Jan 13, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Improving Transformer Optimization Through Better Initialization | Jan 1, 2020 | Decoder, Language Modeling | Code Available | 1 |
| BERTje: A Dutch BERT Model | Dec 19, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Open Domain Web Keyphrase Extraction Beyond Language Modeling | Nov 6, 2019 | Keyphrase Extraction, Language Modeling | Code Available | 1 |
| Unsupervised Cross-lingual Representation Learning at Scale | Nov 5, 2019 | Cross-Lingual Transfer, Language Modeling | Code Available | 1 |
| Generalization through Memorization: Nearest Neighbor Language Models | Nov 1, 2019 | Domain Adaptation, Language Modeling | Code Available | 1 |
| Masked Language Model Scoring | Oct 31, 2019 | Attribute, Domain Adaptation | Code Available | 1 |
| Multi-Stage Document Ranking with BERT | Oct 31, 2019 | Document Ranking, Language Modeling | Code Available | 1 |
| Stabilizing Transformers for Reinforcement Learning | Oct 13, 2019 | General Reinforcement Learning, Language Modeling | Code Available | 1 |
| Structured Pruning of Large Language Models | Oct 10, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | Oct 2, 2019 | Hate Speech Detection, Knowledge Distillation | Code Available | 1 |
| Reducing Transformer Depth on Demand with Structured Dropout | Sep 25, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| UNITER: UNiversal Image-TExt Representation Learning | Sep 25, 2019 | Image-text matching, Image-text Retrieval | Code Available | 1 |
| A Critical Analysis of Biased Parsers in Unsupervised Parsing | Sep 20, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | Sep 18, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Ouroboros: On Accelerating Training of Transformer-Based Language Models | Sep 14, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| CTRL: A Conditional Transformer Language Model for Controllable Generation | Sep 11, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| MultiFiT: Efficient Multi-lingual Language Model Fine-tuning | Sep 10, 2019 | Cross-Lingual Document Classification, Document Classification | Code Available | 1 |
| Improved Hierarchical Patient Classification with Language Model Pretraining over Clinical Notes | Sep 6, 2019 | General Classification, Language Modeling | Code Available | 1 |
| The Woman Worked as a Babysitter: On Biases in Language Generation | Sep 3, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Deep Equilibrium Models | Sep 3, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Aug 20, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| VisualBERT: A Simple and Performant Baseline for Vision and Language | Aug 9, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| On the Variance of the Adaptive Learning Rate and Beyond | Aug 8, 2019 | image-classification, Image Classification | Code Available | 1 |
| RoBERTa: A Robustly Optimized BERT Pretraining Approach | Jul 26, 2019 | Common Sense Reasoning, Document Image Classification | Code Available | 1 |
| ELI5: Long Form Question Answering | Jul 22, 2019 | Form, Language Modeling | Code Available | 1 |
| Hello, It's GPT-2 -- How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems | Jul 12, 2019 | Decision Making, Language Modeling | Code Available | 1 |
| Evaluating Language Model Finetuning Techniques for Low-resource Languages | Jun 30, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| A Tensorized Transformer for Language Modeling | Jun 24, 2019 | Decoder, Language Modeling | Code Available | 1 |
| XLNet: Generalized Autoregressive Pretraining for Language Understanding | Jun 19, 2019 | Audio Question Answering, Chinese Reading Comprehension | Code Available | 1 |
| How multilingual is Multilingual BERT? | Jun 4, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation | Jun 2, 2019 | Common Sense Reasoning, Language Modeling | Code Available | 1 |
| Adapting Text Embeddings for Causal Inference | May 29, 2019 | Causal Identification, Causal Inference | Code Available | 1 |
| Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks | May 27, 2019 | General Classification, image-classification | Code Available | 1 |
| Discrete Flows: Invertible Generative Models of Discrete Data | May 24, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Adaptive Attention Span in Transformers | May 19, 2019 | 8k, Language Modeling | Code Available | 1 |
| A Surprisingly Robust Trick for Winograd Schema Challenge | May 15, 2019 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 |
| How to Fine-Tune BERT for Text Classification? | May 14, 2019 | General Classification, Language Modeling | Code Available | 1 |
| RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation | May 8, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| The Curious Case of Neural Text Degeneration | Apr 22, 2019 | Diversity, Language Modeling | Code Available | 1 |
| Mask-Predict: Parallel Decoding of Conditional Masked Language Models | Apr 19, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | Apr 18, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| fairseq: A Fast, Extensible Toolkit for Sequence Modeling | Apr 1, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| SciBERT: A Pretrained Language Model for Scientific Text | Mar 26, 2019 | Citation Intent Classification, Dependency Parsing | Code Available | 1 |
| A Fully Differentiable Beam Search Decoder | Feb 16, 2019 | Decoder, Language Modeling | Code Available | 1 |
| Language Models are Unsupervised Multitask Learners | Feb 14, 2019 | Common Sense Reasoning, Coreference Resolution | Code Available | 1 |
| Pay Less Attention with Lightweight and Dynamic Convolutions | Jan 29, 2019 | Abstractive Text Summarization, Language Modeling | Code Available | 1 |
| Passage Re-ranking with BERT | Jan 13, 2019 | Language Modeling, Passage Re-Ranking | Code Available | 1 |
| Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | Jan 9, 2019 | Articles, Language Modeling | Code Available | 1 |