| HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | May 1, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Segatron: Segment-Aware Transformer for Language Modeling and Understanding | Apr 30, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Train No Evil: Selective Masking for Task-Guided Pre-Training | Apr 21, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue | Apr 15, 2020 | Dialogue State TrackingIntent Detection | CodeCode Available | 1 |
| ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | Mar 23, 2020 | GPULanguage Modeling | CodeCode Available | 1 |
| Talking-Heads Attention | Mar 5, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| REALM: Retrieval-Augmented Language Model Pre-Training | Feb 10, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| UNITER: UNiversal Image-TExt Representation Learning | Sep 25, 2019 | Image-text matchingImage-text Retrieval | CodeCode Available | 1 |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Aug 20, 2019 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Mask-Predict: Parallel Decoding of Conditional Masked Language Models | Apr 19, 2019 | Language ModelingLanguage Modelling | CodeCode Available | 1 |