| SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding | Oct 5, 2020 | Language Modeling | Code Available | 1 |
| XDA: Accurate, Robust Disassembly with Transfer Learning | Oct 2, 2020 | Language Modeling | Code Available | 1 |
| GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing | Sep 29, 2020 | Inductive Bias, Language Modeling | Code Available | 1 |
| VECO: Variable Encoder-decoder Pre-training for Cross-lingual Understanding and Generation | Sep 28, 2020 | Decoder, Language Modeling | Unverified | 0 |
| Deep Transformers with Latent Depth | Sep 28, 2020 | Language Modeling | Code Available | 0 |
| GraphCodeBERT: Pre-training Code Representations with Data Flow | Sep 17, 2020 | Clone Detection, Code Completion | Unverified | 0 |
| Intermediate Training of BERT for Product Matching | Aug 31, 2020 | Entity Resolution, Language Modeling | Code Available | 1 |
| Learning Visual Representations with Caption Annotations | Aug 4, 2020 | Image Captioning, Language Modeling | Unverified | 0 |
| TensorCoder: Dimension-Wise Attention via Tensor Representation for Natural Language Modeling | Jul 28, 2020 | Language Modeling | Unverified | 0 |
| The Lottery Ticket Hypothesis for Pre-trained BERT Networks | Jul 23, 2020 | Language Modeling | Code Available | 1 |
| Language-agnostic BERT Sentence Embedding | Jul 3, 2020 | Language Modeling | Code Available | 1 |
| Pre-training via Paraphrasing | Jun 26, 2020 | Document Summarization, Document Translation | Code Available | 1 |
| I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths | Jun 18, 2020 | Language Modeling | Code Available | 0 |
| MC-BERT: Efficient Language Pre-Training via a Meta Controller | Jun 10, 2020 | Binary Classification, Cloze Test | Code Available | 1 |
| GMAT: Global Memory Augmentation for Transformers | Jun 5, 2020 | Language Modeling | Code Available | 0 |
| Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers | Jun 5, 2020 | Language Modeling | Unverified | 0 |
| Segatron: Segment-aware Transformer for Language Modeling and Understanding | Jun 2, 2020 | Language Modeling | Unverified | 0 |
| Position Masking for Language Models | Jun 2, 2020 | Language Modeling | Unverified | 0 |
| Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP | May 29, 2020 | Dependency Parsing, Language Modeling | Code Available | 1 |
| HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | May 1, 2020 | Language Modeling | Code Available | 1 |
| Segatron: Segment-Aware Transformer for Language Modeling and Understanding | Apr 30, 2020 | Language Modeling | Code Available | 1 |
| Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning | Apr 29, 2020 | All, HellaSwag | Unverified | 0 |
| UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection | Apr 23, 2020 | Domain Adaptation, General Classification | Unverified | 0 |
| Train No Evil: Selective Masking for Task-Guided Pre-Training | Apr 21, 2020 | Language Modeling | Code Available | 1 |
| MPNet: Masked and Permuted Pre-training for Language Understanding | Apr 20, 2020 | Language Modeling | Code Available | 2 |