| TensorCoder: Dimension-Wise Attention via Tensor Representation for Natural Language Modeling | Jul 28, 2020 | Language Modeling | —Unverified | 0 |
| I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths | Jun 18, 2020 | Language Modeling | Code Available | 0 |
| GMAT: Global Memory Augmentation for Transformers | Jun 5, 2020 | Language Modeling | Code Available | 0 |
| Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers | Jun 5, 2020 | Language Modeling | —Unverified | 0 |
| Segatron: Segment-aware Transformer for Language Modeling and Understanding | Jun 2, 2020 | Language Modeling | —Unverified | 0 |
| Position Masking for Language Models | Jun 2, 2020 | Language Modeling | —Unverified | 0 |
| Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning | Apr 29, 2020 | All, HellaSwag | —Unverified | 0 |
| UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection | Apr 23, 2020 | Domain Adaptation, General Classification | —Unverified | 0 |
| XGPT: Cross-modal Generative Pre-Training for Image Captioning | Mar 3, 2020 | Data Augmentation, Denoising | —Unverified | 0 |
| ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data | Jan 22, 2020 | Image Retrieval, Image-text matching | —Unverified | 0 |