| MSA Transformer | Feb 13, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification | Feb 1, 2021 | Language Identification, Language Modeling | Code Available | 0 |
| MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding | Jan 23, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| CDLM: Cross-Document Language Modeling | Jan 2, 2021 | Citation Recommendation, Coreference Resolution | Code Available | 1 |
| Universal Sentence Representations Learning with Conditional Masked Language Model | Jan 1, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding | Dec 31, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Universal Sentence Representation Learning with Conditional Masked Language Model | Dec 28, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| RealFormer: Transformer Likes Residual Attention | Dec 21, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| TAP: Text-Aware Pre-training for Text-VQA and Text-Caption | Dec 8, 2020 | Caption Generation, Language Modeling | Code Available | 1 |
| Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks | Dec 5, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| XHate-999: Analyzing and Detecting Abusive Language Across Domains and Languages | Dec 1, 2020 | Abusive Language, Disentanglement | Unverified | 0 |
| StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling | Dec 1, 2020 | Constituency Parsing, Dependency Parsing | Code Available | 1 |
| Profile Prediction: An Alignment-Based Pre-Training Task for Protein Sequence Models | Dec 1, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| Self-Supervised Relationship Probing | Dec 1, 2020 | Contrastive Learning, Language Modeling | Unverified | 0 |
| Self-Supervised learning with cross-modal transformers for emotion recognition | Nov 20, 2020 | Emotion Recognition, Language Modeling | Unverified | 0 |
| A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus | Nov 18, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| POSTECH-ETRI’s Submission to the WMT2020 APE Shared Task: Automatic Post-Editing with Cross-lingual Language Model | Nov 1, 2020 | Automatic Post-Editing, Language Modeling | Unverified | 0 |
| Controlling the Imprint of Passivization and Negation in Contextualized Representations | Nov 1, 2020 | Language Modeling, Language Modelling | Code Available | 0 |
| Effective Decoder Masking for Transformer Based End-to-End Speech Recognition | Oct 27, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| DICT-MLM: Improved Multilingual Pre-Training using Bilingual Dictionaries | Oct 23, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken Language Understanding | Oct 23, 2020 | cross-modal alignment, Language Modeling | Unverified | 0 |
| ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding | Oct 23, 2020 | Language Modeling, Language Modelling | Code Available | 3 |
| Cold-start Active Learning through Self-supervised Language Modeling | Oct 19, 2020 | Active Learning, Classification | Code Available | 1 |
| Corruption Is Not All Bad: Incorporating Discourse Structure into Pre-training via Corruption for Essay Scoring | Oct 13, 2020 | All, Automated Essay Scoring | Unverified | 0 |
| Cross-Thought for Sentence Encoder Pre-training | Oct 7, 2020 | Information Retrieval, Language Modeling | Code Available | 1 |
| SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding | Oct 5, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| XDA: Accurate, Robust Disassembly with Transfer Learning | Oct 2, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing | Sep 29, 2020 | Inductive Bias, Language Modeling | Code Available | 1 |
| VECO: Variable Encoder-decoder Pre-training for Cross-lingual Understanding and Generation | Sep 28, 2020 | Decoder, Language Modeling | Unverified | 0 |
| Deep Transformers with Latent Depth | Sep 28, 2020 | Language Modeling, Language Modelling | Code Available | 0 |
| GraphCodeBERT: Pre-training Code Representations with Data Flow | Sep 17, 2020 | Clone Detection, Code Completion | Unverified | 0 |
| Intermediate Training of BERT for Product Matching | Aug 31, 2020 | Entity Resolution, Language Modeling | Code Available | 1 |
| Learning Visual Representations with Caption Annotations | Aug 4, 2020 | Image Captioning, Language Modeling | Unverified | 0 |
| TensorCoder: Dimension-Wise Attention via Tensor Representation for Natural Language Modeling | Jul 28, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| The Lottery Ticket Hypothesis for Pre-trained BERT Networks | Jul 23, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Language-agnostic BERT Sentence Embedding | Jul 3, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Pre-training via Paraphrasing | Jun 26, 2020 | Document Summarization, Document Translation | Code Available | 1 |
| I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths | Jun 18, 2020 | Language Modeling, Language Modelling | Code Available | 0 |
| MC-BERT: Efficient Language Pre-Training via a Meta Controller | Jun 10, 2020 | Binary Classification, Cloze Test | Code Available | 1 |
| GMAT: Global Memory Augmentation for Transformers | Jun 5, 2020 | Language Modeling, Language Modelling | Code Available | 0 |
| Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers | Jun 5, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| Segatron: Segment-aware Transformer for Language Modeling and Understanding | Jun 2, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| Position Masking for Language Models | Jun 2, 2020 | Language Modeling, Language Modelling | Unverified | 0 |
| Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP | May 29, 2020 | Dependency Parsing, Language Modeling | Code Available | 1 |
| HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | May 1, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Segatron: Segment-Aware Transformer for Language Modeling and Understanding | Apr 30, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning | Apr 29, 2020 | All, HellaSwag | Unverified | 0 |
| UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection | Apr 23, 2020 | Domain Adaptation, General Classification | Unverified | 0 |
| Train No Evil: Selective Masking for Task-Guided Pre-Training | Apr 21, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| MPNet: Masked and Permuted Pre-training for Language Understanding | Apr 20, 2020 | Language Modeling, Language Modelling | Code Available | 2 |