| MELM: Data Augmentation with Masked Entity Language Modeling for Low-Resource NER | Aug 31, 2021 | Cross-Lingual NER, Data Augmentation | Code Available | 1 |
| Sentence Bottleneck Autoencoders from Transformer Language Models | Aug 31, 2021 | Decoder, Denoising | Code Available | 1 |
| Knowledge Perceived Multi-modal Pretraining in E-commerce | Aug 20, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification | Aug 4, 2021 | Classification, Few-Shot Text Classification | Code Available | 1 |
| SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs | Jun 18, 2021 | Decoder, Knowledge Graphs | Code Available | 1 |
| Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment | Jun 11, 2021 | Denoising, Language Modeling | Code Available | 1 |
| Luna: Linear Unified Nested Attention | Jun 3, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| TreeBERT: A Tree-Based Pre-Trained Model for Programming Language | May 26, 2021 | Code Summarization, Language Modeling | Code Available | 1 |
| KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction | Apr 15, 2021 | Dialog Relation Extraction, Language Modeling | Code Available | 1 |
| On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies | Apr 12, 2021 | Inductive Bias, Language Modeling | Code Available | 1 |
| MMBERT: Multimodal BERT Pretraining for Improved Medical VQA | Apr 3, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation | Mar 18, 2021 | Bilingual Lexicon Induction, Language Modeling | Code Available | 1 |
| MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding | Mar 11, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| CDLM: Cross-Document Language Modeling | Jan 2, 2021 | Citation Recommendation, Coreference Resolution | Code Available | 1 |
| AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding | Dec 31, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| RealFormer: Transformer Likes Residual Attention | Dec 21, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| TAP: Text-Aware Pre-training for Text-VQA and Text-Caption | Dec 8, 2020 | Caption Generation, Language Modeling | Code Available | 1 |
| Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks | Dec 5, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling | Dec 1, 2020 | Constituency Parsing, Dependency Parsing | Code Available | 1 |
| Cold-start Active Learning through Self-supervised Language Modeling | Oct 19, 2020 | Active Learning, Classification | Code Available | 1 |
| Cross-Thought for Sentence Encoder Pre-training | Oct 7, 2020 | Information Retrieval, Language Modeling | Code Available | 1 |
| SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding | Oct 5, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| XDA: Accurate, Robust Disassembly with Transfer Learning | Oct 2, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing | Sep 29, 2020 | Inductive Bias, Language Modeling | Code Available | 1 |
| Intermediate Training of BERT for Product Matching | Aug 31, 2020 | Entity Resolution, Language Modeling | Code Available | 1 |
| The Lottery Ticket Hypothesis for Pre-trained BERT Networks | Jul 23, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Language-agnostic BERT Sentence Embedding | Jul 3, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Pre-training via Paraphrasing | Jun 26, 2020 | Document Summarization, Document Translation | Code Available | 1 |
| MC-BERT: Efficient Language Pre-Training via a Meta Controller | Jun 10, 2020 | Binary Classification, Cloze Test | Code Available | 1 |
| Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP | May 29, 2020 | Dependency Parsing, Language Modeling | Code Available | 1 |
| HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | May 1, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Segatron: Segment-Aware Transformer for Language Modeling and Understanding | Apr 30, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Train No Evil: Selective Masking for Task-Guided Pre-Training | Apr 21, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue | Apr 15, 2020 | Dialogue State Tracking, Intent Detection | Code Available | 1 |
| ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | Mar 23, 2020 | GPU, Language Modeling | Code Available | 1 |
| Talking-Heads Attention | Mar 5, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| REALM: Retrieval-Augmented Language Model Pre-Training | Feb 10, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| UNITER: UNiversal Image-TExt Representation Learning | Sep 25, 2019 | Image-text matching, Image-text Retrieval | Code Available | 1 |
| LXMERT: Learning Cross-Modality Encoder Representations from Transformers | Aug 20, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| Mask-Predict: Parallel Decoding of Conditional Masked Language Models | Apr 19, 2019 | Language Modeling, Language Modelling | Code Available | 1 |
| GeoRecon: Graph-Level Representation Learning for 3D Molecules via Reconstruction-Based Pretraining | Jun 16, 2025 | Denoising, Language Modeling | Unverified | 0 |
| Masked Language Models are Good Heterogeneous Graph Generalizers | Jun 6, 2025 | Graph Learning, Language Modeling | Code Available | 0 |
| Improving Low-Resource Morphological Inflection via Self-Supervised Objectives | Jun 5, 2025 | Decoder, Language Modeling | Unverified | 0 |
| HAD: Hybrid Architecture Distillation Outperforms Teacher in Genomic Sequence Modeling | May 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Ankh3: Multi-Task Pretraining with Sequence Denoising and Completion Enhances Protein Representations | May 26, 2025 | Denoising, Language Modeling | Unverified | 0 |
| ADALog: Adaptive Unsupervised Anomaly detection in Logs with Self-attention Masked Language Model | May 15, 2025 | Anomaly Detection, Language Modeling | Unverified | 0 |
| CodeSSM: Towards State Space Models for Code Understanding | May 2, 2025 | Clone Detection, Language Modeling | Unverified | 0 |
| In-Context Learning can distort the relationship between sequence likelihoods and biological fitness | Apr 23, 2025 | In-Context Learning, Language Modeling | Unverified | 0 |
| Low-Resource Transliteration for Roman-Urdu and Urdu Using Transformer-Based Models | Mar 27, 2025 | Information Retrieval, Language Modeling | Unverified | 0 |
| Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them | Mar 27, 2025 | Continual Pretraining, Language Modeling | Unverified | 0 |