| POS-BERT: Point Cloud One-Stage BERT Pre-Training | Apr 3, 2022 | Contrastive Learning, Language Modeling | Code Available | 1 | 5 |
| DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning | May 17, 2023 | Clustering, Language Modeling | Code Available | 1 | 5 |
| Language-agnostic BERT Sentence Embedding | Jul 3, 2020 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Train No Evil: Selective Masking for Task-Guided Pre-Training | Apr 21, 2020 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Labrador: Exploring the Limits of Masked Language Modeling for Laboratory Data | Dec 9, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Knowledge Perceived Multi-modal Pretraining in E-commerce | Aug 20, 2021 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| SecureBERT: A Domain-Specific Language Model for Cybersecurity | Apr 6, 2022 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| DomURLs_BERT: Pre-trained BERT-based Model for Malicious Domains and URLs Detection and Classification | Sep 13, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Unified Multimodal Model with Unlikelihood Training for Visual Dialog | Nov 23, 2022 | Answer Generation, Chatbot | Code Available | 1 | 5 |
| LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling | Jun 14, 2022 | Decoder, Language Modeling | Code Available | 1 | 5 |
| MVPTR: Multi-Level Semantic Alignment for Vision-Language Pre-Training via Multi-Stage Learning | Jan 29, 2022 | Image-text matching, Language Modeling | Code Available | 1 | 5 |
| Long-context Protein Language Modeling Using Bidirectional Mamba with Shared Projection Layers | Oct 29, 2024 | Drug Design, Language Modeling | Code Available | 1 | 5 |
| Mixture of Attention Heads: Selecting Attention Heads Per Token | Oct 11, 2022 | Computational Efficiency, Language Modeling | Code Available | 1 | 5 |
| What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? | Apr 12, 2022 | Decoder, Language Modeling | Code Available | 1 | 5 |
| NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents | Feb 27, 2024 | Document Classification, Language Modeling | Code Available | 1 | 5 |
| Luna: Linear Unified Nested Attention | Jun 3, 2021 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training | Dec 20, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model | Oct 11, 2022 | Contrastive Learning, Image-text matching | Code Available | 1 | 5 |
| A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models | Oct 16, 2021 | Image Captioning, Language Modeling | Code Available | 1 | 5 |
| Massive Choice, Ample Tasks (MaChAmp): A Toolkit for Multi-task Learning in NLP | May 29, 2020 | Dependency Parsing, Language Modeling | Code Available | 1 | 5 |
| CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking | Feb 19, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking | Dec 15, 2022 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | Mar 23, 2020 | GPU, Language Modeling | Code Available | 1 | 5 |
| Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer | Jan 14, 2022 | Classification, Contrastive Learning | Code Available | 1 | 5 |
| Cold-start Active Learning through Self-supervised Language Modeling | Oct 19, 2020 | Active Learning, Classification | Code Available | 1 | 5 |
| MC-BERT: Efficient Language Pre-Training via a Meta Controller | Jun 10, 2020 | Binary Classification, Cloze Test | Code Available | 1 | 5 |
| Composable Sparse Fine-Tuning for Cross-Lingual Transfer | Oct 14, 2021 | Cross-Lingual Transfer, Language Modeling | Code Available | 1 | 5 |
| Endowing Protein Language Models with Structural Knowledge | Jan 26, 2024 | Drug Design, Language Modeling | Code Available | 1 | 5 |
| Frustratingly Simple Pretraining Alternatives to Masked Language Modeling | Sep 4, 2021 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Generative power of a protein language model trained on multiple sequence alignments | Apr 14, 2022 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Mask-Predict: Parallel Decoding of Conditional Masked Language Models | Apr 19, 2019 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| MMBERT: Multimodal BERT Pretraining for Improved Medical VQA | Apr 3, 2021 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Generate to Understand for Representation | Jun 14, 2023 | Contrastive Learning, GPU | Code Available | 1 | 5 |
| MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding | Mar 11, 2021 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| MicroBERT: Effective Training of Low-resource Monolingual BERTs through Parameter Reduction and Multitask Learning | Dec 23, 2022 | Dependency Parsing, Language Modeling | Code Available | 1 | 5 |
| Nonparametric Masked Language Modeling | Dec 2, 2022 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Stochastic positional embeddings improve masked image modeling | Jul 31, 2023 | Language Modelling, Masked Language Modeling | Code Available | 1 | 5 |
| GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding | Oct 23, 2023 | Articles, Contrastive Learning | Code Available | 1 | 5 |
| Contextual Representation Learning beyond Masked Language Modeling | Apr 8, 2022 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model | May 19, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| CodeEditor: Learning to Edit Source Code with Pre-trained Models | Oct 31, 2022 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality | Feb 21, 2024 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More | Feb 11, 2025 | Decoder, Information Retrieval | Code Available | 0 | 5 |
| Arabic Synonym BERT-based Adversarial Examples for Text Classification | Feb 5, 2024 | Adversarial Text, Language Modeling | Code Available | 0 | 5 |
| Masked Latent Semantic Modeling: an Efficient Pre-training Alternative to Masked Language Modeling | Jul 7, 2023 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Masked and Permuted Implicit Context Learning for Scene Text Recognition | May 25, 2023 | Decoder, Language Modeling | Code Available | 0 | 5 |
| Masked Language Modeling for Proteins via Linearly Scalable Long-Context Transformers | Jun 5, 2020 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| DS-TOD: Efficient Domain Specialization for Task-Oriented Dialog | May 1, 2022 | dialog state tracking, Language Modeling | Code Available | 0 | 5 |
| Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways | Oct 26, 2023 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| DS-TOD: Efficient Domain Specialization for Task Oriented Dialog | Oct 15, 2021 | dialog state tracking, Language Modeling | Code Available | 0 | 5 |