| Understanding the Natural Language of DNA using Encoder-Decoder Foundation Models with Byte-level Precision | Nov 4, 2023 | Decoder, Language Modeling | Unverified | 0 | 0 |
| BPDec: Unveiling the Potential of Masked Language Modeling Decoder in BERT pretraining | Jan 29, 2024 | Decoder, Language Modeling | Unverified | 0 | 0 |
| Do Transformers Parse while Predicting the Masked Word? | Mar 14, 2023 | Constituency Parsing, Language Modeling | Unverified | 0 | 0 |
| On the Influence of Masking Policies in Intermediate Pre-training | Apr 18, 2021 | Abstractive Text Summarization, Language Modeling | Unverified | 0 | 0 |
| OPSD: an Offensive Persian Social media Dataset and its baseline evaluations | Apr 8, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Mapping of attention mechanisms to a generalized Potts model | Apr 14, 2023 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling | Jan 25, 2024 | Causal Language Modeling, Decoder | Unverified | 0 | 0 |
| PASTA: Pretrained Action-State Transformer Agents | Jul 20, 2023 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Patton: Language Model Pretraining on Text-Rich Networks | May 20, 2023 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training | Aug 16, 2019 | Image-text matching, Image-text Retrieval | Unverified | 0 | 0 |
| Domain-Specific Japanese ELECTRA Model Using a Small Corpus | Sep 1, 2021 | Articles, Computational Efficiency | Unverified | 0 | 0 |
| PerPLM: Personalized Fine-tuning of Pretrained Language Models via Writer-specific Intermediate Learning and Prompts | Sep 14, 2023 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Ankh3: Multi-Task Pretraining with Sequence Denoising and Completion Enhances Protein Representations | May 26, 2025 | Denoising, Language Modeling | Unverified | 0 | 0 |
| Phrase-aware Unsupervised Constituency Parsing | Nov 16, 2021 | Constituency Parsing, Language Modeling | Unverified | 0 | 0 |
| Phrase-aware Unsupervised Constituency Parsing | May 1, 2022 | Constituency Parsing, Language Modeling | Unverified | 0 | 0 |
| Unified Multimodal Pre-training and Prompt-based Tuning for Vision-Language Understanding and Generation | Dec 10, 2021 | Image-text matching, Image-text Retrieval | Unverified | 0 | 0 |
| Uniform Masking Prevails in Vision-Language Pretraining | Dec 10, 2022 | Image-text matching, Language Modeling | Unverified | 0 | 0 |
| Domain-adapted large language models for classifying nuclear medicine reports | Mar 1, 2023 | Domain Adaptation, Language Modeling | Unverified | 0 | 0 |
| Position Masking for Language Models | Jun 2, 2020 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| POSTECH-ETRI’s Submission to the WMT2020 APE Shared Task: Automatic Post-Editing with Cross-lingual Language Model | Nov 1, 2020 | Automatic Post-Editing, Language Modeling | Unverified | 0 | 0 |
| Predicting Attention Sparsity in Transformers | Sep 24, 2021 | Decoder, Language Modeling | Unverified | 0 | 0 |
| Predicting Attention Sparsity in Transformers | Nov 16, 2021 | Decoder, Language Modeling | Unverified | 0 | 0 |
| Does Pre-training Induce Systematic Inference? How Masked Language Models Acquire Commonsense Knowledge | Dec 16, 2021 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Discovering Financial Hypernyms by Prompting Masked Language Models | Jun 1, 2022 | Domain Adaptation, Language Modeling | Unverified | 0 | 0 |
| Pre-Training and Prompting for Few-Shot Node Classification on Text-Attributed Graphs | Jul 22, 2024 | Few-Shot Learning, Graph Neural Network | Unverified | 0 | 0 |