| GLIPv2: Unifying Localization and Vision-Language Understanding | Jun 12, 2022 | 2D Object DetectionContrastive Learning | CodeCode Available | 4 | 5 |
| Simple and Effective Masked Diffusion Language Models | Jun 11, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 4 | 5 |
| GigaAM: Efficient Self-Supervised Learner for Speech Recognition | Jun 1, 2025 | Automatic Speech RecognitionLanguage Modeling | CodeCode Available | 4 | 5 |
| Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss | Aug 5, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 4 | 5 |
| Cramming: Training a Language Model on a Single GPU in One Day | Dec 28, 2022 | GPULanguage Modeling | CodeCode Available | 3 | 5 |
| ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding | Oct 23, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 3 | 5 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuray | Feb 7, 2025 | 4kGeneral Knowledge | CodeCode Available | 3 | 5 |
| W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training | Aug 7, 2021 | Contrastive LearningLanguage Modeling | CodeCode Available | 3 | 5 |
| GPT or BERT: why not both? | Oct 31, 2024 | Causal Language ModelingLanguage Modeling | CodeCode Available | 2 | 5 |
| Retrieval Oriented Masking Pre-training Language Model for Dense Passage Retrieval | Oct 27, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 2 | 5 |
| RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder | May 24, 2022 | DecoderInformation Retrieval | CodeCode Available | 2 | 5 |
| Deep Bidirectional Language-Knowledge Graph Pretraining | Oct 17, 2022 | Common Sense ReasoningKnowledge Graphs | CodeCode Available | 2 | 5 |
| LinkBERT: Pretraining Language Models with Document Links | Mar 29, 2022 | Document ClassificationLanguage Modeling | CodeCode Available | 2 | 5 |
| MPNet: Masked and Permuted Pre-training for Language Understanding | Apr 20, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 2 | 5 |
| Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval | Mar 22, 2023 | Image-text matchingLanguage Modeling | CodeCode Available | 2 | 5 |
| MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining | Dec 29, 2023 | GPULanguage Modeling | CodeCode Available | 2 | 5 |
| Self-Supervised Log Parsing | Mar 17, 2020 | Anomaly DetectionFault Detection | CodeCode Available | 2 | 5 |
| BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models | Jun 17, 2025 | BenchmarkingLanguage Modeling | CodeCode Available | 2 | 5 |
| A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models | Oct 16, 2021 | Image CaptioningLanguage Modeling | CodeCode Available | 1 | 5 |
| Debiasing the Cloze Task in Sequential Recommendation with Bidirectional Transformers | Jan 22, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Declaration-based Prompt Tuning for Visual Question Answering | May 5, 2022 | Image-text matchingLanguage Modeling | CodeCode Available | 1 | 5 |
| Generative Prompt Tuning for Relation Classification | Oct 22, 2022 | ClassificationLanguage Modeling | CodeCode Available | 1 | 5 |
| CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations | Sep 1, 2021 | Emotion ClassificationLanguage Modeling | CodeCode Available | 1 | 5 |
| CDLM: Cross-Document Language Modeling | Jan 2, 2021 | Citation RecommendationCoreference Resolution | CodeCode Available | 1 | 5 |
| Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training | Jun 1, 2022 | Contrastive LearningCross-Lingual Transfer | CodeCode Available | 1 | 5 |
| Cross-Thought for Sentence Encoder Pre-training | Oct 7, 2020 | Information RetrievalLanguage Modeling | CodeCode Available | 1 | 5 |
| Generative power of a protein language model trained on multiple sequence alignments | Apr 14, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Data Efficient Masked Language Modeling for Vision and Language | Sep 5, 2021 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding | Oct 23, 2023 | ArticlesContrastive Learning | CodeCode Available | 1 | 5 |
| AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding | Dec 31, 2020 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Global and Local Semantic Completion Learning for Vision-Language Pre-training | Jun 12, 2023 | cross-modal alignmentImage-text Retrieval | CodeCode Available | 1 | 5 |
| FiLM: Fill-in Language Models for Any-Order Generation | Oct 15, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for the Job Market Domain | May 20, 2023 | De-identificationLanguage Modeling | CodeCode Available | 1 | 5 |
| Fine-grained Audible Video Description | Mar 27, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Endowing Protein Language Models with Structural Knowledge | Jan 26, 2024 | Drug DesignLanguage Modeling | CodeCode Available | 1 | 5 |
| Accelerating Vision-Language Pretraining with Free Language Modeling | Mar 24, 2023 | GPULanguage Modeling | CodeCode Available | 1 | 5 |
| ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | Mar 23, 2020 | GPULanguage Modeling | CodeCode Available | 1 | 5 |
| Frustratingly Simple Pretraining Alternatives to Masked Language Modeling | Sep 4, 2021 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning | May 17, 2023 | ClusteringLanguage Modeling | CodeCode Available | 1 | 5 |
| CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation | Sep 13, 2021 | DecoderDenoising | CodeCode Available | 1 | 5 |
| Eliciting Knowledge from Pretrained Language Models for Prototypical Prompt Verbalizer | Jan 14, 2022 | ClassificationContrastive Learning | CodeCode Available | 1 | 5 |
| Composable Sparse Fine-Tuning for Cross-Lingual Transfer | Oct 14, 2021 | Cross-Lingual TransferLanguage Modeling | CodeCode Available | 1 | 5 |
| KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction | Apr 15, 2021 | Dialog Relation ExtractionLanguage Modeling | CodeCode Available | 1 | 5 |
| Contextual Representation Learning beyond Masked Language Modeling | Apr 8, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| FATA-Trans: Field And Time-Aware Transformer for Sequential Tabular Data | Oct 20, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Contrastive Learning for Prompt-Based Few-Shot Language Learners | May 3, 2022 | Contrastive LearningIn-Context Learning | CodeCode Available | 1 | 5 |
| A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NER | Aug 28, 2023 | Contrastive Learningfew-shot-ner | CodeCode Available | 1 | 5 |
| CreoPep: A Universal Deep Learning Framework for Target-Specific Peptide Design and Optimization | May 5, 2025 | DiversityLanguage Modeling | CodeCode Available | 1 | 5 |
| DomURLs_BERT: Pre-trained BERT-based Model for Malicious Domains and URLs Detection and Classification | Sep 13, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 | 5 |
| Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning | Aug 23, 2023 | In-Context LearningLanguage Modeling | CodeCode Available | 1 | 5 |