| Yi: Open Foundation Models by 01.AI | Mar 7, 2024 | Attribute, Chatbot | Code Available | 9 | 5 |
| Scaling Granite Code Models to 128K Context | Jul 18, 2024 | 2k, 4k | Code Available | 4 | 5 |
| MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | May 20, 2024 | Continual Pretraining, Mathematical Reasoning | Code Available | 3 | 5 |
| Data Engineering for Scaling Language Models to 128K Context | Feb 15, 2024 | 4k, Continual Pretraining | Code Available | 3 | 5 |
| Rho-1: Not All Tokens Are What You Need | Apr 11, 2024 | All, Continual Pretraining | Code Available | 3 | 5 |
| Retrieval Head Mechanistically Explains Long-Context Factuality | Apr 24, 2024 | Continual Pretraining, Hallucination | Code Available | 3 | 5 |
| Continual Pre-training of Language Models | Feb 7, 2023 | Continual Learning, Continual Pretraining | Code Available | 2 | 5 |
| A Practitioner's Guide to Continual Multimodal Pretraining | Aug 26, 2024 | Continual Learning, Continual Pretraining | Code Available | 2 | 5 |
| Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Feb 12, 2024 | Continual Pretraining, GSM8K | Code Available | 2 | 5 |
| Continual Training of Language Models for Few-Shot Learning | Oct 11, 2022 | Continual Learning, Continual Pretraining | Code Available | 2 | 5 |
| Effective Long-Context Scaling of Foundation Models | Sep 27, 2023 | Continual Pretraining, Language Modeling | Code Available | 2 | 5 |
| Towards Lifelong Learning of Large Language Models: A Survey | Jun 10, 2024 | Continual Pretraining, Incremental Learning | Code Available | 2 | 5 |
| Continual Pre-Training Mitigates Forgetting in Language and Vision | May 19, 2022 | Continual Learning, Continual Pretraining | Code Available | 1 | 5 |
| NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis | Dec 11, 2024 | Continual Pretraining, Language Modeling | Code Available | 1 | 5 |
| On the Robustness of Reading Comprehension Models to Entity Renaming | Oct 16, 2021 | Continual Pretraining, Machine Reading Comprehension | Code Available | 1 | 5 |
| Towards Geospatial Foundation Models via Continual Pretraining | Feb 9, 2023 | Change Detection, Continual Pretraining | Code Available | 1 | 5 |
| ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning | Dec 30, 2020 | Continual Pretraining, Language Modelling | Code Available | 1 | 5 |
| Demystifying Domain-adaptive Post-training for Financial LLMs | Jan 9, 2025 | Continual Pretraining, Domain Adaptation | Code Available | 1 | 5 |
| CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation | Aug 14, 2023 | Continual Learning, Continual Pretraining | Code Available | 1 | 5 |
| TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining | Apr 2, 2025 | Continual Learning, Continual Pretraining | Code Available | 1 | 5 |
| Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning | Sep 10, 2021 | Continual Pretraining, Contrastive Learning | Code Available | 1 | 5 |
| Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining | May 30, 2024 | Continual Pretraining, Contrastive Learning | Code Available | 1 | 5 |
| Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing | Jul 31, 2020 | Continual Pretraining | Code Available | 0 | 5 |
| A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP | May 22, 2025 | Continual Pretraining, Diagnostic | Code Available | 0 | 5 |
| Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge | Mar 6, 2025 | Continual Pretraining, Memorization | Code Available | 0 | 5 |
| RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization | Jan 25, 2024 | Continual Pretraining, Sentiment Analysis | Code Available | 0 | 5 |
| Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach | Sep 9, 2024 | Computational Efficiency, Continual Pretraining | Code Available | 0 | 5 |
| Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis | Jan 6, 2022 | Continual Pretraining, Sentiment Analysis | Code Available | 0 | 5 |
| Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation | May 30, 2025 | Continual Pretraining, Fairness | Code Available | 0 | 5 |
| Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding | Apr 22, 2022 | Continual Pretraining | Code Available | 0 | 5 |
| Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps | Nov 8, 2022 | Continual Pretraining, Domain Adaptation | Code Available | 0 | 5 |
| PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment | Nov 11, 2023 | Action Quality Assessment, Continual Pretraining | Code Available | 0 | 5 |
| AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model | Nov 21, 2022 | Continual Pretraining, Language Modeling | Code Available | 0 | 5 |
| LangSAMP: Language-Script Aware Multilingual Pretraining | Sep 26, 2024 | Continual Pretraining, Language Modeling | Code Available | 0 | 5 |
| Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation | Oct 21, 2024 | Automated Theorem Proving, Continual Pretraining | Code Available | 0 | 5 |
| PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation? | Mar 20, 2024 | Abstractive Text Summarization, Continual Pretraining | Unverified | 0 | 0 |
| Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain | Apr 12, 2024 | Continual Pretraining, General Knowledge | Unverified | 0 | 0 |
| RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining | Aug 21, 2024 | Continual Pretraining, Cross-Lingual Transfer | Unverified | 0 | 0 |
| Revisiting Pretraining with Adapters | Aug 1, 2021 | Continual Pretraining, Transfer Learning | Unverified | 0 | 0 |
| AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text | Mar 24, 2025 | Continual Pretraining, Emotion Classification | Unverified | 0 | 0 |
| The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging | Sep 30, 2024 | Continual Pretraining | Unverified | 0 | 0 |
| AdaPrompt: Adaptive Model Training for Prompt-based NLP | Feb 10, 2022 | Continual Pretraining, Language Modeling | Unverified | 0 | 0 |
| BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM | Jun 17, 2024 | Continual Pretraining, zero-shot-classification | Unverified | 0 | 0 |
| Bilingual Adaptation of Monolingual Foundation Models | Jul 13, 2024 | Continual Pretraining, Cross-Lingual Transfer | Unverified | 0 | 0 |
| Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models | Dec 10, 2024 | Continual Pretraining, Language Modeling | Unverified | 0 | 0 |
| Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content | Jun 25, 2025 | Articles, Continual Pretraining | Unverified | 0 | 0 |
| ChuXin: 1.6B Technical Report | May 8, 2024 | Continual Pretraining, Language Modeling | Unverified | 0 | 0 |
| Continual Learning for Large Language Models: A Survey | Feb 2, 2024 | Continual Learning, Continual Pretraining | Unverified | 0 | 0 |
| 70B-parameter large language models in Japanese medical question-answering | Jun 21, 2024 | Continual Pretraining, Domain Adaptation | Unverified | 0 | 0 |