| Title | Date | Tags | Code | Citations |
| --- | --- | --- | --- | --- |
| LangSAMP: Language-Script Aware Multilingual Pretraining | Sep 26, 2024 | Continual Pretraining, Language Modeling | Code Available | 0 |
| Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach | Sep 9, 2024 | Computational Efficiency, Continual Pretraining | Code Available | 0 |
| A Practitioner's Guide to Continual Multimodal Pretraining | Aug 26, 2024 | Continual Learning, Continual Pretraining | Code Available | 2 |
| RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining | Aug 21, 2024 | Continual Pretraining, Cross-Lingual Transfer | Unverified | 0 |
| Scaling Granite Code Models to 128K Context | Jul 18, 2024 | 2k, 4k | Code Available | 4 |
| Bilingual Adaptation of Monolingual Foundation Models | Jul 13, 2024 | Continual Pretraining, Cross-Lingual Transfer | Unverified | 0 |
| 70B-parameter large language models in Japanese medical question-answering | Jun 21, 2024 | Continual Pretraining, Domain Adaptation | Unverified | 0 |
| Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective | Jun 19, 2024 | Benchmarking, Continual Pretraining | Unverified | 0 |
| Open Generative Large Language Models for Galician | Jun 19, 2024 | Continual Pretraining, Diversity | Unverified | 0 |
| BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM | Jun 17, 2024 | Continual Pretraining, zero-shot-classification | Unverified | 0 |