| Towards Lifelong Learning of Large Language Models: A Survey | Jun 10, 2024 | Continual Pretraining, Incremental Learning | Code Available | 2 |
| LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models | Jun 2, 2024 | Continual Pretraining, Information Retrieval | Unverified | 0 |
| Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining | May 30, 2024 | Continual Pretraining, Contrastive Learning | Code Available | 1 |
| MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | May 20, 2024 | Continual Pretraining, Mathematical Reasoning | Code Available | 3 |
| Cross-sensor self-supervised training and alignment for remote sensing | May 16, 2024 | Continual Pretraining, Earth Observation | Unverified | 0 |
| ChuXin: 1.6B Technical Report | May 8, 2024 | Continual Pretraining, Language Modeling | Unverified | 0 |
| Retrieval Head Mechanistically Explains Long-Context Factuality | Apr 24, 2024 | Continual Pretraining, Hallucination | Code Available | 3 |
| Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain | Apr 12, 2024 | Continual Pretraining, General Knowledge | Unverified | 0 |
| CEM: A Data-Efficient Method for Large Language Models to Continue Evolving From Mistakes | Apr 11, 2024 | Continual Learning, Continual Pretraining | Unverified | 0 |
| Rho-1: Not All Tokens Are What You Need | Apr 11, 2024 | All, Continual Pretraining | Code Available | 3 |