SOTAVerified

Continual Pretraining

Papers

Showing 1–50 of 70 papers

Title | Status | Hype
Yi: Open Foundation Models by 01.AI | Code | 9
Scaling Granite Code Models to 128K Context | Code | 4
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | Code | 3
Data Engineering for Scaling Language Models to 128K Context | Code | 3
Rho-1: Not All Tokens Are What You Need | Code | 3
Retrieval Head Mechanistically Explains Long-Context Factuality | Code | 3
Continual Pre-training of Language Models | Code | 2
A Practitioner's Guide to Continual Multimodal Pretraining | Code | 2
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Code | 2
Continual Training of Language Models for Few-Shot Learning | Code | 2
Effective Long-Context Scaling of Foundation Models | Code | 2
Towards Lifelong Learning of Large Language Models: A Survey | Code | 2
ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning | Code | 1
Demystifying Domain-adaptive Post-training for Financial LLMs | Code | 1
On the Robustness of Reading Comprehension Models to Entity Renaming | Code | 1
CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation | Code | 1
NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis | Code | 1
Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning | Code | 1
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining | Code | 1
Continual Pre-Training Mitigates Forgetting in Language and Vision | Code | 1
Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining | Code | 1
Towards Geospatial Foundation Models via Continual Pretraining | Code | 1
Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them | – | 0
RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining | – | 0
ChuXin: 1.6B Technical Report | – | 0
Continual Learning for Large Language Models: A Survey | – | 0
70B-parameter large language models in Japanese medical question-answering | – | 0
Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective | – | 0
Cross-sensor self-supervised training and alignment for remote sensing | – | 0
Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code | – | 0
DD-TIG at Constraint@ACL2022: Multimodal Understanding and Reasoning for Role Labeling of Entities in Hateful Memes | – | 0
DoPAMine: Domain-specific Pre-training Adaptation from seed-guided data Mining | – | 0
Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language | – | 0
Enhance Mobile Agents Thinking Process Via Iterative Preference Learning | – | 0
On the Robustness of Reading Comprehension Models to Entity Renaming | – | 0
Open Generative Large Language Models for Galician | – | 0
Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling | – | 0
PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation? | – | 0
Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain | – | 0
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models | – | 0
Revisiting Pretraining with Adapters | – | 0
AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text | – | 0
The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging | – | 0
AdaPrompt: Adaptive Model Training for Prompt-based NLP | – | 0
BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM | – | 0
Bilingual Adaptation of Monolingual Foundation Models | – | 0
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content | – | 0
Investigating Continual Pretraining in Large Language Models: Insights and Implications | – | 0
Is Domain Adaptation Worth Your Investment? Comparing BERT and FinBERT on Financial Tasks | – | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DAS | F1 (macro) | 0.69 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CPT | F1 (macro) | 63.77 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DAS | F1 (macro) | 0.71 | – | Unverified