SOTAVerified

Continual Pretraining

Papers

Showing 1–50 of 70 papers

Title | Status | Hype
Yi: Open Foundation Models by 01.AI | Code | 9
Scaling Granite Code Models to 128K Context | Code | 4
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | Code | 3
Data Engineering for Scaling Language Models to 128K Context | Code | 3
Rho-1: Not All Tokens Are What You Need | Code | 3
Retrieval Head Mechanistically Explains Long-Context Factuality | Code | 3
Continual Pre-training of Language Models | Code | 2
A Practitioner's Guide to Continual Multimodal Pretraining | Code | 2
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Code | 2
Continual Training of Language Models for Few-Shot Learning | Code | 2
Effective Long-Context Scaling of Foundation Models | Code | 2
Towards Lifelong Learning of Large Language Models: A Survey | Code | 2
Continual Pre-Training Mitigates Forgetting in Language and Vision | Code | 1
NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis | Code | 1
On the Robustness of Reading Comprehension Models to Entity Renaming | Code | 1
Towards Geospatial Foundation Models via Continual Pretraining | Code | 1
ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning | Code | 1
Demystifying Domain-adaptive Post-training for Financial LLMs | Code | 1
CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation | Code | 1
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining | Code | 1
Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning | Code | 1
Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining | Code | 1
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing | Code | 0
A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP | Code | 0
Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge | Code | 0
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization | Code | 0
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach | Code | 0
Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis | Code | 0
Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation | Code | 0
Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding | Code | 0
Unsupervised Domain Adaptation for Sparse Retrieval by Filling Vocabulary and Word Frequency Gaps | Code | 0
PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment | Code | 0
AF Adapter: Continual Pretraining for Building Chinese Biomedical Language Model | Code | 0
LangSAMP: Language-Script Aware Multilingual Pretraining | Code | 0
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation | Code | 0
PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation? | - | 0
Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain | - | 0
RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining | - | 0
Revisiting Pretraining with Adapters | - | 0
AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text | - | 0
The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging | - | 0
AdaPrompt: Adaptive Model Training for Prompt-based NLP | - | 0
BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM | - | 0
Bilingual Adaptation of Monolingual Foundation Models | - | 0
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models | - | 0
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content | - | 0
ChuXin: 1.6B Technical Report | - | 0
Continual Learning for Large Language Models: A Survey | - | 0
70B-parameter large language models in Japanese medical question-answering | - | 0
Page 1 of 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DAS | F1 (macro) | 0.69 | - | Unverified
1 | CPT | F1 (macro) | 63.77 | - | Unverified
1 | DAS | F1 (macro) | 0.71 | - | Unverified