SOTAVerified

Continual Pretraining

Papers

Showing 1–50 of 70 papers

Title | Status | Hype
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content | – | 0
LLaVA-c: Continual Improved Visual Instruction Tuning | – | 0
Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation | Code | 0
A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP | Code | 0
Enhance Mobile Agents Thinking Process Via Iterative Preference Learning | – | 0
Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning | – | 0
Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language | – | 0
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining | Code | 1
Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them | – | 0
Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling | – | 0
AfroXLMR-Social: Adapting Pre-trained Language Models for African Languages Social Media Text | – | 0
Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge | Code | 0
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study | – | 0
Demystifying Domain-adaptive Post-training for Financial LLMs | Code | 1
NyayaAnumana & INLegalLlama: The Largest Indian Legal Judgment Prediction Dataset and Specialized Language Model for Enhanced Decision Analysis | Code | 1
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models | – | 0
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation | Code | 0
DoPAMine: Domain-specific Pre-training Adaptation from seed-guided data Mining | – | 0
The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging | – | 0
AstroMLab 2: AstroLLaMA-2-70B Model and Benchmarking Specialised LLMs for Astronomy | – | 0
LangSAMP: Language-Script Aware Multilingual Pretraining | Code | 0
Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach | Code | 0
A Practitioner's Guide to Continual Multimodal Pretraining | Code | 2
RedWhale: An Adapted Korean LLM Through Efficient Continual Pretraining | – | 0
Scaling Granite Code Models to 128K Context | Code | 4
Bilingual Adaptation of Monolingual Foundation Models | – | 0
70B-parameter large language models in Japanese medical question-answering | – | 0
Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective | – | 0
Open Generative Large Language Models for Galician | – | 0
BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pretraining of BabyLM | – | 0
Towards Lifelong Learning of Large Language Models: A Survey | Code | 2
LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models | – | 0
Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining | Code | 1
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | Code | 3
Cross-sensor self-supervised training and alignment for remote sensing | – | 0
ChuXin: 1.6B Technical Report | – | 0
Retrieval Head Mechanistically Explains Long-Context Factuality | Code | 3
Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain | – | 0
CEM: A Data-Efficient Method for Large Language Models to Continue Evolving From Mistakes | – | 0
Rho-1: Not All Tokens Are What You Need | Code | 3
Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code | – | 0
PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation? | – | 0
Yi: Open Foundation Models by 01.AI | Code | 9
Investigating Continual Pretraining in Large Language Models: Insights and Implications | – | 0
Data Engineering for Scaling Language Models to 128K Context | Code | 3
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts | Code | 2
Continual Learning for Large Language Models: A Survey | – | 0
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization | Code | 0
PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment | Code | 0
Effective Long-Context Scaling of Foundation Models | Code | 2
Page 1 of 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | DAS | F1 (macro) | 0.69 | – | Unverified
1 | CPT | F1 (macro) | 63.77 | – | Unverified
1 | DAS | F1 (macro) | 0.71 | – | Unverified
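
For context on the metric column: macro F1 is the unweighted mean of per-class F1 scores, so the claimed values above are only comparable if reported on the same scale (0.69 and 0.71 appear to be fractions, while 63.77 is presumably a percentage). A minimal sketch of the computation for reference; the helper name `macro_f1` is illustrative and not taken from any listed paper:

```python
def macro_f1(y_true, y_pred):
    """Macro F1: compute F1 per class, then average with equal class weights."""
    classes = set(y_true) | set(y_pred)
    f1_scores = []
    for c in classes:
        # Per-class counts: true positives, false positives, false negatives.
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1_scores.append(f1)
    # Equal weight per class, regardless of class frequency.
    return sum(f1_scores) / len(f1_scores)

# Toy check: per-class F1 of 2/3 and 4/5 averages to ~0.733.
print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))  # 0.7333...
```

Because every class counts equally, macro F1 penalizes poor performance on rare classes more than accuracy or micro-averaged F1 would.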