SOTAVerified

Continual Pretraining

Papers

Showing 1–10 of 70 papers

| Title | Status | Hype |
|---|---|---|
| Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content | — | 0 |
| LLaVA-c: Continual Improved Visual Instruction Tuning | — | 0 |
| Simulating Training Data Leakage in Multiple-Choice Benchmarks for LLM Evaluation | Code | 0 |
| A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP | Code | 0 |
| Enhance Mobile Agents Thinking Process Via Iterative Preference Learning | — | 0 |
| Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning | — | 0 |
| Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language | — | 0 |
| TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining | Code | 1 |
| Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them | — | 0 |
| Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling | — | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DAS | F1 (macro) | 0.69 | — | Unverified |