On the Effectiveness of Compact Biomedical Transformers

2022-09-07

Omid Rohanian, Mohammadmahdi Nouriborji, Samaneh Kouchaki, David A. Clifton

Abstract

Language models pre-trained on biomedical corpora, such as BioBERT, have recently shown promising results on downstream biomedical tasks. Many existing pre-trained models, however, are resource-intensive and computationally heavy owing to factors such as embedding size, hidden dimension, and number of layers. The natural language processing (NLP) community has developed numerous strategies to compress these models using techniques such as pruning, quantisation, and knowledge distillation, resulting in models that are considerably faster, smaller, and consequently easier to use in practice. In the same spirit, in this paper we introduce six lightweight models, namely BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT, TinyBioBERT, and CompactBioBERT, which are obtained either by knowledge distillation from a biomedical teacher or by continual learning on the PubMed dataset via the Masked Language Modelling (MLM) objective. We evaluate all of our models on three biomedical tasks and compare them with BioBERT-v1.1, with the aim of creating efficient lightweight models that perform on par with their larger counterparts. All the models will be publicly available on our Hugging Face profile at https://huggingface.co/nlpie and the code used to run the experiments will be available at https://github.com/nlpie-research/Compact-Biomedical-Transformers.
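
As a rough illustration of the knowledge distillation mentioned in the abstract, the sketch below computes the generic soft-target loss: the KL divergence between temperature-softened teacher and student distributions. The abstract does not specify the exact objectives used for each model, so the function names, the temperature value, and the T^2 scaling are assumptions drawn from the standard formulation, not the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper for illustration only; the temperature and the
    T^2 factor follow the common soft-target recipe, not this paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# A student that matches the teacher incurs (near) zero loss;
# a student that disagrees incurs a positive loss.
matching = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In practice this term is usually combined with a task loss such as MLM cross-entropy; how the two are weighted is a design choice that the abstract does not state.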

Benchmark Results

Dataset          Model           Metric  Claimed  Verified  Status
BC2GM            BioDistilBERT   F1      86.97    -         Unverified
BC2GM            BioMobileBERT   F1      85.26    -         Unverified
BC2GM            DistilBioBERT   F1      86.6     -         Unverified
BC2GM            CompactBioBERT  F1      86.71    -         Unverified
BC5CDR-chemical  DistilBioBERT   F1      94.53    -         Unverified
BC5CDR-chemical  BioDistilBERT   F1      94.48    -         Unverified
BC5CDR-chemical  CompactBioBERT  F1      94.31    -         Unverified
BC5CDR-chemical  BioMobileBERT   F1      94.23    -         Unverified
BC5CDR-disease   BioDistilBERT   F1      85.61    -         Unverified
BC5CDR-disease   DistilBioBERT   F1      85.42    -         Unverified
BC5CDR-disease   CompactBioBERT  F1      85.38    -         Unverified
BC5CDR-disease   BioMobileBERT   F1      84.62    -         Unverified
JNLPBA           BioDistilBERT   F1      79.1     -         Unverified
JNLPBA           BioMobileBERT   F1      80.13    -         Unverified
JNLPBA           DistilBioBERT   F1      79.97    -         Unverified
JNLPBA           CompactBioBERT  F1      79.88    -         Unverified
NCBI Disease     CompactBioBERT  F1      88.67    -         Unverified
NCBI Disease     BioMobileBERT   F1      87.21    -         Unverified
NCBI Disease     BioDistilBERT   F1      87.61    -         Unverified
NCBI Disease     DistilBioBERT   F1      87.93    -         Unverified
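
To make the claimed scores easier to compare, the sketch below picks the top model per dataset and computes an unweighted mean F1 per model. The figures are copied from the Claimed column and remain unverified; the unweighted mean is an assumption made for illustration and is not a metric reported on this page, and the helper names are hypothetical.

```python
from collections import defaultdict

# (dataset, model, claimed F1) copied from the benchmark listing; all unverified.
CLAIMED_F1 = [
    ("BC2GM", "BioDistilBERT", 86.97),
    ("BC2GM", "BioMobileBERT", 85.26),
    ("BC2GM", "DistilBioBERT", 86.6),
    ("BC2GM", "CompactBioBERT", 86.71),
    ("BC5CDR-chemical", "DistilBioBERT", 94.53),
    ("BC5CDR-chemical", "BioDistilBERT", 94.48),
    ("BC5CDR-chemical", "CompactBioBERT", 94.31),
    ("BC5CDR-chemical", "BioMobileBERT", 94.23),
    ("BC5CDR-disease", "BioDistilBERT", 85.61),
    ("BC5CDR-disease", "DistilBioBERT", 85.42),
    ("BC5CDR-disease", "CompactBioBERT", 85.38),
    ("BC5CDR-disease", "BioMobileBERT", 84.62),
    ("JNLPBA", "BioDistilBERT", 79.1),
    ("JNLPBA", "BioMobileBERT", 80.13),
    ("JNLPBA", "DistilBioBERT", 79.97),
    ("JNLPBA", "CompactBioBERT", 79.88),
    ("NCBI Disease", "CompactBioBERT", 88.67),
    ("NCBI Disease", "BioMobileBERT", 87.21),
    ("NCBI Disease", "BioDistilBERT", 87.61),
    ("NCBI Disease", "DistilBioBERT", 87.93),
]


def best_per_dataset(rows):
    """Return {dataset: (model, f1)} for the highest claimed F1 on each dataset."""
    best = {}
    for dataset, model, f1 in rows:
        if dataset not in best or f1 > best[dataset][1]:
            best[dataset] = (model, f1)
    return best


def mean_f1_per_model(rows):
    """Return {model: unweighted mean of its claimed F1 scores across datasets}."""
    scores = defaultdict(list)
    for _, model, f1 in rows:
        scores[model].append(f1)
    return {model: sum(vals) / len(vals) for model, vals in scores.items()}


for dataset, (model, f1) in best_per_dataset(CLAIMED_F1).items():
    print(f"{dataset}: {model} ({f1})")

for model, mean in sorted(mean_f1_per_model(CLAIMED_F1).items(), key=lambda kv: -kv[1]):
    print(f"{model}: {mean:.2f}")
```

On these claimed numbers the four per-model means fall within one F1 point of each other, so the ranking should not be over-interpreted.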
