Cost-effective Deployment of BERT Models in Serverless Environment

2021-06-01 · NAACL 2021

Marek Suppa, Katarína Benešová, Andrej Švec

Abstract

In this study, we demonstrate the viability of deploying BERT-style models to AWS Lambda in a production environment. Since the freely available pre-trained models are too large to be deployed in this environment, we utilize knowledge distillation and fine-tune the models on proprietary datasets for two real-world tasks: sentiment analysis and semantic textual similarity. As a result, we obtain models that are tuned for a specific domain and deployable in the serverless environment. The subsequent performance analysis shows that this solution not only achieves latency levels acceptable for production use but is also a cost-effective alternative to small-to-medium size deployments of BERT models, all without any infrastructure overhead.
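
The abstract does not state which distillation objective was used, so the following is only a minimal sketch of the generic soft-target formulation of knowledge distillation, not the authors' implementation. The function names, the temperature of 2.0, and the weighting factor alpha are assumptions chosen for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical helper: weighted sum of a soft-target term and a hard-label term.

    temperature and alpha are illustrative defaults, not values from the paper.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))
    # Cross-entropy of the student against the ground-truth label (temperature 1)
    ce = -log_softmax(student_logits)[label]
    # The T^2 factor keeps the soft-target term on a scale comparable to ce
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Example: a two-class sentiment task with a confident teacher
loss = distillation_loss(student_logits=[0.2, -0.1],
                         teacher_logits=[2.0, -1.0],
                         label=0)
```

In this formulation the smaller student model is trained to match the teacher's softened output distribution while still fitting the labeled data; when student and teacher logits agree, the soft-target term vanishes and only the cross-entropy term remains.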
