StackMix and Blot Augmentations for Handwritten Text Recognition

2021-08-26 · Code Available

Alex Shonenkov, Denis Karachev, Maxim Novopoltsev, Mark Potanin, Denis Dimitrov

Code

github.com/sberbank-ai/StackMix-OCR
Official · PyTorch · ★ 48

Abstract

This paper proposes a handwritten text recognition (HTR) system that outperforms current state-of-the-art methods. The comparison was carried out on three of the datasets most frequently used for HTR tasks, namely Bentham, IAM, and Saint Gall. In addition, results are provided on two recently presented datasets, Peter the Great's manuscripts and the HKR Dataset. The paper describes the architecture of the neural network and two ways of increasing the volume of training data: an augmentation that simulates strikethrough text (HandWritten Blots) and a new text generation method (StackMix), which proved to be very effective in HTR tasks. StackMix can also be applied to the standalone task of generating handwritten text based on printed text.

Tasks

Data Augmentation, Handwritten Text Recognition, HTR, Text Generation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Bentham	StackMix+Blots	CER	1.73	—	Unverified
Digital Peter	StackMix+Blots	CER	2.5	—	Unverified
HKR	StackMix+Blots	CER	3.49	—	Unverified
IAM-B	StackMix+Blots	CER	3.77	—	Unverified
IAM-D	StackMix+Blots	CER	3.01	—	Unverified
Saint Gall	StackMix+Blots	CER	3.65	—	Unverified
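
For readers who want to check a claimed number against their own model output, the character error rate (CER) in the table can be computed roughly as follows. This is a minimal sketch, not the evaluation code from the official repository: it assumes CER is the total character-level Levenshtein distance divided by the total number of reference characters, reported as a percentage. The function names `edit_distance` and `cer` are illustrative only.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER in percent (assumed convention, see the note above)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    # "kitten" -> "sitting" needs 3 edits over 6 reference characters.
    print(cer(["kitten"], ["sitting"]))  # 50.0
```

Averaging per-line CER instead of pooling edits over the whole corpus gives a different number, so a reproduction should state which convention it uses.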
