Deep Learning Approaches for Image Retrieval and Pattern Spotting in Ancient Documents
Kelly Lais Wiggers, Alceu de Souza Britto Junior, Alessandro Lameiras Koerich, Laurent Heutte, Luiz Eduardo Soares de Oliveira
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
This paper describes two approaches for content-based image retrieval and pattern spotting in document images using deep learning. The first approach uses a pre-trained CNN model to cope with the lack of training data, which is fine-tuned to achieve a compact yet discriminant representation of queries and image candidates. The second approach uses a Siamese Convolution Neural Network trained on a previously prepared subset of image pairs from the ImageNet dataset to provide the similarity-based feature maps. In both methods, the learned representation scheme considers feature maps of different sizes which are evaluated in terms of retrieval performance. A robust experimental protocol using two public datasets (Tobacoo-800 and DocExplore) has shown that the proposed methods compare favorably against state-of-the-art document image retrieval and pattern spotting methods.