To show or not to show: Redacting sensitive text from videos of electronic displays

2022-08-19

Abhishek Mukhopadhyay, Shubham Agarwal, Patrick Dylan Zwick, Pradipta Biswas

Abstract

With the increasing prevalence of video recordings, there is a growing need for tools that can maintain the privacy of those recorded. In this paper, we define an approach for redacting personally identifiable text from videos using a combination of optical character recognition (OCR) and natural language processing (NLP) techniques. We examine the relative performance of this approach when used with different OCR models, specifically Tesseract and the OCR system from Google Cloud Vision (GCV). For the proposed approach, the performance of GCV, in both accuracy and speed, is significantly higher than that of Tesseract. Finally, we explore the advantages and disadvantages of both models in real-world applications.
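As a rough illustration of the OCR-plus-NLP idea described in the abstract, the sketch below redacts one frame given word boxes that an OCR engine would have produced. It is not the authors' implementation: the `WordBox` type, the `is_sensitive` and `redact_frame` helpers, and the regular-expression checks (standing in for the NLP step) are all assumptions made for illustration, and the frame is a plain grid of grayscale values rather than a real video frame.

```python
import re
from dataclasses import dataclass


@dataclass
class WordBox:
    """Hypothetical OCR result: recognized text plus its pixel bounding box."""
    text: str
    x: int
    y: int
    w: int
    h: int


# Assumption: simple patterns stand in for the NLP-based detection of
# personally identifiable text; a real system would use a richer model.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def is_sensitive(text: str) -> bool:
    """Return True if the whole token looks like an email address or phone number."""
    return bool(EMAIL.fullmatch(text) or PHONE.fullmatch(text))


def redact_frame(frame, boxes):
    """Return a copy of `frame` with every sensitive word box filled with 0 (black).

    `frame` is a list of rows of grayscale pixel values; boxes that extend
    past the frame edges are clipped.
    """
    out = [row[:] for row in frame]
    height = len(out)
    width = len(out[0]) if out else 0
    for box in boxes:
        if not is_sensitive(box.text):
            continue
        for yy in range(max(0, box.y), min(height, box.y + box.h)):
            for xx in range(max(0, box.x), min(width, box.x + box.w)):
                out[yy][xx] = 0
    return out
```

In a full pipeline, this per-frame step would be repeated for each video frame, with the word boxes supplied by whichever OCR model is under comparison.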

Tasks

Optical Character Recognition (OCR)
