
Constructing a Public Meeting Corpus

2020-05-01 · LREC 2020

Koji Tanaka, Chenhui Chu, Haolin Ren, Benjamin Renoust, Yuta Nakashima, Noriko Takemura, Hajime Nagahara, Takao Fujikawa

Abstract

In this paper, we propose a full pipeline, from construction to visual exploration, for analyzing a large corpus covering about a century of public meetings in historical Australian newspapers. The corpus construction method is based on image processing and OCR. We digitize and transcribe texts on the specific topic of public meetings. Experiments show that our proposed method achieves an F-score of 87.8% for corpus construction. Using the resulting corpus, we built a content search tool for temporal and semantic content analysis.
