SOTAVerified

Automating the Detection of Poetic Features: The Limerick as Model Organism

2021-11-01EMNLP (LaTeCHCLfL, CLFL, LaTeCH) 2021Unverified0· sign in to hype

Almas Abdibayev, Yohei Igarashi, Allen Riddell, Daniel Rockmore

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

In this paper we take up the problem of “limerick detection” and describe a system to identify five-line poems as limericks or not. This turns out to be a surprisingly difficult challenge with many subtleties. More precisely, we produce an algorithm which focuses on the structural aspects of the limerick – rhyme scheme and rhythm (i.e., stress patterns) – and when tested on a a culled data set of 98,454 publicly available limericks, our “limerick filter” accepts 67% as limericks. The primary failure of our filter is on the detection of “non-standard” rhymes, which we highlight as an outstanding challenge in computational poetics. Our accent detection algorithm proves to be very robust. Our main contributions are (1) a novel rhyme detection algorithm that works on English words including rare proper nouns and made-up words (and thus, words not in the widely used CMUDict database); (2) a novel rhythm-identifying heuristic that is robust to language noise at moderate levels and comparable in accuracy to state-of-the-art scansion algorithms. As a third significant contribution (3) we make publicly available a large corpus of limericks that includes tags of “limerick” or “not-limerick” as determined by our identification software, thereby providing a benchmark for the community. The poetic tasks that we have identified as challenges for machines suggest that the limerick is a useful “model organism” for the study of machine capabilities in poetry and more broadly literature and language. We include a list of open challenges as well. Generally, we anticipate that this work will provide useful material and benchmarks for future explorations in the field.

Tasks

Reproductions