A City of Millions: Mapping Literary Social Networks At Scale
Sil Hamilton, Rebecca M. M. Hicke, David Mimno, Matthew Wilkens
Code: github.com/srhm-ca/pgn
Abstract
We release 70,509 high-quality social networks extracted from multilingual fiction and nonfiction narratives. We additionally provide metadata for 30,000 of these texts (73% nonfiction and 27% fiction) written between 1800 and 1999 in 58 languages. This dataset provides information on historical social worlds at an unprecedented scale, including data for 2,510,021 individuals in 2,805,482 pairwise relationships annotated for affinity and relationship type. We achieve this scale by automating previously manual methods of extracting social networks; specifically, we adapt an existing annotation task as a language model prompt, ensuring consistency at scale with the use of structured output. This dataset serves as a unique resource for humanities and social science research by providing data on cognitive models of social realities.
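To illustrate what "structured output" can buy in a pipeline like this, the following is a minimal sketch of validating a model's JSON response against a fixed schema before accepting it as a network. Everything specific here is an assumption made for illustration: the field names (`characters`, `relationships`, `source`, `target`, `affinity`, `type`), the label sets, and the example characters are hypothetical and are not taken from the paper or its released data.

```python
import json
from dataclasses import dataclass

# Hypothetical label sets; the abstract does not list the dataset's actual categories.
AFFINITIES = {"positive", "neutral", "negative"}
RELATION_TYPES = {"family", "romantic", "professional", "social"}


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    affinity: str
    relation_type: str


def parse_network(raw: str) -> list[Relationship]:
    """Parse a JSON string and reject anything outside the assumed schema."""
    data = json.loads(raw)
    characters = set(data["characters"])
    edges = []
    for item in data["relationships"]:
        edge = Relationship(
            source=item["source"],
            target=item["target"],
            affinity=item["affinity"],
            relation_type=item["type"],
        )
        # Every edge must connect two characters the response itself declared.
        if edge.source not in characters or edge.target not in characters:
            raise ValueError(f"edge refers to an undeclared character: {edge}")
        if edge.affinity not in AFFINITIES:
            raise ValueError(f"unknown affinity label: {edge.affinity}")
        if edge.relation_type not in RELATION_TYPES:
            raise ValueError(f"unknown relationship type: {edge.relation_type}")
        edges.append(edge)
    return edges


# An invented response, standing in for what a language model might return.
example = json.dumps({
    "characters": ["Anna", "Boris", "Clara"],
    "relationships": [
        {"source": "Anna", "target": "Boris", "affinity": "positive", "type": "family"},
        {"source": "Boris", "target": "Clara", "affinity": "negative", "type": "professional"},
    ],
})

edges = parse_network(example)
```

Because every response must pass the same checks, malformed or off-schema outputs can be rejected or retried rather than silently entering the dataset, which is one plausible way a fixed schema helps keep annotations consistent across tens of thousands of texts.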