The Russian Legislative Corpus

2024-06-07

Denis Saveliev, Ruslan Kuchakov

Abstract

We present a comprehensive corpus of Russian primary and secondary legislation covering 1991 to 2023. The corpus collects all 281,413 texts (176,523,268 tokens) of non-secret federal regulations and acts, along with their metadata. It comes in two versions: the original text with minimal preprocessing, and a version prepared for linguistic analysis with morphosyntactic markup.
