A Code-Switching Corpus of Turkish-German Conversations
2017-04-01 · WS 2017
Özlem Çetinoğlu
Abstract
We present a Turkish-German code-switching corpus collected by recording conversations of bilinguals. The recordings are transcribed in two layers, following speech and orthography conventions, and annotated with sentence boundaries and intersentential, intrasentential, and intra-word switch points. The corpus comprises 5 hours of speech, corresponding to 3614 sentences. It aims to serve as a resource for speech or text analysis, as well as a collection for linguistic inquiries.