A Code-Switching Corpus of Turkish-German Conversations

2017-04-01 · WS 2017

Özlem Çetinoğlu

Abstract

We present a Turkish-German code-switching corpus collected by recording conversations of bilinguals. The recordings are transcribed in two layers following speech and orthography conventions, and annotated with sentence boundaries and intersentential, intrasentential, and intra-word switch points. The corpus totals 5 hours of speech, which corresponds to 3614 sentences. It aims to serve as a resource for speech or text analysis, as well as a collection for linguistic inquiries.

Tasks

Automatic Speech Recognition (ASR), Language Identification, Language Modelling, Part-Of-Speech Tagging, Sentence, Sentiment Analysis, Speech Recognition
