
TR-SEQ: Named Entity Recognition Dataset for Turkish Search Engine Queries

2021-09-01 · RANLP 2021

Berkay Topçu, İlknur Durgar El-Kahlout


Abstract

Recognizing named entities in short search engine queries is difficult because such queries carry less contextual information than long sentences. Standard named entity recognition (NER) systems trained on long, grammatically correct sentences perform poorly on such queries. In this study, we share our efforts towards creating a cleaned and labeled dataset of real Turkish search engine queries (TR-SEQ) and introduce an extended label set to meet search engine needs. We train a NER system by applying the state-of-the-art deep learning method BERT to the collected data and report its high performance on search engine queries. Moreover, we compare our results with state-of-the-art Turkish NER systems.
