SOTAVerified

Hitachi at SemEval-2022 Task 2: On the Effectiveness of Span-based Classification Approaches for Multilingual Idiomaticity Detection

2022-07-01SemEval (NAACL) 2022Unverified0· sign in to hype

Atsuki Yamaguchi, Gaku Morio, Hiroaki Ozaki, Yasuhiro Sogawa

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

In this paper, we describe our system for SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding. The task aims at detecting idiomaticity in an input sequence (Subtask A) and modeling representation of sentences that contain potential idiomatic multiword expressions (MWEs) (Subtask B) in three languages. We focus on the zero-shot setting of Subtask A and propose two span-based idiomaticity classification methods: MWE span-based classification and idiomatic MWE span prediction-based classification. We use several cross-lingual pre-trained language models (InfoXLM, XLM-R, and others) as our backbone network. Our best-performing system, fine-tuned with the span-based idiomaticity classification, ranked fifth in the zero-shot setting of Subtask A and exhibited a macro F1 score of 0.7466.

Tasks

Reproductions