A Hybrid Architecture for Out of Domain Intent Detection and Intent Discovery
Masoud Akbari, Ali Mohades, M. Hassan Shirali-Shahreza
Code
- github.com/Makbari1997/VAE-KPCA-HDBSCAN (official implementation, TensorFlow, ★ 11)
Abstract
Intent Detection is one of the tasks of the Natural Language Understanding (NLU) unit in task-oriented dialogue systems. Out of Scope (OOS) and Out of Domain (OOD) inputs can cause these systems to fail. In addition, training an Intent Detection model requires a labeled dataset, and creating one is time-consuming and requires human effort. This article addresses both problems. The task of identifying OOD/OOS inputs is called OOD/OOS Intent Detection, while discovering new intents and pseudo-labeling OOD inputs is known as Intent Discovery. For OOD intent detection, we use a Variational Autoencoder to distinguish between known and unknown intents independently of the input data distribution. An unsupervised clustering method is then used to discover the different unknown intents underlying the OOD/OOS inputs. We also apply non-linear dimensionality reduction to the OOD/OOS representations to make the distances between representations more meaningful for clustering. Our results show that the proposed model achieves strong results and outperforms the baselines on both OOD/OOS Intent Detection and Intent Discovery in English and Persian.
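The abstract does not state the exact rule that separates known from unknown intents. One common way to use an autoencoder for this is to threshold a per-utterance score such as reconstruction loss. The following is a minimal sketch under that assumption: the function names, the mean-plus-two-standard-deviations cutoff, and the scores are all hypothetical and are not taken from the paper or its repository.

```python
import statistics


def fit_threshold(in_domain_scores, num_std=2.0):
    """Pick a cutoff from scores of known-intent validation utterances.

    Assumption: a higher score means the input looks less like the
    training data (for example, a VAE reconstruction loss).
    """
    mean = statistics.fmean(in_domain_scores)
    std = statistics.pstdev(in_domain_scores)
    return mean + num_std * std


def split_by_threshold(scores, threshold):
    """Return (known_indices, ood_indices) for a batch of scored utterances."""
    known, ood = [], []
    for i, score in enumerate(scores):
        if score > threshold:
            ood.append(i)    # treated as OOD/OOS, passed on to clustering
        else:
            known.append(i)  # treated as a known intent
    return known, ood


# Hypothetical scores, made up for illustration only.
validation_scores = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05]
threshold = fit_threshold(validation_scores)
known_idx, ood_idx = split_by_threshold([1.02, 3.4, 0.97, 2.8], threshold)
print(known_idx, ood_idx)
```

In the pipeline the abstract describes, only the utterances flagged as OOD/OOS would go on to the dimensionality reduction and clustering stages.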
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| ATIS | k-PCA + HDBSCAN | ARI | 74.94 | — | Unverified |
| Persian-ATIS | k-PCA + HDBSCAN | ARI | 11.97 | — | Unverified |
| SNIPS | k-PCA + HDBSCAN | ARI | 59.23 | — | Unverified |
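The metric in the table, ARI (Adjusted Rand Index), measures agreement between predicted clusters and gold intent labels, corrected for chance. The values above appear to be on a 0 to 100 scale (ARI multiplied by 100), which is an assumption. The sketch below computes ARI from its standard contingency-table definition using only the Python standard library. The function name and toy labels are illustrative and do not come from the paper's code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treat as perfect agreement

    # Pairs of items that share a cluster in both, in gold only, in predicted only.
    cell_counts = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # both partitions are trivial and identical
    return (sum_cells - expected) / (max_index - expected)


# Toy example: the second gold cluster is split in two by the prediction.
score = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
print(round(100 * score, 2))
```

ARI depends only on how items are grouped, not on the label names, so a clustering that recovers the gold intents under different cluster IDs still scores 1.0.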