A study of the robustness of raw waveform based speaker embeddings under mismatched conditions
2021-10-08
Ge Zhu, Frank Cwitkowitz, Zhiyao Duan
Code
- github.com/gzhu06/tdspkr-mismatch-study (official, in paper, PyTorch)
Abstract
In this paper, we conduct a cross-dataset study of parametric and non-parametric raw-waveform-based speaker embeddings through speaker verification experiments. In general, we observe more severe performance degradation in these raw-waveform systems than in spectral-based systems. We then propose two strategies to improve the performance of raw-waveform-based systems on cross-dataset tests. The first is to replace the real-valued filters with analytic filters to ensure shift-invariance. The second is to apply variational dropout to the non-parametric filters to prevent them from overfitting to irrelevant nuance features.
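To illustrate why analytic filters help with shift-invariance, here is a minimal NumPy sketch. It is not taken from the paper or its repository. It assumes the standard FFT-based construction of an analytic signal (zeroing the negative-frequency half of the spectrum), and the helper name `make_analytic`, the 64-tap Hann-windowed cosine filter, and the test tone are all hypothetical choices made for illustration. The sketch compares how much the response magnitude of a real filter and of its analytic counterpart changes when the input tone is shifted by a few samples.

```python
import numpy as np

def make_analytic(real_filter):
    """Return a complex filter whose negative-frequency content is zeroed.

    Illustrative helper (not from the paper): the real part of the
    result equals the input filter, the imaginary part is its
    Hilbert transform.
    """
    n = len(real_filter)
    spectrum = np.fft.fft(real_filter)
    weights = np.zeros(n)
    weights[0] = 1.0                       # keep the DC bin once
    if n % 2 == 0:
        weights[n // 2] = 1.0              # keep the Nyquist bin once
        weights[1:n // 2] = 2.0            # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

# Hypothetical toy filter: Hann-windowed cosine, 8 cycles over 64 taps.
n, cycles = 64, 8
t = np.arange(n)
real_filter = np.hanning(n) * np.cos(2 * np.pi * cycles * t / n)
analytic_filter = make_analytic(real_filter)

# Shift a tone at the filter's centre frequency across one full period
# (n / cycles = 8 samples) and record the response magnitude each time.
real_mag, analytic_mag = [], []
for shift in range(n // cycles):
    tone = np.cos(2 * np.pi * cycles * (t + shift) / n)
    real_mag.append(abs(np.sum(real_filter * tone)))
    analytic_mag.append(abs(np.sum(analytic_filter * tone)))

real_spread = max(real_mag) - min(real_mag)
analytic_spread = max(analytic_mag) - min(analytic_mag)
print(f"real filter magnitude spread:     {real_spread:.4f}")
print(f"analytic filter magnitude spread: {analytic_spread:.2e}")
```

In this toy setting the real filter's response magnitude depends strongly on the phase of the input, while the modulus of the analytic filter's response stays constant up to floating-point error. A learned front end would apply the same idea to its convolutional kernels and take the modulus of the complex output; how the paper does this in detail is not specified in the abstract.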