The PyTorch-Kaldi Speech Recognition Toolkit

2018-11-19

Mirco Ravanelli, Titouan Parcollet, Yoshua Bengio

Code

github.com/wponghiran/imp-snns-for-sl
pytorch★ 11
github.com/walterheymans/pytorch-kaldi-gan
pytorch★ 0
github.com/ayyucedemirbas/gcommands_12_classes
pytorch★ 0
github.com/xpz123/pytorch-kaldi
pytorch★ 0
github.com/Baileyswu/pytorch-hmm-vae
pytorch★ 0
github.com/amikey/pytorch-kaldi
pytorch★ 0
github.com/PeiyanFlying/https-github.com-PeiyanFlying-pytorch-kaldi
pytorch★ 0
github.com/yrhvivian/pytorch-kaldi
pytorch★ 0
github.com/Dahee96/Seq2seq-
pytorch★ 0
github.com/NOEPG/pytorch-kaldi
pytorch★ 0

Abstract

The availability of open-source software is playing a remarkable role in the popularization of speech recognition and deep learning. Kaldi, for instance, is now an established framework used to develop state-of-the-art speech recognizers. PyTorch is used to build neural networks with the Python language and has recently spawned tremendous interest within the machine learning community thanks to its simplicity and flexibility. The PyTorch-Kaldi project aims to bridge the gap between these popular toolkits, trying to inherit the efficiency of Kaldi and the flexibility of PyTorch. PyTorch-Kaldi is not only a simple interface between these toolkits; it also embeds several useful features for developing modern speech recognizers. For instance, the code is specifically designed so that user-defined acoustic models can be plugged in naturally. As an alternative, users can exploit several pre-implemented neural networks that can be customized using intuitive configuration files. PyTorch-Kaldi supports multiple feature and label streams as well as combinations of neural networks, enabling the use of complex neural architectures. The toolkit is publicly released along with rich documentation and is designed to work properly locally or on HPC clusters. Experiments conducted on several datasets and tasks show that PyTorch-Kaldi can effectively be used to develop modern state-of-the-art speech recognizers.

Tasks

Distant Speech Recognition, Noisy Speech Recognition, Speech Recognition

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
LibriSpeech test-clean	Li-GRU	Word Error Rate (WER)	6.2	—	Unverified
TIMIT	LSTM + Dropout + BatchNorm + Monophone Reg	Percentage error	14.5	—	Unverified
TIMIT	GRU + Dropout + BatchNorm + Monophone Reg	Percentage error	14.9	—	Unverified
TIMIT	RNN + Dropout + BatchNorm + Monophone Reg	Percentage error	15.9	—	Unverified
TIMIT	Li-GRU + Dropout + BatchNorm + Monophone Reg	Percentage error	14.2	—	Unverified
TIMIT	Li-GRU	Percentage error	16.3	—	Unverified
TIMIT	RNN	Percentage error	16.5	—	Unverified
TIMIT	GRU	Percentage error	16.6	—	Unverified
TIMIT	LSTM	Percentage error	16	—	Unverified
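
For readers unfamiliar with the metric in the table, word error rate is conventionally the word-level edit distance (substitutions, insertions, deletions) between a reference transcript and a hypothesis, divided by the number of reference words and expressed as a percentage. The sketch below is a minimal illustration of that standard definition, not the scoring code used by the toolkit; the function names `edit_distance` and `word_error_rate` are hypothetical. The TIMIT rows presumably report a phone-level error rate, which would use the same computation over phone tokens instead of words.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j tokens of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Return WER in percent for two whitespace-tokenized transcripts."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Illustrative sentences only; they are not taken from any dataset above.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, the value can exceed 100 when the hypothesis contains many insertions.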
