wav2letter++: The Fastest Open-source Speech Recognition System
Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert
Code
- github.com/facebookresearch/wav2letter ★ 6,445
Abstract
This paper introduces wav2letter++, the fastest open-source deep learning speech recognition framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for maximum efficiency. Here we explain the architecture and design of the wav2letter++ system and compare it to other major open-source speech recognition systems. In some cases wav2letter++ is more than 2x faster than other optimized frameworks for training end-to-end neural networks for speech recognition. We also show that wav2letter++'s training times scale linearly to 64 GPUs, the highest we tested, for models with 100 million parameters. High-performance frameworks enable fast iteration, which is often a crucial factor in successful research and model tuning on new datasets and tasks.