Librispeech ASR model

The following models are provided: (i) a TDNN-F chain model based on the tdnn_1d_sp recipe, trained on 960h of Librispeech data with 3x speed perturbation; (ii) RNNLM language models trained on the Librispeech training transcriptions; and (iii) an i-vector extractor trained on a 200h subset of the data. We don't additionally include an LM since it can be prepared easily using the package available on OpenSLR. A tutorial on using the pre-trained model on your own data (with WSJ as an example) can be found here.

Librispeech ASR Chain 1d

Date: 2020-02-03
Uploader: Desh Raj
Recipe: egs/librispeech/s5
Kaldi Version: ea6e1b7
Model Type: Speech Recognition, Factored TDNN, Chain
Error Rate: WER 3.76% on test-clean, 8.92% on test-other
Notes: Reported WER is after rescoring with large 4-gram LM (fglarge).
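
For readers unfamiliar with the metric, WER is the number of word substitutions, deletions and insertions needed to turn the hypothesis into the reference, divided by the number of reference words and accumulated over the whole test set. The following is a minimal Python sketch of that definition. The function names `edit_distance` and `corpus_wer` are hypothetical, and this is not the scoring code that produced the figures above.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two word sequences."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def corpus_wer(pairs: List[Tuple[str, str]]) -> float:
    """WER over (reference, hypothesis) transcript pairs, pooled over all utterances."""
    errors = 0
    ref_words = 0
    for reference, hypothesis in pairs:
        ref = reference.split()
        hyp = hypothesis.split()
        errors += edit_distance(ref, hyp)
        ref_words += len(ref)
    if ref_words == 0:
        raise ValueError("reference transcripts contain no words")
    return errors / ref_words


if __name__ == "__main__":
    example = [("the cat sat", "the cat sat"), ("a b c d", "a x c")]
    print(f"WER: {100 * corpus_wer(example):.2f}%")
```

Pooling errors over all utterances, rather than averaging per-utterance rates, keeps long and short utterances weighted by their word counts.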

Librispeech language models

Date: 2020-12-27
Uploader: Ke Li
Recipe: egs/librispeech/s5
Model Type: Pruned 3-gram, RNNLM
Error Rate: Perplexity 110.7 on dev (for RNNLM)
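
Perplexity is the exponential of the average negative log-probability the model assigns to each word of the evaluation text, so a value of 110.7 means the model is, on average, about as uncertain as a uniform choice among roughly 111 words. The sketch below illustrates the definition only. The `perplexity` helper is hypothetical and assumes natural-log word probabilities as input; it is not how the figure above was computed.

```python
import math
from typing import Sequence


def perplexity(log_probs: Sequence[float]) -> float:
    """Perplexity from per-word natural-log probabilities."""
    if not log_probs:
        raise ValueError("need at least one word probability")
    average_neg_log_prob = -sum(log_probs) / len(log_probs)
    return math.exp(average_neg_log_prob)


if __name__ == "__main__":
    # A model that always assigns probability 1/110.7 to the correct word
    # has perplexity 110.7 (up to floating-point rounding).
    uniform = [math.log(1 / 110.7)] * 1000
    print(round(perplexity(uniform), 1))
```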

Librispeech i-vector extractor

Date: 2020-02-03
Uploader: Desh Raj
Recipe: egs/librispeech/s5
Kaldi Version: ea6e1b7
Model Type: Speaker ID, i-vector