CHiME-6 Models
This resource contains pretrained models for the CHiME-6 challenge, including models for the baseline and for the JHU-CLSP submission. The speech activity detection (SAD), speaker diarization, and automatic speech recognition (ASR) components each have their own package. For questions about this resource, contact CHiME-6 organizer Shinji Watanabe (email: shinjiw@ieee.org).
CHiME-6 SAD system
- Date: 2019-11-18
- Uploader: Desh Raj
- Recipe: None
- Model Type: Speech Activity Detection (SAD), TDNN + stats pooling
- Error Rate: 5.1% error rate on CHiME-6 dev
- Notes: Trained on the CHiME-6 training data
CHiME-6 Baseline Diarization system
- Date: 2019-11-18
- Uploader: David Snyder
- Recipe: None
- Model Type: Diarization, x-vector
- Error Rate: 61.6% DER on CHiME-6 dev (using baseline SAD)
- Notes: Extractor trained on reverberated VoxCeleb; backend trained on CHiME-6 training data
CHiME-6 Baseline ASR system
- Date: 2019-11-18
- Uploader: Ashish Arora
- Recipe: None
- Model Type: ASR, TDNN-F, chain
- Error Rate: 51.8% WER on CHiME-6 dev (track 1 conditions)
- Notes: Trained on CHiME-6 training data with augmentations
JHU-CLSP Diarization system
- Date: 2021-04-14
- Uploader: Desh Raj
- Recipe: None
- Model Type: Diarization, x-vector, i-vector, VB resegmentation
- Error Rate: 51.0% DER on CHiME-6 dev (using baseline SAD)
- Notes: Same x-vector extractor as the baseline; the i-vector extractor for VB resegmentation is trained on challenge data (see stage -1 in s5b_track2/run.sh)
JHU-CLSP ASR system
- Date: 2019-04-14
- Uploader: Ashish Arora
- Recipe: None
- Model Type: ASR, TDNN-F, chain, RNNLM
- Error Rate: 43.3% WER on CHiME-6 dev (track 1 conditions)
- Notes: CNN-TDNN-F model trained with augmentations, with RNNLM rescoring
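Comparing the systems listed
The baseline and JHU-CLSP entries report error rates under the same conditions (baseline SAD for diarization, track 1 for ASR), so the relative gains can be worked out directly. The sketch below is illustrative only: the numbers are copied from the entries above, while the dictionary layout and the helper name `relative_reduction` are assumptions made for this example and are not part of any model package.

```python
# Dev-set error rates (%) copied from the entries on this page.
# The structure and names here are hypothetical, for illustration only.
reported = {
    "diarization (DER)": {"baseline": 61.6, "jhu_clsp": 51.0},
    "ASR (WER)": {"baseline": 51.8, "jhu_clsp": 43.3},
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative error reduction, in percent, of improved over baseline."""
    return 100.0 * (baseline - improved) / baseline


for task, rates in reported.items():
    rel = relative_reduction(rates["baseline"], rates["jhu_clsp"])
    print(
        f"{task}: {rates['baseline']}% -> {rates['jhu_clsp']}% "
        f"({rel:.1f}% relative reduction)"
    )
```

Under these figures, the JHU-CLSP systems reduce DER by roughly 17% relative and WER by roughly 16% relative compared with the baseline.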