Chime 6 Models

This resource contains pretrained models for the Chime 6 challenge, covering both the baseline system and the JHU-CLSP submission. Separate packages are provided for the speech activity detection (SAD), speaker diarization, and automatic speech recognition (ASR) components. Direct questions about this resource to Chime 6 organizer Shinji Watanabe (email: shinjiw@ieee.org).

Chime 6 SAD system

Download

Date: 2019-11-18
Uploader: Desh Raj
Recipe: None
Model Type: Speech Activity Detection (SAD), TDNN + Stats pooling
Error Rate: 5.1% on Chime 6 dev
Notes: Trained on the Chime 6 training data

Chime 6 Baseline Diarization system

Download

Date: 2019-11-18
Uploader: David Snyder
Recipe: None
Model Type: Diarization, x-vector
Error Rate: 61.6% DER on Chime 6 dev (using baseline SAD)
Notes: Extractor trained on reverberated VoxCeleb; backend trained on Chime 6 training data
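
For readers unfamiliar with the metric, the DER figures on this page follow the usual definition: the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below only illustrates that arithmetic. The function name and the input durations are hypothetical, and it is not the challenge scoring tool, which also handles speaker mapping and overlapping speech.

```python
def diarization_error_rate(missed: float, false_alarm: float,
                           confusion: float, total_speech: float) -> float:
    """Return DER as a fraction of total reference speech time.

    All arguments are durations in seconds. Hypothetical helper for
    illustration only; it is not part of any Chime 6 recipe.
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    # DER = (missed speech + false alarm + speaker confusion) / reference speech
    return (missed + false_alarm + confusion) / total_speech


# Hypothetical durations, not taken from any Chime 6 result.
der = diarization_error_rate(missed=6.0, false_alarm=4.0,
                             confusion=10.0, total_speech=100.0)
print(f"DER: {der:.1%}")
```

Because false alarms are counted in the numerator but not in the denominator, DER can exceed 100% on difficult recordings.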

Chime 6 Baseline ASR system

Download

Date: 2019-11-18
Uploader: Ashish Arora
Recipe: None
Model Type: ASR, TDNN-F, chain
Error Rate: 51.8% WER on Chime 6 dev (track 1 conditions)
Notes: Trained on Chime 6 training data with augmentations
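
The WER figures on this page are word-level edit distances (substitutions, deletions, and insertions) divided by the number of reference words. The following is a minimal sketch of that computation, assuming whitespace-tokenized transcripts. The function name and the example sentences are hypothetical, and the challenge scoring applies its own text normalization that is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Hypothetical helper for illustration; not the Chime 6 scoring script.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substitution and one deletion over six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {wer:.1%}")
```

Insertions count as errors but do not lengthen the reference, so WER is not bounded by 100%.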

JHU-CLSP Diarization system

Download

Date: 2021-04-14
Uploader: Desh Raj
Recipe: None
Model Type: Diarization, x-vector, i-vector, VB resegmentation
Error Rate: 51.0% DER on Chime 6 dev (using baseline SAD)
Notes: Uses the same x-vector extractor as the baseline; the i-vector extractor for VB resegmentation was trained on challenge data (see stage -1 in s5b_track2/run.sh)

JHU-CLSP ASR system

Download

Date: 2019-04-14
Uploader: Ashish Arora
Recipe: None
Model Type: ASR, TDNN-F, chain, RNNLM
Error Rate: 43.3% WER on Chime 6 dev (track 1 conditions)
Notes: CNN-TDNN-F model trained with augmentations; hypotheses rescored with an RNNLM