Class for the accumulators required to update the speaker vectors v_s.
#include <estimate-am-sgmm2.h>
Public Member Functions | |
MleSgmm2SpeakerAccs (const AmSgmm2 &model, BaseFloat rand_prune_=1.0e-05) | |
Initialize the object. Error if speaker subspace not set up. | |
void | Clear () |
Clear the statistics. | |
BaseFloat | Accumulate (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, int32 pdf_index, BaseFloat weight, Sgmm2PerSpkDerivedVars *spk_vars) |
Accumulate statistics. Returns per-frame log-likelihood. | |
BaseFloat | AccumulateFromPosteriors (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, const Matrix< BaseFloat > &posteriors, int32 pdf_index, Sgmm2PerSpkDerivedVars *spk_vars) |
Accumulate statistics, given posteriors. | |
void | Update (const AmSgmm2 &model, BaseFloat min_count, Vector< BaseFloat > *v_s, BaseFloat *objf_impr_out, BaseFloat *count_out) |
Update speaker vector. | |
Private Member Functions | |
void | UpdateNoU (Vector< BaseFloat > *v_s, BaseFloat *objf_impr_out, BaseFloat *count_out) |
void | UpdateWithU (const AmSgmm2 &model, Vector< BaseFloat > *v_s, BaseFloat *objf_impr_out, BaseFloat *count_out) |
Private Attributes | |
Vector< double > | y_s_ |
Statistics for speaker adaptation (vectors), stored per-speaker. |
Vector< double > | gamma_s_ |
gamma_{i}^{(s)}. Per-speaker counts for each Gaussian. Dimension is [I] |
Vector< double > | a_s_ |
a_i^{(s)}. For SSGMM. |
std::vector< SpMatrix< double > > | H_spk_ |
The following variable does not change per speaker; it just relates to the speaker subspace. |
std::vector< Matrix< double > > | NtransSigmaInv_ |
N_i^T \Sigma_{i}^{-1}. Needed for y^{(s)}. |
BaseFloat | rand_prune_ |
small constant to randomly prune tiny posteriors |
Class for the accumulators required to update the speaker vectors v_s.
Note: if you have multiple speakers you will want to initialize this just once and call Clear() after you're done with each speaker, rather than creating a new object for each speaker, since the initialization function does nontrivial work.
Definition at line 354 of file estimate-am-sgmm2.h.
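The reuse pattern described in the note above can be sketched as follows. This is a minimal, self-contained illustration, not Kaldi code: ToySpeakerAccs and CountsPerSpeaker are hypothetical stand-ins that only mimic the shape of the real class (a constructor that does the setup once, a cheap Clear(), and per-frame accumulation).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-speaker accumulator; NOT the Kaldi class.
class ToySpeakerAccs {
 public:
  // The constructor stands in for the nontrivial, speaker-independent setup.
  explicit ToySpeakerAccs(std::size_t num_gauss) : gamma_s_(num_gauss, 0.0) {}

  // Zero the per-speaker statistics while keeping the allocated storage.
  void Clear() { std::fill(gamma_s_.begin(), gamma_s_.end(), 0.0); }

  // Add a weighted count for one Gaussian index.
  void Accumulate(std::size_t gauss_index, double weight) {
    assert(gauss_index < gamma_s_.size());
    gamma_s_[gauss_index] += weight;
  }

  // Total count accumulated since the last Clear().
  double TotalCount() const {
    return std::accumulate(gamma_s_.begin(), gamma_s_.end(), 0.0);
  }

 private:
  std::vector<double> gamma_s_;  // per-speaker counts, one per Gaussian
};

// Each speaker is a list of (Gaussian index, weight) pairs (toy data layout).
typedef std::vector<std::pair<std::size_t, double> > ToyFrames;

// Processes all speakers with ONE accumulator, clearing it between speakers.
std::vector<double> CountsPerSpeaker(const std::vector<ToyFrames> &speakers,
                                     std::size_t num_gauss) {
  ToySpeakerAccs accs(num_gauss);  // constructed once, outside the loop
  std::vector<double> totals;
  for (std::size_t s = 0; s < speakers.size(); ++s) {
    for (std::size_t t = 0; t < speakers[s].size(); ++t)
      accs.Accumulate(speakers[s][t].first, speakers[s][t].second);
    totals.push_back(accs.TotalCount());
    accs.Clear();  // reset the statistics before the next speaker
  }
  return totals;
}
```

With the real class the loop has the same shape: construct MleSgmm2SpeakerAccs once from the model, accumulate over one speaker's frames, call Update(), then Clear() before moving on.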
MleSgmm2SpeakerAccs | ( | const AmSgmm2 & | model, |
BaseFloat | rand_prune_ = 1.0e-05 |
||
) |
Initialize the object. Error if speaker subspace not set up.
Definition at line 1713 of file estimate-am-sgmm2.cc.
References MleSgmm2SpeakerAccs::a_s_, MleSgmm2SpeakerAccs::gamma_s_, AmSgmm2::GetNtransSigmaInv(), MleSgmm2SpeakerAccs::H_spk_, AmSgmm2::HasSpeakerDependentWeights(), rnnlm::i, KALDI_ASSERT, kaldi::kTrans, AmSgmm2::N_, MleSgmm2SpeakerAccs::NtransSigmaInv_, AmSgmm2::NumGauss(), Vector< Real >::Resize(), AmSgmm2::SigmaInv_, AmSgmm2::SpkSpaceDim(), and MleSgmm2SpeakerAccs::y_s_.
BaseFloat Accumulate | ( | const AmSgmm2 & | model, |
const Sgmm2PerFrameDerivedVars & | frame_vars, | ||
int32 | pdf_index, | ||
BaseFloat | weight, | ||
Sgmm2PerSpkDerivedVars * | spk_vars | ||
) |
Accumulate statistics. Returns per-frame log-likelihood.
Definition at line 1740 of file estimate-am-sgmm2.cc.
References MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), AmSgmm2::ComponentPosteriors(), and MatrixBase< Real >::Scale().
Referenced by kaldi::AccumulateForUtterance().
BaseFloat AccumulateFromPosteriors | ( | const AmSgmm2 & | model, |
const Sgmm2PerFrameDerivedVars & | frame_vars, | ||
const Matrix< BaseFloat > & | posteriors, | ||
int32 | pdf_index, | ||
Sgmm2PerSpkDerivedVars * | spk_vars | ||
) |
Accumulate statistics, given posteriors.
Returns total count accumulated, which may differ from posteriors.Sum() due to randomized pruning.
Definition at line 1755 of file estimate-am-sgmm2.cc.
References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::AddMatVec(), VectorBase< Real >::AddVec(), VectorBase< Real >::Dim(), AmSgmm2::FeatureDim(), MleSgmm2SpeakerAccs::gamma_s_, AmSgmm2::GetDjms(), AmSgmm2::GetSubstateMean(), Sgmm2PerFrameDerivedVars::gselect, rnnlm::i, KALDI_ASSERT, kaldi::kNoTrans, MleSgmm2SpeakerAccs::NtransSigmaInv_, AmSgmm2::NumSubstatesForPdf(), AmSgmm2::Pdf2Group(), MleSgmm2SpeakerAccs::rand_prune_, kaldi::RandPrune(), AmSgmm2::SpkSpaceDim(), AmSgmm2::w_jmi_, Sgmm2PerFrameDerivedVars::xt, and MleSgmm2SpeakerAccs::y_s_.
Referenced by MleSgmm2SpeakerAccs::Accumulate(), and kaldi::AccumulateForUtterance().
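To see why the returned count may differ from posteriors.Sum(), it helps to look at what a randomized pruning step can do. The sketch below is an assumption about how such a scheme typically works (tiny posteriors are either rounded up to the threshold or dropped to zero, with a probability chosen so the expected value is preserved); RandPruneSketch is a hypothetical name, and the code is not copied from kaldi::RandPrune().

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Illustrative randomized pruning of a single posterior (hypothetical helper).
// Posteriors at or above the threshold pass through unchanged. Smaller ones
// become +/- prune_thresh with probability |post| / prune_thresh, else 0, so
// the expected value of the result equals the input.
double RandPruneSketch(double post, double prune_thresh, std::mt19937 *rng) {
  assert(prune_thresh >= 0.0 && rng != nullptr);
  if (post == 0.0 || std::fabs(post) >= prune_thresh) return post;
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  const double keep_prob = std::fabs(post) / prune_thresh;
  const double magnitude = (uniform(*rng) <= keep_prob) ? prune_thresh : 0.0;
  return post >= 0.0 ? magnitude : -magnitude;
}
```

Under this scheme any single frame can contribute slightly more or less than its true posterior mass, which is consistent with the caveat above about the total count; the benefit is that very small posteriors usually become exact zeros and their updates can be skipped.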
void Clear | ( | ) |
Clear the statistics.
Definition at line 1733 of file estimate-am-sgmm2.cc.
References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, VectorBase< Real >::SetZero(), and MleSgmm2SpeakerAccs::y_s_.
Referenced by main().
void Update | ( | const AmSgmm2 & | model, |
BaseFloat | min_count, | ||
Vector< BaseFloat > * | v_s, | ||
BaseFloat * | objf_impr_out, | ||
BaseFloat * | count_out | ||
) |
Update speaker vector.
If v_s was empty, will assume it started as zero and will resize it to the speaker-subspace size.
Definition at line 1805 of file estimate-am-sgmm2.cc.
References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, KALDI_WARN, VectorBase< Real >::Sum(), MleSgmm2SpeakerAccs::UpdateNoU(), and MleSgmm2SpeakerAccs::UpdateWithU().
Referenced by main().
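void UpdateNoU | ( | Vector< BaseFloat > * | v_s, |
BaseFloat * | objf_impr_out, | ||
BaseFloat * | count_out | ||
) |
private |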
Definition at line 1826 of file estimate-am-sgmm2.cc.
References SpMatrix< Real >::AddSp(), VectorBase< Real >::CopyFromVec(), VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, MleSgmm2SpeakerAccs::H_spk_, rnnlm::i, KALDI_ASSERT, KALDI_LOG, Vector< Real >::Resize(), kaldi::SolveQuadraticProblem(), VectorBase< Real >::Sum(), and MleSgmm2SpeakerAccs::y_s_.
Referenced by MleSgmm2SpeakerAccs::Update().
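void UpdateWithU | ( | const AmSgmm2 & | model, |
Vector< BaseFloat > * | v_s, | ||
BaseFloat * | objf_impr_out, | ||
BaseFloat * | count_out | ||
) |
private |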
Definition at line 1858 of file estimate-am-sgmm2.cc.
References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::Add(), VectorBase< Real >::AddMatVec(), SpMatrix< Real >::AddSp(), VectorBase< Real >::AddSpVec(), VectorBase< Real >::AddVec(), SpMatrix< Real >::AddVec2(), VectorBase< Real >::ApplyExp(), VectorBase< Real >::ApplyLog(), kaldi::ApproxEqual(), VectorBase< Real >::CopyFromVec(), VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, MleSgmm2SpeakerAccs::H_spk_, rnnlm::i, KALDI_ASSERT, KALDI_LOG, KALDI_WARN, kaldi::kNoTrans, VectorBase< Real >::LogSumExp(), Vector< Real >::Resize(), MatrixBase< Real >::Row(), VectorBase< Real >::Scale(), kaldi::SolveQuadraticProblem(), VectorBase< Real >::Sum(), AmSgmm2::u_, kaldi::VecSpVec(), kaldi::VecVec(), and MleSgmm2SpeakerAccs::y_s_.
Referenced by MleSgmm2SpeakerAccs::Update().
Vector< double > a_s_ |
private |
a_i^{(s)}. For SSGMM.
Definition at line 406 of file estimate-am-sgmm2.h.
Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), MleSgmm2SpeakerAccs::Clear(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleSgmm2SpeakerAccs::Update(), and MleSgmm2SpeakerAccs::UpdateWithU().
Vector< double > gamma_s_ |
private |
gamma_{i}^{(s)}. Per-speaker counts for each Gaussian. Dimension is [I]
Definition at line 404 of file estimate-am-sgmm2.h.
Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), MleSgmm2SpeakerAccs::Clear(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleSgmm2SpeakerAccs::Update(), MleSgmm2SpeakerAccs::UpdateNoU(), MleSgmm2SpeakerAccs::UpdateWithU(), and MleAmSgmm2Accs::~MleAmSgmm2Accs().
std::vector< SpMatrix< double > > H_spk_ |
private |
The following variable does not change per speaker; it just relates to the speaker subspace.
Eq. (82): H_{i}^{spk} = N_{i}^T \Sigma_{i}^{-1} N_{i}
Definition at line 411 of file estimate-am-sgmm2.h.
Referenced by MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleSgmm2SpeakerAccs::UpdateNoU(), and MleSgmm2SpeakerAccs::UpdateWithU().
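As a rough guide to how H_spk_ enters the update, the relations below sketch the speaker-vector estimate when there are no speaker-dependent weights. This is inferred from the members that UpdateNoU() references (gamma_s_, H_spk_, y_s_, SpMatrix::AddSp() and kaldi::SolveQuadraticProblem()) and should be read as an assumption rather than a transcription of the code; H^{(s)} and \hat{v}^{(s)} are notation introduced here.

```latex
H_{i}^{spk} = N_{i}^T \Sigma_{i}^{-1} N_{i}
\qquad
H^{(s)} = \sum_{i=1}^{I} \gamma_{i}^{(s)} H_{i}^{spk}

\hat{v}^{(s)} = \arg\max_{v} \left( v^T y^{(s)} - \tfrac{1}{2}\, v^T H^{(s)} v \right)
\quad\Longrightarrow\quad
\hat{v}^{(s)} = \left( H^{(s)} \right)^{-1} y^{(s)} \ \text{ when } H^{(s)} \text{ is invertible}
```

Because H_{i}^{spk} depends only on the model, it can be computed once in the constructor, while gamma_s_ and y_s_ are the per-speaker quantities that Clear() resets.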
std::vector< Matrix< double > > NtransSigmaInv_ |
private |
N_i^T \Sigma_{i}^{-1}. Needed for y^{(s)}.
Definition at line 414 of file estimate-am-sgmm2.h.
Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), and MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs().
BaseFloat rand_prune_ |
private |
small constant to randomly prune tiny posteriors
Definition at line 417 of file estimate-am-sgmm2.h.
Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors().
Vector< double > y_s_ |
private |
Statistics for speaker adaptation (vectors), stored per-speaker.
Per-speaker stats for vectors, y^{(s)}. Dimension [T].
Definition at line 402 of file estimate-am-sgmm2.h.
Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), MleSgmm2SpeakerAccs::Clear(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleSgmm2SpeakerAccs::UpdateNoU(), and MleSgmm2SpeakerAccs::UpdateWithU().