MleAmSgmm2Accs Class Reference

Class for the accumulators associated with the phonetic-subspace model parameters.

#include <estimate-am-sgmm2.h>

Public Member Functions

 MleAmSgmm2Accs (BaseFloat rand_prune=1.0e-05)
 
 MleAmSgmm2Accs (const AmSgmm2 &model, SgmmUpdateFlagsType flags, bool have_spk_vecs, BaseFloat rand_prune=1.0e-05)
 
 ~MleAmSgmm2Accs ()
 
void Read (std::istream &in_stream, bool binary, bool add)
 
void Write (std::ostream &out_stream, bool binary) const
 
void Check (const AmSgmm2 &model, bool show_properties=true) const
 Checks the various accumulators for correct sizes given a model.
 
void ResizeAccumulators (const AmSgmm2 &model, SgmmUpdateFlagsType flags, bool have_spk_vecs)
 Resizes the accumulators to the correct sizes given the model.
 
BaseFloat Accumulate (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, int32 pdf_index, BaseFloat weight, Sgmm2PerSpkDerivedVars *spk_vars)
 Returns the log-likelihood.
 
BaseFloat AccumulateFromPosteriors (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, const Matrix< BaseFloat > &posteriors, int32 pdf_index, Sgmm2PerSpkDerivedVars *spk_vars)
 Returns count accumulated (may differ from posteriors.Sum() due to weight pruning).
 
void CommitStatsForSpk (const AmSgmm2 &model, const Sgmm2PerSpkDerivedVars &spk_vars)
 Accumulates global stats for the current speaker (if applicable).
 
void GetStateOccupancies (Vector< BaseFloat > *occs) const
 Gets the per-pdf occupancies (for each pdf, the sum of gamma_c_ over its substates).
 
int32 FeatureDim () const
 
int32 PhoneSpaceDim () const
 
int32 NumPdfs () const
 
int32 NumGroups () const
 
int32 NumGauss () const
 

Private Member Functions

 KALDI_DISALLOW_COPY_AND_ASSIGN (MleAmSgmm2Accs)
 

Private Attributes

std::vector< Matrix< double > > Y_
 The stats which are not tied to any state.
 
std::vector< Matrix< double > > Z_
 Stats Z_{i} for speaker-subspace projections N. Dim is [I][D][T].
 
std::vector< SpMatrix< double > > R_
 R_{i}, quadratic term for speaker subspace estimation. Dim is [I][T][T].
 
std::vector< SpMatrix< double > > S_
 S_{i}^{-}, scatter of adapted feature vectors x_{i}(t). Dim is [I][D][D].
 
std::vector< Matrix< double > > y_
 The SGMM state specific stats.
 
std::vector< Matrix< double > > gamma_
 Gaussian occupancies gamma_{jmi} for each substate and Gaussian index, pooled over groups.
 
std::vector< Matrix< double > > a_
 [SSGMM] These a_{jmi} quantities are dimensionally the same as the gamma quantities.
 
Matrix< double > t_
 [SSGMM] Each row is one of the t_i quantities in the less-exact version of the SSGMM update for the speaker weight projections.
 
Vector< double > a_s_
 [SSGMM] This is a per-speaker variable storing the a_i^{(s)} quantities that we will use in order to compute the non-speaker-specific quantities [see eqs. ...]
 
std::vector< SpMatrix< double > > U_
 The U_i quantities from the less-exact version of the SSGMM update for the speaker weight projections.
 
std::vector< Vector< double > > gamma_c_
 Sub-state occupancies gamma_{jm}^{(c)} for each sub-state.
 
Vector< double > gamma_s_
 gamma_{i}^{(s)}.
 
double total_frames_
 
double total_like_
 
int32 feature_dim_
 Dimensionality of various subspaces.
 
int32 phn_space_dim_
 
int32 spk_space_dim_
 
int32 num_gaussians_
 
int32 num_pdfs_
 
int32 num_groups_
 Other model specifications.
 
BaseFloat rand_prune_
 

Friends

class MleAmSgmm2Updater
 
class EbwAmSgmm2Updater
 

Detailed Description

Class for the accumulators associated with the phonetic-subspace model parameters.

Definition at line 119 of file estimate-am-sgmm2.h.

Constructor & Destructor Documentation

◆ MleAmSgmm2Accs() [1/2]

MleAmSgmm2Accs ( BaseFloat  rand_prune = 1.0e-05)
inline explicit

Definition at line 121 of file estimate-am-sgmm2.h.

122  : total_frames_(0.0), total_like_(0.0), feature_dim_(0),
124  num_pdfs_(0), num_groups_(0), rand_prune_(rand_prune) {}

◆ MleAmSgmm2Accs() [2/2]

MleAmSgmm2Accs ( const AmSgmm2 &  model,
SgmmUpdateFlagsType  flags,
bool  have_spk_vecs,
BaseFloat  rand_prune = 1.0e-05 
)
inline

Definition at line 126 of file estimate-am-sgmm2.h.

129  : total_frames_(0.0), total_like_(0.0), rand_prune_(rand_prune) {
130  ResizeAccumulators(model, flags, have_spk_vecs);
131  }

◆ ~MleAmSgmm2Accs()

Definition at line 1945 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::gamma_s_, KALDI_ERR, and VectorBase< Real >::Sum().

1945  {
1946  if (gamma_s_.Sum() != 0.0)
1947  KALDI_ERR << "In destructor of MleAmSgmm2Accs: detected that you forgot to "
1948  "call CommitStatsForSpk()";
1949 }
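
The destructor check above enforces a simple contract: per-speaker counts must be folded into the global statistics before the object is destroyed. A minimal, self-contained sketch of that contract follows. `SpeakerStatsSketch` and all of its members are hypothetical stand-ins, not Kaldi types, and the "global statistics" are reduced to a single running total.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the per-speaker part of the accumulator.
class SpeakerStatsSketch {
 public:
  explicit SpeakerStatsSketch(int num_gauss) : gamma_s_(num_gauss, 0.0) {}

  // Add occupancy for Gaussian i while processing the current speaker.
  void AddCount(int i, double count) { gamma_s_.at(i) += count; }

  // Stand-in for CommitStatsForSpk(): fold the pending counts into the
  // global stats (here just a running total), then clear them.
  void Commit() {
    committed_ += PendingCount();
    gamma_s_.assign(gamma_s_.size(), 0.0);
  }

  // Sum of the uncommitted per-speaker counts.
  double PendingCount() const {
    return std::accumulate(gamma_s_.begin(), gamma_s_.end(), 0.0);
  }

  // Mirrors the condition the destructor tests before raising an error.
  bool SafeToDestroy() const { return PendingCount() == 0.0; }

  double Committed() const { return committed_; }

 private:
  std::vector<double> gamma_s_;  // per-speaker occupancies, one per Gaussian
  double committed_ = 0.0;       // simplified global statistic
};
```

In this sketch, `SafeToDestroy()` becomes false as soon as any count is added and true again after `Commit()`, which is the state the real destructor expects.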

Member Function Documentation

◆ Accumulate()

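BaseFloat Accumulate ( const AmSgmm2 &  model,
const Sgmm2PerFrameDerivedVars &  frame_vars,
int32  pdf_index,
BaseFloat  weight,
Sgmm2PerSpkDerivedVars *  spk_vars 
)

Returns the log-likelihood of the frame for the given pdf, as computed by AmSgmm2::ComponentPosteriors(). The returned value is not scaled by weight. In the listing below, j2 is the definition's name for pdf_index.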

Definition at line 471 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::AccumulateFromPosteriors(), AmSgmm2::ComponentPosteriors(), count, MatrixBase< Real >::Scale(), and MleAmSgmm2Accs::total_like_.

Referenced by main().

475  {
476  // Calculate Gaussian posteriors and collect statistics
477  Matrix<BaseFloat> posteriors;
478  BaseFloat log_like = model.ComponentPosteriors(frame_vars, j2, spk_vars, &posteriors);
479  posteriors.Scale(weight);
480  BaseFloat count = AccumulateFromPosteriors(model, frame_vars, posteriors,
481  j2, spk_vars);
482  // Note: total_frames_ is incremented in AccumulateFromPosteriors().
483  total_like_ += count * log_like;
484  return log_like;
485 }
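
The bookkeeping in the listing above can be sketched without any Kaldi types. The version below is a simplified illustration: `TotalsSketch` and `AccumulateSketch` are hypothetical names, the posteriors are a flat vector rather than a matrix, and pruning is omitted, so the count is simply the sum of the scaled posteriors.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for the two scalar totals kept by the accumulator.
struct TotalsSketch {
  double total_frames = 0.0;
  double total_like = 0.0;
};

// Scales the posteriors by `weight`, sums them into a count, and updates the
// totals. Returns the unweighted log-likelihood, as Accumulate() does.
double AccumulateSketch(std::vector<double> posteriors, double log_like,
                        double weight, TotalsSketch* totals) {
  double count = 0.0;
  for (double& p : posteriors) {
    p *= weight;  // corresponds to posteriors.Scale(weight)
    count += p;   // no random pruning in this sketch
  }
  totals->total_frames += count;
  totals->total_like += count * log_like;
  return log_like;
}
```

Note that the return value is the per-frame log-likelihood, while `total_like` grows by the count-weighted log-likelihood.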

◆ AccumulateFromPosteriors()

BaseFloat AccumulateFromPosteriors ( const AmSgmm2 &  model,
const Sgmm2PerFrameDerivedVars &  frame_vars,
const Matrix< BaseFloat > &  posteriors,
int32  pdf_index,
Sgmm2PerSpkDerivedVars *  spk_vars 
)
)

Returns count accumulated (may differ from posteriors.Sum() due to weight pruning).

Definition at line 487 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, MleAmSgmm2Accs::a_s_, VectorBase< Real >::AddVec(), Sgmm2PerSpkDerivedVars::b_is, VectorBase< Real >::Dim(), MleAmSgmm2Accs::feature_dim_, MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, AmSgmm2::GetDjms(), AmSgmm2::GetSubstateMean(), Sgmm2PerFrameDerivedVars::gselect, rnnlm::i, KALDI_ASSERT, AmSgmm2::NumSubstatesForGroup(), AmSgmm2::Pdf2Group(), MleAmSgmm2Accs::rand_prune_, kaldi::RandPrune(), MatrixBase< Real >::Row(), MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, MleAmSgmm2Accs::total_frames_, AmSgmm2::v_, Sgmm2PerSpkDerivedVars::v_s, AmSgmm2::w_jmi_, Sgmm2PerFrameDerivedVars::xt, Sgmm2PerFrameDerivedVars::xti, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, MleAmSgmm2Accs::Z_, and Sgmm2PerFrameDerivedVars::zti.

Referenced by MleAmSgmm2Accs::Accumulate(), and main().

492  {
493  double tot_count = 0.0;
494  const vector<int32> &gselect = frame_vars.gselect;
495  // Intermediate variables
496  Vector<BaseFloat> gammat(gselect.size()), // sum of gammas over mix-weight.
497  a_is_part(gselect.size()); //
498  Vector<BaseFloat> xt_jmi(feature_dim_), mu_jmi(feature_dim_),
499  zt_jmi(spk_space_dim_);
500 
501  int32 j1 = model.Pdf2Group(j2);
502  int32 num_substates = model.NumSubstatesForGroup(j1);
503 
504  for (int32 m = 0; m < num_substates; m++) {
505  BaseFloat d_jms = model.GetDjms(j1, m, spk_vars);
506  BaseFloat gammat_jm = 0.0;
507  for (int32 ki = 0; ki < static_cast<int32>(gselect.size()); ki++) {
508  int32 i = gselect[ki];
509 
510  // Eq. (39): gamma_{jmi}(t) = p (j, m, i|t)
511  BaseFloat gammat_jmi = RandPrune(posteriors(ki, m), rand_prune_);
512  if (gammat_jmi == 0.0) continue;
513  gammat(ki) += gammat_jmi;
514  if (gamma_s_.Dim() != 0)
515  gamma_s_(i) += gammat_jmi;
516  gammat_jm += gammat_jmi;
517 
518  // Accumulate statistics for non-zero gaussian posteriors
519  tot_count += gammat_jmi;
520  if (!gamma_.empty()) {
521  // Eq. (40): gamma_{jmi} = \sum_t gamma_{jmi}(t)
522  gamma_[j1](m, i) += gammat_jmi;
523  }
524  if (!y_.empty()) {
525  // Eq. (41): y_{jm} = \sum_{t, i} \gamma_{jmi}(t) z_{i}(t)
526  // Suggestion: move this out of the loop over m
527  y_[j1].Row(m).AddVec(gammat_jmi, frame_vars.zti.Row(ki));
528  }
529  if (!Y_.empty()) {
530  // Eq. (42): Y_{i} = \sum_{t, j, m} \gamma_{jmi}(t) x_{i}(t) v_{jm}^T
531  Y_[i].AddVecVec(gammat_jmi, frame_vars.xti.Row(ki),
532  model.v_[j1].Row(m));
533  }
534  // Accumulate for speaker projections
535  if (!Z_.empty()) {
537  // Eq. (43): x_{jmi}(t) = x_k(t) - M{i} v_{jm}
538  model.GetSubstateMean(j1, m, i, &mu_jmi);
539  xt_jmi.CopyFromVec(frame_vars.xt);
540  xt_jmi.AddVec(-1.0, mu_jmi);
541  // Eq. (44): Z_{i} = \sum_{t, j, m} \gamma_{jmi}(t) x_{jmi}(t) v^{s}'
542  if (spk_vars->v_s.Dim() != 0) // interpret empty v_s as zero.
543  Z_[i].AddVecVec(gammat_jmi, xt_jmi, spk_vars->v_s);
544  // Eq. (49): \gamma_{i}^{(s)} = \sum_{t\in\Tau(s), j, m} gamma_{jmi}
545  // Will be used when you call CommitStatsForSpk(), to update R_.
546  }
547  } // loop over selected Gaussians
548  if (gammat_jm != 0.0) {
549  if (!a_.empty()) { // SSGMM code.
550  KALDI_ASSERT(d_jms > 0);
551  // below is eq. 40 in the MSR techreport. Caution: there
552  // was an error in the original techreport. The index i
553  // in the summation and the quantity \gamma_{jmi}^{(t)}
554  // should be differently named, e.g. i'.
555  a_[j1].Row(m).AddVec(gammat_jm / d_jms, spk_vars->b_is);
556  }
557  if (a_s_.Dim() != 0) { // [SSGMM]
558  KALDI_ASSERT(d_jms > 0);
559  KALDI_ASSERT(!model.w_jmi_.empty());
560  a_s_.AddVec(gammat_jm / d_jms, model.w_jmi_[j1].Row(m));
561  }
562  if (!gamma_c_.empty())
563  gamma_c_[j2](m) += gammat_jm;
564  }
565  } // loop over substates
566 
567  if (!S_.empty()) {
568  for (int32 ki = 0; ki < static_cast<int32>(gselect.size()); ki++) {
569  // Eq. (47): S_{i} = \sum_{t, j, m} \gamma_{jmi}(t) x_{i}(t) x_{i}(t)^T
570  if (gammat(ki) != 0.0) {
571  int32 i = gselect[ki];
572  S_[i].AddVec2(gammat(ki), frame_vars.xti.Row(ki));
573  }
574  }
575  }
576  total_frames_ += tot_count;
577  return tot_count;
578 }
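
The returned count can differ from posteriors.Sum() because each posterior passes through RandPrune() before it is accumulated. The sketch below shows one common form of randomized pruning and is an assumption about the behavior, not a copy of kaldi::RandPrune(): values at or above the threshold pass through unchanged, and smaller nonzero values are rounded either up to the threshold or down to zero so that the expected value is preserved. `RandPruneSketch` and its use of `std::mt19937` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Randomized pruning sketch. A nonzero value below `thresh` becomes `thresh`
// with probability |post| / thresh and 0 otherwise, keeping the expectation
// equal to `post`. Larger values and exact zeros are returned unchanged.
double RandPruneSketch(double post, double thresh, std::mt19937* rng) {
  if (post == 0.0 || std::fabs(post) >= thresh) return post;
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  double sign = (post >= 0.0 ? 1.0 : -1.0);
  bool keep = unif(*rng) <= std::fabs(post) / thresh;
  return sign * (keep ? thresh : 0.0);
}
```

With the default rand_prune of 1.0e-05, only very small posteriors are affected, so the accumulated count stays close to the sum of the inputs while tiny updates are mostly skipped.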

◆ Check()

void Check ( const AmSgmm2 &  model,
bool  show_properties = true 
) const

Checks the various accumulators for correct sizes given a model.

If any accumulator has the wrong size, an assertion fails. When show_properties is true, the dimensions and the presence or absence of each accumulator are logged. Intended for use after accumulators have been read from a file.

Definition at line 213 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, kaldi::ApproxEqual(), VectorBase< Real >::Dim(), MleAmSgmm2Accs::feature_dim_, AmSgmm2::FeatureDim(), MatrixBase< Real >::FrobeniusNorm(), MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, rnnlm::i, VectorBase< Real >::IsZero(), KALDI_ASSERT, KALDI_ERR, KALDI_LOG, KALDI_WARN, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, MatrixBase< Real >::NumCols(), AmSgmm2::NumGauss(), AmSgmm2::NumGroups(), AmSgmm2::NumPdfs(), MatrixBase< Real >::NumRows(), AmSgmm2::NumSubstatesForGroup(), AmSgmm2::NumSubstatesForPdf(), MleAmSgmm2Accs::phn_space_dim_, AmSgmm2::PhoneSpaceDim(), MleAmSgmm2Accs::R_, MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, AmSgmm2::SpkSpaceDim(), MleAmSgmm2Accs::t_, MleAmSgmm2Accs::U_, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main(), and TestSgmm2AccsIO().

214  {
215  if (show_properties)
216  KALDI_LOG << "Sgmm2PdfModel: J1 = " << num_groups_ << ", J2 = "
217  << num_pdfs_ << ", D = " << feature_dim_ << ", S = "
218  << phn_space_dim_ << ", T = " << spk_space_dim_ << ", I = "
219  << num_gaussians_;
220 
221  KALDI_ASSERT(num_pdfs_ == model.NumPdfs() && num_pdfs_ > 0);
222  KALDI_ASSERT(num_groups_ == model.NumGroups() && num_groups_ > 0);
223  KALDI_ASSERT(num_gaussians_ == model.NumGauss() && num_gaussians_ > 0);
224  KALDI_ASSERT(feature_dim_ == model.FeatureDim() && feature_dim_ > 0);
225  KALDI_ASSERT(phn_space_dim_ == model.PhoneSpaceDim() && phn_space_dim_ > 0);
226  KALDI_ASSERT(spk_space_dim_ == model.SpkSpaceDim());
227 
228  std::ostringstream debug_str;
229 
230  if (Y_.size() == 0) {
231  debug_str << "Y: no. ";
232  } else {
233  KALDI_ASSERT(gamma_.size() != 0);
234  KALDI_ASSERT(Y_.size() == static_cast<size_t>(num_gaussians_));
235  bool nz = false;
236  for (int32 i = 0; i < num_gaussians_; i++) {
237  KALDI_ASSERT(Y_[i].NumRows() == feature_dim_ &&
238  Y_[i].NumCols() == phn_space_dim_);
239  if (!nz && Y_[i](0, 0) != 0) { nz = true; }
240  }
241  debug_str << "Y: yes, " << string(nz ? "nonzero. " : "zero. ");
242  }
243 
244  if (Z_.size() == 0) {
245  KALDI_ASSERT(R_.size() == 0);
246  debug_str << "Z, R: no. ";
247  } else {
249  KALDI_ASSERT(Z_.size() == static_cast<size_t>(num_gaussians_));
250  KALDI_ASSERT(R_.size() == static_cast<size_t>(num_gaussians_));
251  bool Z_nz = false, R_nz = false;
252  for (int32 i = 0; i < num_gaussians_; i++) {
253  KALDI_ASSERT(Z_[i].NumRows() == feature_dim_ &&
254  Z_[i].NumCols() == spk_space_dim_);
255  KALDI_ASSERT(R_[i].NumRows() == spk_space_dim_);
256  if (!Z_nz && Z_[i](0, 0) != 0) { Z_nz = true; }
257  if (!R_nz && R_[i](0, 0) != 0) { R_nz = true; }
258  }
259  bool gamma_s_nz = !gamma_s_.IsZero();
260  debug_str << "Z: yes, " << string(Z_nz ? "nonzero. " : "zero. ");
261  debug_str << "R: yes, " << string(R_nz ? "nonzero. " : "zero. ");
262  debug_str << "gamma_s: yes, " << string(gamma_s_nz ? "nonzero. " : "zero. ");
263  }
264 
265  if (S_.size() == 0) {
266  debug_str << "S: no. ";
267  } else {
268  KALDI_ASSERT(gamma_.size() != 0);
269  bool S_nz = false;
270  KALDI_ASSERT(S_.size() == static_cast<size_t>(num_gaussians_));
271  for (int32 i = 0; i < num_gaussians_; i++) {
272  KALDI_ASSERT(S_[i].NumRows() == feature_dim_);
273  if (!S_nz && S_[i](0, 0) != 0) { S_nz = true; }
274  }
275  debug_str << "S: yes, " << string(S_nz ? "nonzero. " : "zero. ");
276  }
277 
278  if (y_.size() == 0) {
279  debug_str << "y: no. ";
280  } else {
281  KALDI_ASSERT(gamma_.size() != 0);
282  bool nz = false;
283  KALDI_ASSERT(y_.size() == static_cast<size_t>(num_groups_));
284  for (int32 j1 = 0; j1 < num_groups_; j1++) {
285  KALDI_ASSERT(y_[j1].NumRows() == model.NumSubstatesForGroup(j1));
286  KALDI_ASSERT(y_[j1].NumCols() == phn_space_dim_);
287  if (!nz && y_[j1](0, 0) != 0) { nz = true; }
288  }
289  debug_str << "y: yes, " << string(nz ? "nonzero. " : "zero. ");
290  }
291 
292  if (a_.size() == 0) {
293  debug_str << "a: no. ";
294  } else {
295  debug_str << "a: yes. ";
296  bool nz = false;
297  KALDI_ASSERT(a_.size() == static_cast<size_t>(num_groups_));
298  for (int32 j1 = 0; j1 < num_groups_; j1++) {
299  KALDI_ASSERT(a_[j1].NumRows() == model.NumSubstatesForGroup(j1) &&
300  a_[j1].NumCols() == num_gaussians_);
301  if (!nz && a_[j1].Sum() != 0) nz = true;
302  }
303  debug_str << "a: yes, " << string(nz ? "nonzero. " : "zero. "); // TODO: take out "string"
304  }
305 
306  double tot_gamma = 0.0;
307  if (gamma_.size() == 0) {
308  debug_str << "gamma: no. ";
309  } else {
310  debug_str << "gamma: yes. ";
311  KALDI_ASSERT(gamma_.size() == static_cast<size_t>(num_groups_));
312  for (int32 j1 = 0; j1 < num_groups_; j1++) {
313  KALDI_ASSERT(gamma_[j1].NumRows() == model.NumSubstatesForGroup(j1) &&
314  gamma_[j1].NumCols() == num_gaussians_);
315  tot_gamma += gamma_[j1].Sum();
316  }
317  bool nz = (tot_gamma != 0.0);
318  KALDI_ASSERT(gamma_c_.size() == num_pdfs_ && "gamma_ set up but not gamma_c_.");
319  debug_str << "gamma: yes, " << string(nz ? "nonzero. " : "zero. ");
320  }
321 
322  if (gamma_c_.size() == 0) {
323  KALDI_ERR << "gamma_c_ not set up."; // required for all accs.
324  } else {
325  KALDI_ASSERT(gamma_c_.size() == num_pdfs_);
326  double tot_gamma_c = 0.0;
327  for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
328  KALDI_ASSERT(gamma_c_[j2].Dim() == model.NumSubstatesForPdf(j2));
329  tot_gamma_c += gamma_c_[j2].Sum();
330  }
331  bool nz = (tot_gamma_c != 0.0);
332  debug_str << "gamma_c: yes, " << string(nz ? "nonzero. " : "zero. ");
333  if (!gamma_.empty() && !ApproxEqual(tot_gamma_c, tot_gamma))
334  KALDI_WARN << "Counts from gamma and gamma_c differ "
335  << tot_gamma << " vs. " << tot_gamma_c;
336  }
337 
338  if (t_.NumRows() == 0) {
339  debug_str << "t: no. ";
340  } else {
341  KALDI_ASSERT(t_.NumRows() == num_gaussians_ &&
342  t_.NumCols() == spk_space_dim_);
343  KALDI_ASSERT(!U_.empty()); // t and U are used together.
344  bool nz = (t_.FrobeniusNorm() != 0);
345  debug_str << "t: yes, " << string(nz ? "nonzero. " : "zero. ");
346  }
347 
348  if (U_.size() == 0) {
349  debug_str << "U: no. ";
350  } else {
351  bool nz = false;
352  KALDI_ASSERT(U_.size() == num_gaussians_);
353  for (int32 i = 0; i < num_gaussians_; i++) {
354  if (!nz && U_[i].FrobeniusNorm() != 0) nz = true;
355  KALDI_ASSERT(U_[i].NumRows() == spk_space_dim_);
356  }
357  KALDI_ASSERT(t_.NumRows() != 0); // t and U are used together.
358  debug_str << "U: yes, " << string(nz ? "nonzero. " : "zero. ");
359  }
360 
361  if (show_properties)
362  KALDI_LOG << "Subspace GMM model properties: " << debug_str.str();
363 }
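
Besides size checks, the listing above compares the total count in gamma_ with the total in gamma_c_ and warns when they disagree. A self-contained sketch of that comparison follows. The helper names are hypothetical, nested `std::vector`s stand in for the Kaldi matrix and vector types, and the tolerance rule is assumed to be `abs(a - b) <= tol * (abs(a) + abs(b))` with a default relative tolerance of 0.001.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Relative-tolerance comparison (assumed rule, see the lead-in).
bool ApproxEqualSketch(double a, double b, double rel_tol = 0.001) {
  return std::fabs(a - b) <= rel_tol * (std::fabs(a) + std::fabs(b));
}

// Sums every element of a ragged table of statistics.
double TotalCount(const std::vector<std::vector<double>>& stats) {
  double total = 0.0;
  for (const auto& row : stats)
    for (double v : row) total += v;
  return total;
}

// True when the two sets of occupancies carry approximately the same count.
bool CountsAgree(const std::vector<std::vector<double>>& gamma,
                 const std::vector<std::vector<double>>& gamma_c) {
  return ApproxEqualSketch(TotalCount(gamma), TotalCount(gamma_c));
}
```

A mismatch here only produces a warning in the listing, since the two totals are accumulated separately and can differ slightly.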

◆ CommitStatsForSpk()

void CommitStatsForSpk ( const AmSgmm2 &  model,
const Sgmm2PerSpkDerivedVars &  spk_vars 
)

Accumulates global stats for the current speaker (if applicable).

If flags contains kSgmmSpeakerProjections (N) or kSgmmSpeakerWeightProjections (u), this must be called after all of a speaker's data has been accumulated. The destructor raises an error if per-speaker counts are still pending.

Definition at line 580 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_s_, VectorBase< Real >::AddVecVec(), MatrixBase< Real >::AddVecVec(), Sgmm2PerSpkDerivedVars::b_is, VectorBase< Real >::Dim(), MleAmSgmm2Accs::gamma_s_, rnnlm::i, VectorBase< Real >::IsZero(), MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::R_, VectorBase< Real >::SetZero(), MleAmSgmm2Accs::t_, MleAmSgmm2Accs::U_, and Sgmm2PerSpkDerivedVars::v_s.

Referenced by main().

581  {
582  const VectorBase<BaseFloat> &v_s = spk_vars.v_s;
583  if (v_s.Dim() != 0 && !v_s.IsZero() && !R_.empty()) {
584  for (int32 i = 0; i < num_gaussians_; i++)
585  // Accumulate Statistics R_{ki}
586  if (gamma_s_(i) != 0.0)
587  R_[i].AddVec2(gamma_s_(i),
588  Vector<double>(v_s));
589  }
590  if (a_s_.Dim() != 0) {
591  Vector<BaseFloat> tmp(gamma_s_);
592  // tmp(i) = gamma_s^{(i)} - a_i^{(s)} b_i^{(s)}.
593  tmp.AddVecVec(-1.0, Vector<BaseFloat>(a_s_), spk_vars.b_is, 1.0);
594  t_.AddVecVec(1.0, tmp, v_s); // eq. 53 of techreport.
595  for (int32 i = 0; i < num_gaussians_; i++) {
596  U_[i].AddVec2(a_s_(i) * spk_vars.b_is(i),
597  Vector<double>(v_s)); // eq. 54 of techreport.
598  }
599  }
600  gamma_s_.SetZero();
601  a_s_.SetZero();
602 }
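
The first half of the listing above adds a weighted outer product of the speaker vector to each R_{i} and then clears the per-speaker counts. The sketch below illustrates that step under simplifying assumptions: `CommitSpeakerSketch` and `Mat` are hypothetical, a full dense matrix replaces the symmetric SpMatrix, the all-zero check on v_s is omitted, and the SSGMM branch (a_s_, t_, U_) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// For each Gaussian i with a nonzero per-speaker count, performs
// R[i] += gamma_s[i] * v_s * v_s^T, then resets gamma_s to zero.
void CommitSpeakerSketch(const std::vector<double>& v_s,
                         std::vector<double>* gamma_s,
                         std::vector<Mat>* R) {
  assert(gamma_s->size() == R->size());
  if (!v_s.empty()) {  // an empty speaker vector contributes nothing
    for (std::size_t i = 0; i < R->size(); ++i) {
      const double g = (*gamma_s)[i];
      if (g == 0.0) continue;
      for (std::size_t r = 0; r < v_s.size(); ++r)
        for (std::size_t c = 0; c < v_s.size(); ++c)
          (*R)[i][r][c] += g * v_s[r] * v_s[c];
    }
  }
  gamma_s->assign(gamma_s->size(), 0.0);  // ready for the next speaker
}
```

Deferring this update to the end of the speaker means the outer product is computed once per Gaussian per speaker instead of once per frame.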

◆ FeatureDim()

int32 FeatureDim ( ) const
inline

Definition at line 173 of file estimate-am-sgmm2.h.

173 { return feature_dim_; }

◆ GetStateOccupancies()

void GetStateOccupancies ( Vector< BaseFloat > *  occs) const

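Writes the occupancy of each pdf to occs: occs is resized to the number of pdfs, and entry j2 is the sum of gamma_c_[j2] over its substates.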

Definition at line 604 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::gamma_c_, and Vector< Real >::Resize().

Referenced by main().

604  {
605  int32 J2 = gamma_c_.size();
606  occs->Resize(J2);
607  for (int32 j2 = 0; j2 < J2; j2++) {
608  (*occs)(j2) = gamma_c_[j2].Sum();
609  }
610 }
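
The same reduction can be written with standard containers. This is a minimal sketch in which a vector of vectors stands in for gamma_c_ and `StateOccupanciesSketch` is a hypothetical name; unlike the member function, it returns the result instead of filling an output argument.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns one occupancy per pdf: the sum of that pdf's substate counts.
std::vector<double> StateOccupanciesSketch(
    const std::vector<std::vector<double>>& gamma_c) {
  std::vector<double> occs(gamma_c.size(), 0.0);
  for (std::size_t j2 = 0; j2 < gamma_c.size(); ++j2)
    occs[j2] = std::accumulate(gamma_c[j2].begin(), gamma_c[j2].end(), 0.0);
  return occs;
}
```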

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( MleAmSgmm2Accs  )
private

◆ NumGauss()

int32 NumGauss ( ) const
inline

Definition at line 177 of file estimate-am-sgmm2.h.

177 { return num_gaussians_; }

◆ NumGroups()

int32 NumGroups ( ) const
inline

Definition at line 176 of file estimate-am-sgmm2.h.

176 { return num_groups_; } // returns J1

◆ NumPdfs()

int32 NumPdfs ( ) const
inline

Definition at line 175 of file estimate-am-sgmm2.h.

175 { return num_pdfs_; } // returns J2

◆ PhoneSpaceDim()

int32 PhoneSpaceDim ( ) const
inline

Definition at line 174 of file estimate-am-sgmm2.h.

174 { return phn_space_dim_; }

◆ Read()

void Read ( std::istream &  in_stream,
bool  binary,
bool  add 
)

Definition at line 122 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, VectorBase< Real >::Dim(), kaldi::ExpectToken(), MleAmSgmm2Accs::feature_dim_, MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, rnnlm::i, KALDI_ERR, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, MleAmSgmm2Accs::phn_space_dim_, MleAmSgmm2Accs::R_, Matrix< Real >::Read(), kaldi::ReadBasicType(), kaldi::ReadToken(), Vector< Real >::Resize(), MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, MleAmSgmm2Accs::t_, MleAmSgmm2Accs::total_frames_, MleAmSgmm2Accs::total_like_, MleAmSgmm2Accs::U_, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main(), and TestSgmm2AccsIO().

123  {
124  ExpectToken(in_stream, binary, "<SGMMACCS>");
125  ExpectToken(in_stream, binary, "<NUMPDFS>");
126  ReadBasicType(in_stream, binary, &num_pdfs_);
127  ExpectToken(in_stream, binary, "<NUMGROUPS>");
128  ReadBasicType(in_stream, binary, &num_groups_);
129  ExpectToken(in_stream, binary, "<NUMGaussians>");
130  ReadBasicType(in_stream, binary, &num_gaussians_);
131  ExpectToken(in_stream, binary, "<FEATUREDIM>");
132  ReadBasicType(in_stream, binary, &feature_dim_);
133  ExpectToken(in_stream, binary, "<PHONESPACEDIM>");
134  ReadBasicType(in_stream, binary, &phn_space_dim_);
135  ExpectToken(in_stream, binary, "<SPKSPACEDIM>");
136  ReadBasicType(in_stream, binary, &spk_space_dim_);
137 
138  string token;
139  ReadToken(in_stream, binary, &token);
140 
141  while (token != "</SGMMACCS>") {
142  if (token == "<Y>") {
143  Y_.resize(num_gaussians_);
144  for (size_t i = 0; i < Y_.size(); i++) {
145  Y_[i].Read(in_stream, binary, add);
146  }
147  } else if (token == "<Z>") {
148  Z_.resize(num_gaussians_);
149  for (size_t i = 0; i < Z_.size(); i++) {
150  Z_[i].Read(in_stream, binary, add);
151  }
152  } else if (token == "<R>") {
153  R_.resize(num_gaussians_);
155  for (size_t i = 0; i < R_.size(); i++) {
156  R_[i].Read(in_stream, binary, add);
157  }
158  } else if (token == "<S>") {
159  S_.resize(num_gaussians_);
160  for (size_t i = 0; i < S_.size(); i++) {
161  S_[i].Read(in_stream, binary, add);
162  }
163  } else if (token == "<y>") {
164  y_.resize(num_groups_);
165  for (int32 j1 = 0; j1 < num_groups_; j1++) {
166  y_[j1].Read(in_stream, binary, add);
167  }
168  } else if (token == "<gamma>") {
169  gamma_.resize(num_groups_);
170  for (int32 j1 = 0; j1 < num_groups_; j1++) {
171  gamma_[j1].Read(in_stream, binary, add);
172  }
173  // Don't read gamma_s, it's just a temporary variable and
174  // not part of the permanent (non-speaker-specific) accs.
175  } else if (token == "<a>") {
176  a_.resize(num_groups_);
177  for (int32 j1 = 0; j1 < num_groups_; j1++) {
178  a_[j1].Read(in_stream, binary, add);
179  }
180  } else if (token == "<gamma_c>") {
181  gamma_c_.resize(num_pdfs_);
182  for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
183  gamma_c_[j2].Read(in_stream, binary, add);
184  }
185  } else if (token == "<t>") {
186  t_.Read(in_stream, binary, add);
187  } else if (token == "<U>") {
188  U_.resize(num_gaussians_);
189  for (int32 i = 0; i < num_gaussians_; i++) {
190  U_[i].Read(in_stream, binary, add);
191  }
192  } else if (token == "<total_like>") {
193  double total_like;
194  ReadBasicType(in_stream, binary, &total_like);
195  if (add)
196  total_like_ += total_like;
197  else
198  total_like_ = total_like;
199  } else if (token == "<total_frames>") {
200  double total_frames;
201  ReadBasicType(in_stream, binary, &total_frames);
202  if (add)
203  total_frames_ += total_frames;
204  else
205  total_frames_ = total_frames;
206  } else {
207  KALDI_ERR << "Unexpected token '" << token << "' in model file ";
208  }
209  ReadToken(in_stream, binary, &token);
210  }
211 }
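
The add argument decides whether values read from the stream replace the current statistics or are summed into them, which is how accumulators from several jobs can be merged. The sketch below illustrates that behavior for the two scalar totals only. It assumes a made-up whitespace-separated text layout, not Kaldi's actual on-disk format, and `ScalarStatsSketch` and `ReadScalarsSketch` are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical holder for the two scalar totals.
struct ScalarStatsSketch {
  double total_like = 0.0;
  double total_frames = 0.0;
};

// Reads "<token> value" pairs until "</SGMMACCS>". With add == true the
// values are summed into *stats; otherwise they overwrite it. Returns false
// on an unknown token or a truncated stream.
bool ReadScalarsSketch(std::istream& in, bool add, ScalarStatsSketch* stats) {
  std::string token;
  while (in >> token) {
    if (token == "</SGMMACCS>") return true;
    double value = 0.0;
    if (!(in >> value)) return false;
    if (token == "<total_like>") {
      stats->total_like = (add ? stats->total_like : 0.0) + value;
    } else if (token == "<total_frames>") {
      stats->total_frames = (add ? stats->total_frames : 0.0) + value;
    } else {
      return false;  // the listing raises an error for unexpected tokens
    }
  }
  return false;  // closing token never seen
}
```

Reading the same stream twice, first with add set to false and then with add set to true, doubles both totals, which is the merging behavior the add flag provides.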

◆ ResizeAccumulators()

void ResizeAccumulators ( const AmSgmm2 &  model,
SgmmUpdateFlagsType  flags,
bool  have_spk_vecs 
)

Resizes the accumulators to the correct sizes given the model.

The flags argument controls which accumulators to resize.

Definition at line 365 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, MleAmSgmm2Accs::a_s_, MleAmSgmm2Accs::feature_dim_, AmSgmm2::FeatureDim(), MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, AmSgmm2::HasSpeakerDependentWeights(), rnnlm::i, KALDI_ASSERT, KALDI_ERR, kaldi::kSgmmCovarianceMatrix, kaldi::kSgmmPhoneProjections, kaldi::kSgmmPhoneVectors, kaldi::kSgmmPhoneWeightProjections, kaldi::kSgmmSpeakerProjections, kaldi::kSgmmSpeakerWeightProjections, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, AmSgmm2::NumGauss(), AmSgmm2::NumGroups(), AmSgmm2::NumPdfs(), AmSgmm2::NumSubstatesForGroup(), AmSgmm2::NumSubstatesForPdf(), MleAmSgmm2Accs::phn_space_dim_, AmSgmm2::PhoneSpaceDim(), MleAmSgmm2Accs::R_, Vector< Real >::Resize(), Matrix< Real >::Resize(), MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, AmSgmm2::SpkSpaceDim(), MleAmSgmm2Accs::t_, MleAmSgmm2Accs::total_frames_, MleAmSgmm2Accs::total_like_, MleAmSgmm2Accs::U_, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main().

void MleAmSgmm2Accs::ResizeAccumulators(const AmSgmm2 &model,
                                        SgmmUpdateFlagsType flags,
                                        bool have_spk_vecs) {
  num_pdfs_ = model.NumPdfs();
  num_groups_ = model.NumGroups();
  num_gaussians_ = model.NumGauss();
  feature_dim_ = model.FeatureDim();
  phn_space_dim_ = model.PhoneSpaceDim();
  spk_space_dim_ = model.SpkSpaceDim();
  total_frames_ = total_like_ = 0;

  if (flags & (kSgmmPhoneProjections | kSgmmCovarianceMatrix)) {
    Y_.resize(num_gaussians_);
    for (int32 i = 0; i < num_gaussians_; i++) {
      Y_[i].Resize(feature_dim_, phn_space_dim_);
    }
  } else {
    Y_.clear();
  }

  if (flags & (kSgmmSpeakerProjections | kSgmmSpeakerWeightProjections)) {
    gamma_s_.Resize(num_gaussians_);
  } else {
    gamma_s_.Resize(0);
  }

  if (flags & kSgmmSpeakerProjections) {
    if (spk_space_dim_ == 0) {
      KALDI_ERR << "Cannot set up accumulators for speaker projections "
                << "because speaker subspace has not been set up";
    }
    Z_.resize(num_gaussians_);
    R_.resize(num_gaussians_);
    for (int32 i = 0; i < num_gaussians_; i++) {
      Z_[i].Resize(feature_dim_, spk_space_dim_);
      R_[i].Resize(spk_space_dim_);
    }
  } else {
    Z_.clear();
    R_.clear();
  }

  if (flags & kSgmmCovarianceMatrix) {
    S_.resize(num_gaussians_);
    for (int32 i = 0; i < num_gaussians_; i++) {
      S_[i].Resize(feature_dim_);
    }
  } else {
    S_.clear();
  }

  if (flags & (kSgmmPhoneVectors | kSgmmPhoneWeightProjections |
               kSgmmCovarianceMatrix | kSgmmPhoneProjections)) {
    gamma_.resize(num_groups_);
    for (int32 j1 = 0; j1 < num_groups_; j1++) {
      gamma_[j1].Resize(model.NumSubstatesForGroup(j1), num_gaussians_);
    }
  } else {
    gamma_.clear();
  }

  if (flags & (kSgmmPhoneVectors | kSgmmPhoneWeightProjections)
      && model.HasSpeakerDependentWeights() && have_spk_vecs) {  // SSGMM code.
    a_.resize(num_groups_);
    for (int32 j1 = 0; j1 < num_groups_; j1++) {
      a_[j1].Resize(model.NumSubstatesForGroup(j1),
                    num_gaussians_);
    }
  } else {
    a_.clear();
  }

  if (flags & kSgmmSpeakerWeightProjections) {
    KALDI_ASSERT(model.HasSpeakerDependentWeights() &&
                 "remove the flag \"u\" if you don't have u set up.");
    a_s_.Resize(num_gaussians_);
    t_.Resize(num_gaussians_, spk_space_dim_);
    U_.resize(num_gaussians_);
    for (int32 i = 0; i < num_gaussians_; i++)
      U_[i].Resize(spk_space_dim_);
  } else {
    a_s_.Resize(0);
    t_.Resize(0, 0);
    U_.resize(0);
  }

  if (true) {  // always set up gamma_c_; it's nominally for
    // estimation of substate weights, but it's also required when
    // GetStateOccupancies() is called.
    gamma_c_.resize(num_pdfs_);
    for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
      gamma_c_[j2].Resize(model.NumSubstatesForPdf(j2));
    }
  }

  if (flags & kSgmmPhoneVectors) {
    y_.resize(num_groups_);
    for (int32 j1 = 0; j1 < num_groups_; j1++) {
      y_[j1].Resize(model.NumSubstatesForGroup(j1), phn_space_dim_);
    }
  } else {
    y_.clear();
  }
}
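
The flag tests in ResizeAccumulators() follow a common bitmask pattern: a group of statistics is allocated when the relevant bit is set and emptied otherwise. The standalone sketch below illustrates that pattern only. ToyFlags, ToyAccs and the bit values are hypothetical stand-ins; they are not the definitions from model-common.h, and the matrices are flattened into plain vectors for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the update-flag bits; the numeric values are
// illustrative and are not taken from Kaldi.
enum ToyFlags : std::uint32_t {
  kToyPhoneProjections = 0x1,
  kToyCovarianceMatrix = 0x2,
  kToySpeakerProjections = 0x4
};

struct ToyAccs {
  std::vector<std::vector<double>> Y;  // one flattened D x S block per Gaussian
  std::vector<std::vector<double>> Z;  // one flattened D x T block per Gaussian

  // Allocate only the statistics selected by `flags`; empty the rest.
  void Resize(std::uint32_t flags, int num_gauss, int feat_dim,
              int phn_dim, int spk_dim) {
    if (flags & (kToyPhoneProjections | kToyCovarianceMatrix)) {
      Y.assign(num_gauss, std::vector<double>(feat_dim * phn_dim, 0.0));
    } else {
      Y.clear();
    }
    if (flags & kToySpeakerProjections) {
      // Mirrors the error check on an unset speaker subspace.
      assert(spk_dim > 0 && "speaker subspace must be set up first");
      Z.assign(num_gauss, std::vector<double>(feat_dim * spk_dim, 0.0));
    } else {
      Z.clear();
    }
  }
};
```

Calling Resize() a second time with different flags releases whatever the first call allocated, so one accumulator object can be reused across update configurations.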

◆ Write()

void Write ( std::ostream &  out_stream,
bool  binary 
) const

Definition at line 34 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, MleAmSgmm2Accs::feature_dim_, MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, KALDI_ASSERT, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, MatrixBase< Real >::NumRows(), MleAmSgmm2Accs::phn_space_dim_, MleAmSgmm2Accs::R_, MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, MleAmSgmm2Accs::t_, MleAmSgmm2Accs::total_frames_, MleAmSgmm2Accs::total_like_, MleAmSgmm2Accs::U_, MatrixBase< Real >::Write(), kaldi::WriteBasicType(), kaldi::WriteToken(), MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main(), and TestSgmm2AccsIO().

void MleAmSgmm2Accs::Write(std::ostream &out_stream, bool binary) const {
  WriteToken(out_stream, binary, "<SGMMACCS>");
  WriteToken(out_stream, binary, "<NUMPDFS>");
  WriteBasicType(out_stream, binary, num_pdfs_);
  WriteToken(out_stream, binary, "<NUMGROUPS>");
  WriteBasicType(out_stream, binary, num_groups_);
  WriteToken(out_stream, binary, "<NUMGaussians>");
  WriteBasicType(out_stream, binary, num_gaussians_);
  WriteToken(out_stream, binary, "<FEATUREDIM>");
  WriteBasicType(out_stream, binary, feature_dim_);
  WriteToken(out_stream, binary, "<PHONESPACEDIM>");
  WriteBasicType(out_stream, binary, phn_space_dim_);
  WriteToken(out_stream, binary, "<SPKSPACEDIM>");
  WriteBasicType(out_stream, binary, spk_space_dim_);
  if (!binary) out_stream << "\n";

  if (Y_.size() != 0) {
    KALDI_ASSERT(gamma_.size() != 0);
    WriteToken(out_stream, binary, "<Y>");
    for (int32 i = 0; i < num_gaussians_; i++) {
      Matrix<BaseFloat>(Y_[i]).Write(out_stream, binary);
    }
  }
  if (Z_.size() != 0) {
    KALDI_ASSERT(R_.size() != 0);
    WriteToken(out_stream, binary, "<Z>");
    for (int32 i = 0; i < num_gaussians_; i++) {
      Matrix<BaseFloat>(Z_[i]).Write(out_stream, binary);
    }
    WriteToken(out_stream, binary, "<R>");
    for (int32 i = 0; i < num_gaussians_; i++) {
      SpMatrix<BaseFloat>(R_[i]).Write(out_stream, binary);
    }
  }
  if (S_.size() != 0) {
    KALDI_ASSERT(gamma_.size() != 0);
    WriteToken(out_stream, binary, "<S>");
    for (int32 i = 0; i < num_gaussians_; i++) {
      SpMatrix<BaseFloat>(S_[i]).Write(out_stream, binary);
    }
  }
  if (y_.size() != 0) {
    KALDI_ASSERT(gamma_.size() != 0);
    WriteToken(out_stream, binary, "<y>");
    for (int32 j1 = 0; j1 < num_groups_; j1++) {
      Matrix<BaseFloat>(y_[j1]).Write(out_stream, binary);
    }
  }
  if (gamma_.size() != 0) {  // These stats are large
    // -> write as single precision.
    WriteToken(out_stream, binary, "<gamma>");
    for (int32 j1 = 0; j1 < num_groups_; j1++) {
      Matrix<BaseFloat> gamma_j1(gamma_[j1]);
      gamma_j1.Write(out_stream, binary);
    }
  }
  if (t_.NumRows() != 0) {
    WriteToken(out_stream, binary, "<t>");
    Matrix<BaseFloat>(t_).Write(out_stream, binary);
  }
  if (U_.size() != 0) {
    WriteToken(out_stream, binary, "<U>");
    for (int32 i = 0; i < num_gaussians_; i++) {
      SpMatrix<BaseFloat>(U_[i]).Write(out_stream, binary);
    }
  }
  if (gamma_c_.size() != 0) {
    WriteToken(out_stream, binary, "<gamma_c>");
    for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
      Vector<BaseFloat>(gamma_c_[j2]).Write(out_stream, binary);
    }
  }
  if (a_.size() != 0) {
    WriteToken(out_stream, binary, "<a>");
    for (int32 j1 = 0; j1 < num_groups_; j1++) {
      Matrix<BaseFloat>(a_[j1]).Write(out_stream, binary);
    }
  }
  WriteToken(out_stream, binary, "<total_like>");
  WriteBasicType(out_stream, binary, total_like_);

  WriteToken(out_stream, binary, "<total_frames>");
  WriteBasicType(out_stream, binary, total_frames_);

  WriteToken(out_stream, binary, "</SGMMACCS>");
}
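
Write() emits a sequence of named tokens, each followed by its payload, bracketed by <SGMMACCS> and </SGMMACCS>. The sketch below shows the same token-then-value idea on a two-field header and reads it back. It is a simplified, text-mode-only illustration: ToyWriteToken, ToyWriteInt, ToyExpectToken, ToyReadInt and ToyHeader are hypothetical names, not the Kaldi I/O functions, and the binary flag is not modelled.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical text-mode helpers (assumption: tokens and values are
// separated by single spaces).
void ToyWriteToken(std::ostream &os, const std::string &token) {
  os << token << ' ';
}

void ToyWriteInt(std::ostream &os, int value) {
  os << value << ' ';
}

// Reads one whitespace-delimited token and throws if it is not `expected`.
void ToyExpectToken(std::istream &is, const std::string &expected) {
  std::string token;
  if (!(is >> token) || token != expected)
    throw std::runtime_error("expected token " + expected);
}

int ToyReadInt(std::istream &is) {
  int value = 0;
  if (!(is >> value)) throw std::runtime_error("expected integer");
  return value;
}

struct ToyHeader {
  int num_pdfs;
  int feature_dim;
};

void WriteToyHeader(std::ostream &os, const ToyHeader &h) {
  ToyWriteToken(os, "<SGMMACCS>");
  ToyWriteToken(os, "<NUMPDFS>");
  ToyWriteInt(os, h.num_pdfs);
  ToyWriteToken(os, "<FEATUREDIM>");
  ToyWriteInt(os, h.feature_dim);
  ToyWriteToken(os, "</SGMMACCS>");
}

ToyHeader ReadToyHeader(std::istream &is) {
  ToyHeader h;
  ToyExpectToken(is, "<SGMMACCS>");
  ToyExpectToken(is, "<NUMPDFS>");
  h.num_pdfs = ToyReadInt(is);
  ToyExpectToken(is, "<FEATUREDIM>");
  h.feature_dim = ToyReadInt(is);
  ToyExpectToken(is, "</SGMMACCS>");
  return h;
}
```

Because every field is announced by its token, a reader can detect a truncated or reordered stream at the first mismatch instead of silently misinterpreting the payload.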

Friends And Related Function Documentation

◆ EbwAmSgmm2Updater

friend class EbwAmSgmm2Updater
friend

Definition at line 240 of file estimate-am-sgmm2.h.

◆ MleAmSgmm2Updater

friend class MleAmSgmm2Updater
friend

Definition at line 239 of file estimate-am-sgmm2.h.

Member Data Documentation

◆ a_

std::vector< Matrix<double> > a_
private

[SSGMM] These a_{jmi} quantities are dimensionally the same as the gamma quantities.

They're needed to estimate the v_{jm} and w_i quantities in the symmetric SGMM. Dimension is [J1][#mix][I], matching gamma_.

Definition at line 200 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Updater::ComputeLogA(), EbwAmSgmm2Updater::ComputePhoneVecStats(), MleAmSgmm2Accs::Read(), MleAmSgmm2Accs::ResizeAccumulators(), MleAmSgmm2Updater::Update(), and MleAmSgmm2Accs::Write().

◆ a_s_

Vector<double> a_s_
private

[SSGMM] A per-speaker variable storing the a_i^{(s)} quantities, which are used to compute the non-speaker-specific quantities (see eqs. 53 and 54 in the techreport).

Note: class MleSgmm2SpeakerAccs has a separate variable a_s_, which holds the same quantities but is used for computing the speaker vector v^{(s)}.

Definition at line 213 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::CommitStatsForSpk(), and MleAmSgmm2Accs::ResizeAccumulators().

◆ feature_dim_

◆ gamma_

◆ gamma_c_

std::vector< Vector<double> > gamma_c_
private

Sub-state occupancies gamma_{jm}^{(c)} for each sub-state.

In the SCTM version of the SGMM, for compactness we store two separate sets of gamma statistics, one to estimate the v_{jm} quantities and one to estimate the sub-state weights c_{jm}.

Definition at line 223 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Accs::GetStateOccupancies(), MleAmSgmm2Accs::Read(), MleAmSgmm2Accs::ResizeAccumulators(), EbwAmSgmm2Updater::UpdateSubstateWeights(), MleAmSgmm2Updater::UpdateSubstateWeights(), and MleAmSgmm2Accs::Write().
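
Since gamma_c_ holds one vector of sub-state occupancies per pdf, a per-pdf occupancy can be obtained by summing each vector. The sketch below assumes that this is the relationship GetStateOccupancies() relies on; ToyStateOccupancies is a hypothetical helper on plain std::vector data, not the Kaldi implementation.

```cpp
#include <numeric>
#include <vector>

// Assumed relationship: the occupancy of pdf j2 is the sum over its
// sub-states m of gamma_c[j2][m].
std::vector<double> ToyStateOccupancies(
    const std::vector<std::vector<double>> &gamma_c) {
  std::vector<double> occs;
  occs.reserve(gamma_c.size());
  for (const std::vector<double> &substates : gamma_c) {
    occs.push_back(std::accumulate(substates.begin(), substates.end(), 0.0));
  }
  return occs;
}
```

A pdf with no sub-states simply yields an occupancy of zero.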

◆ gamma_s_

Vector<double> gamma_s_
private

gamma_{i}^{(s)}.

Per-speaker counts for each Gaussian. Dimension is [I]. Needed for the stats R_. This can be viewed as a temporary variable; it does not form part of the stats that are eventually dumped to disk.

Definition at line 228 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Accs::CommitStatsForSpk(), MleAmSgmm2Accs::Read(), and MleAmSgmm2Accs::ResizeAccumulators().
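
One way to picture the temporary role of gamma_s_: counts are collected while a speaker's frames are processed, folded into the global R_ statistics when the speaker is finished, and then reset. The sketch below assumes the commit step adds gamma_s(i) * v_s * v_s^T to each R_i, where v_s is the speaker vector. ToySpkStats is a hypothetical type that stores each R_i as a full row-major T x T block rather than a packed symmetric matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ToySpkStats {
  std::vector<double> gamma_s;         // per-speaker count per Gaussian, [I]
  std::vector<std::vector<double>> R;  // [I] blocks, each T x T, row-major

  ToySpkStats(std::size_t num_gauss, std::size_t spk_dim)
      : gamma_s(num_gauss, 0.0),
        R(num_gauss, std::vector<double>(spk_dim * spk_dim, 0.0)) {}

  // Fold the current speaker's counts into R, then reset the temporaries.
  void Commit(const std::vector<double> &v_s) {
    const std::size_t T = v_s.size();
    for (std::size_t i = 0; i < gamma_s.size(); i++) {
      assert(R[i].size() == T * T);
      for (std::size_t r = 0; r < T; r++)
        for (std::size_t c = 0; c < T; c++)
          R[i][r * T + c] += gamma_s[i] * v_s[r] * v_s[c];
      gamma_s[i] = 0.0;  // ready for the next speaker
    }
  }
};
```

Only R survives the commit, which is consistent with gamma_s_ not being written to disk.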

◆ num_gaussians_

◆ num_groups_

◆ num_pdfs_

◆ phn_space_dim_

◆ R_

std::vector< SpMatrix<double> > R_
private

R_{i}, quadratic term for speaker subspace estimation. Dim is [I][T][T].

◆ rand_prune_

BaseFloat rand_prune_
private

Definition at line 236 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors().
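
AccumulateFromPosteriors() is documented as returning a count that may differ from posteriors.Sum() because of weight pruning. A plausible reading is that rand_prune_ is a threshold for randomized pruning: a posterior below the threshold is either dropped or rounded up to the threshold, with probabilities chosen so that its expected value is unchanged. The sketch below illustrates that technique under this assumption; ToyRandPrune and the use of std::mt19937 are illustrative choices, not the Kaldi routine.

```cpp
#include <cmath>
#include <random>

// Randomized pruning sketch: values at or above `thresh` pass through;
// smaller values become +/-thresh with probability |post| / thresh and
// zero otherwise, which preserves the expectation.
double ToyRandPrune(double post, double thresh, std::mt19937 &rng) {
  if (post == 0.0 || std::fabs(post) >= thresh) return post;
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  const double sign = post > 0.0 ? 1.0 : -1.0;
  return unif(rng) < std::fabs(post) / thresh ? sign * thresh : 0.0;
}
```

The benefit is sparsity: most tiny posteriors become exact zeros and can be skipped during accumulation, at the cost of a small amount of added variance.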

◆ S_

std::vector< SpMatrix<double> > S_
private

S_{i}^{-}, scatter of adapted feature vectors x_{i}(t). Dim is [I][D][D].

◆ spk_space_dim_

◆ t_

Matrix<double> t_
private

[SSGMM] Each row is one of the t_i quantities in the less-exact version of the SSGMM update for the speaker weight projections.

Dimension is [I][T].

Definition at line 205 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::Check(), MleAmSgmm2Accs::CommitStatsForSpk(), MleAmSgmm2Accs::Read(), MleAmSgmm2Accs::ResizeAccumulators(), EbwAmSgmm2Updater::UpdateU(), MleAmSgmm2Updater::UpdateU(), and MleAmSgmm2Accs::Write().

◆ total_frames_

◆ total_like_

◆ U_

std::vector<SpMatrix<double> > U_
private

[SSGMM] The U_i quantities from the less-exact version of the SSGMM update for the speaker weight projections.

Dimension is [I][T][T].

Definition at line 217 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::Check(), MleAmSgmm2Accs::CommitStatsForSpk(), MleAmSgmm2Accs::Read(), MleAmSgmm2Accs::ResizeAccumulators(), EbwAmSgmm2Updater::UpdateU(), MleAmSgmm2Updater::UpdateU(), and MleAmSgmm2Accs::Write().
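
Although the dimension is quoted as [I][T][T], each U_i is a symmetric matrix, so only T(T+1)/2 distinct values need to be stored per Gaussian. The sketch below shows packed lower-triangular storage of a symmetric matrix. ToySymMatrix is a hypothetical class written for illustration and is not the SpMatrix implementation.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Symmetric matrix that stores only the lower triangle, row by row.
class ToySymMatrix {
 public:
  explicit ToySymMatrix(std::size_t dim)
      : dim_(dim), data_(dim * (dim + 1) / 2, 0.0) {}

  // Element access; (r, c) and (c, r) refer to the same stored value.
  double &operator()(std::size_t r, std::size_t c) {
    if (c > r) std::swap(r, c);
    return data_[r * (r + 1) / 2 + c];
  }

  std::size_t Dim() const { return dim_; }
  std::size_t NumStored() const { return data_.size(); }

 private:
  std::size_t dim_;
  std::vector<double> data_;
};
```

For T = 4 this stores 10 values instead of 16, and the saving approaches a factor of two as T grows.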

◆ Y_

std::vector< Matrix<double> > Y_
private

The stats which are not tied to any state.

◆ y_

std::vector< Matrix<double> > y_
private

The SGMM state specific stats.

◆ Z_

std::vector< Matrix<double> > Z_
private

Stats Z_{i} for speaker-subspace projections N. Dim is [I][D][T].

The documentation for this class was generated from the following files: