Class for the accumulators associated with the phonetic-subspace model parameters.

#include <estimate-am-sgmm2.h>

Public Member Functions
	MleAmSgmm2Accs (BaseFloat rand_prune=1.0e-05)

	MleAmSgmm2Accs (const AmSgmm2 &model, SgmmUpdateFlagsType flags, bool have_spk_vecs, BaseFloat rand_prune=1.0e-05)

	~MleAmSgmm2Accs ()

void	Read (std::istream &in_stream, bool binary, bool add)

void	Write (std::ostream &out_stream, bool binary) const

void	Check (const AmSgmm2 &model, bool show_properties=true) const
	Checks the various accumulators for correct sizes given a model.

void	ResizeAccumulators (const AmSgmm2 &model, SgmmUpdateFlagsType flags, bool have_spk_vecs)
	Resizes the accumulators to the correct sizes given the model.

BaseFloat	Accumulate (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, int32 pdf_index, BaseFloat weight, Sgmm2PerSpkDerivedVars *spk_vars)
	Returns the log-likelihood.

BaseFloat	AccumulateFromPosteriors (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, const Matrix< BaseFloat > &posteriors, int32 pdf_index, Sgmm2PerSpkDerivedVars *spk_vars)
	Returns count accumulated (may differ from posteriors.Sum() due to weight pruning).

void	CommitStatsForSpk (const AmSgmm2 &model, const Sgmm2PerSpkDerivedVars &spk_vars)
	Accumulates global stats for the current speaker (if applicable).

void	GetStateOccupancies (Vector< BaseFloat > *occs) const
	Accessors. Fills occs with the total occupancy of each pdf.

int32	FeatureDim () const

int32	PhoneSpaceDim () const

int32	NumPdfs () const

int32	NumGroups () const

int32	NumGauss () const

Private Member Functions
	KALDI_DISALLOW_COPY_AND_ASSIGN (MleAmSgmm2Accs)

Private Attributes
std::vector< Matrix< double > >	Y_
	The stats which are not tied to any state.

std::vector< Matrix< double > >	Z_
	Stats Z_{i} for speaker-subspace projections N. Dim is [I][D][T].

std::vector< SpMatrix< double > >	R_
	R_{i}, quadratic term for speaker subspace estimation. Dim is [I][T][T].

std::vector< SpMatrix< double > >	S_
	S_{i}^{-}, scatter of adapted feature vectors x_{i}(t). Dim is [I][D][D].

std::vector< Matrix< double > >	y_
	The SGMM state-specific stats.

std::vector< Matrix< double > >	gamma_
	Gaussian occupancies gamma_{jmi} for each substate and Gaussian index, pooled over groups.

std::vector< Matrix< double > >	a_
	[SSGMM] These a_{jmi} quantities are dimensionally the same as the gamma quantities.

Matrix< double >	t_
	[SSGMM] Each row is one of the t_i quantities in the less-exact version of the SSGMM update for the speaker weight projections.

Vector< double >	a_s_
	[SSGMM], this is a per-speaker variable storing the a_i^{(s)} quantities that we will use in order to compute the non-speaker-specific quantities [see eqs. More...

std::vector< SpMatrix< double > >	U_
	The U_i quantities from the less-exact version of the SSGMM update for the speaker weight projections.

std::vector< Vector< double > >	gamma_c_
	Sub-state occupancies gamma_{jm}^{(c)} for each sub-state.

Vector< double >	gamma_s_
	gamma_{i}^{(s)}.

double	total_frames_

double	total_like_

int32	feature_dim_
	Dimensionality of various subspaces.

int32	phn_space_dim_

int32	spk_space_dim_

int32	num_gaussians_

int32	num_pdfs_

int32	num_groups_
	Other model specifications.

BaseFloat	rand_prune_

Friends
class	MleAmSgmm2Updater

class	EbwAmSgmm2Updater

Detailed Description

Class for the accumulators associated with the phonetic-subspace model parameters.

Definition at line 119 of file estimate-am-sgmm2.h.

Constructor & Destructor Documentation

◆ MleAmSgmm2Accs() [1/2]

MleAmSgmm2Accs ( BaseFloat rand_prune = 1.0e-05 )

inline explicit

Definition at line 121 of file estimate-am-sgmm2.h.

       : total_frames_(0.0), total_like_(0.0), feature_dim_(0),
         phn_space_dim_(0), spk_space_dim_(0), num_gaussians_(0),
         num_pdfs_(0), num_groups_(0), rand_prune_(rand_prune) {}

◆ MleAmSgmm2Accs() [2/2]

MleAmSgmm2Accs	(	const AmSgmm2 &	model,
		SgmmUpdateFlagsType	flags,
		bool	have_spk_vecs,
		BaseFloat	rand_prune = 1.0e-05
	)

inline

Definition at line 126 of file estimate-am-sgmm2.h.

       : total_frames_(0.0), total_like_(0.0), rand_prune_(rand_prune) {
     ResizeAccumulators(model, flags, have_spk_vecs);
   }

◆ ~MleAmSgmm2Accs()

~MleAmSgmm2Accs ( )

Definition at line 1945 of file estimate-am-sgmm2.cc.

References MleSgmm2SpeakerAccs::gamma_s_, KALDI_ERR, and VectorBase< Real >::Sum().

                                 {
   if (gamma_s_.Sum() != 0.0)
     KALDI_ERR << "In destructor of MleAmSgmm2Accs: detected that you forgot to "
         "call CommitStatsForSpk()";
 }

Member Function Documentation

◆ Accumulate()

BaseFloat Accumulate	(	const AmSgmm2 &	model,
		const Sgmm2PerFrameDerivedVars &	frame_vars,
		int32	pdf_index,
		BaseFloat	weight,
		Sgmm2PerSpkDerivedVars *	spk_vars
	)

Returns the log-likelihood for the frame.

Definition at line 471 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::AccumulateFromPosteriors(), AmSgmm2::ComponentPosteriors(), count, MatrixBase< Real >::Scale(), and MleAmSgmm2Accs::total_like_.

Referenced by main().

                                                                       {
   // j2 is the pdf index (the pdf_index argument in the declaration above).
   // Calculate Gaussian posteriors and collect statistics
   Matrix<BaseFloat> posteriors;
   BaseFloat log_like = model.ComponentPosteriors(frame_vars, j2, spk_vars, &posteriors);
   posteriors.Scale(weight);
   BaseFloat count = AccumulateFromPosteriors(model, frame_vars, posteriors,
                                              j2, spk_vars);
   // Note: total_frames_ is incremented in AccumulateFromPosteriors().
   total_like_ += count * log_like;
   return log_like;
 }
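
The bookkeeping in the listing above can be sketched without Kaldi as follows. This is a simplified illustration, not the library code: `ToyTotals` and `ToyAccumulate` are hypothetical names, the posteriors are a flat vector rather than a matrix, and the pruning that happens inside AccumulateFromPosteriors() is left out, so here the count always equals the scaled posterior sum.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the two running totals kept by the accumulator.
struct ToyTotals {
  double total_frames = 0.0;
  double total_like = 0.0;
};

// Scales the posteriors by the frame weight, adds their sum to the frame
// count, and adds count * log_like to the likelihood total.  Returns
// log_like, mirroring the return value of the listing above.
inline double ToyAccumulate(std::vector<double> posteriors, double weight,
                            double log_like, ToyTotals *totals) {
  double count = 0.0;
  for (double &p : posteriors) {
    p *= weight;   // corresponds to posteriors.Scale(weight)
    count += p;    // no pruning in this sketch
  }
  totals->total_frames += count;
  totals->total_like += count * log_like;
  return log_like;
}
```

With totals kept this way, the average log-likelihood per frame is total_like divided by total_frames.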

◆ AccumulateFromPosteriors()

BaseFloat AccumulateFromPosteriors	(	const AmSgmm2 &	model,
		const Sgmm2PerFrameDerivedVars &	frame_vars,
		const Matrix< BaseFloat > &	posteriors,
		int32	pdf_index,
		Sgmm2PerSpkDerivedVars *	spk_vars
	)

Returns count accumulated (may differ from posteriors.Sum() due to weight pruning).

Definition at line 487 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, MleAmSgmm2Accs::a_s_, VectorBase< Real >::AddVec(), Sgmm2PerSpkDerivedVars::b_is, VectorBase< Real >::Dim(), MleAmSgmm2Accs::feature_dim_, MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, AmSgmm2::GetDjms(), AmSgmm2::GetSubstateMean(), Sgmm2PerFrameDerivedVars::gselect, rnnlm::i, KALDI_ASSERT, AmSgmm2::NumSubstatesForGroup(), AmSgmm2::Pdf2Group(), MleAmSgmm2Accs::rand_prune_, kaldi::RandPrune(), MatrixBase< Real >::Row(), MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, MleAmSgmm2Accs::total_frames_, AmSgmm2::v_, Sgmm2PerSpkDerivedVars::v_s, AmSgmm2::w_jmi_, Sgmm2PerFrameDerivedVars::xt, Sgmm2PerFrameDerivedVars::xti, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, MleAmSgmm2Accs::Z_, and Sgmm2PerFrameDerivedVars::zti.

Referenced by MleAmSgmm2Accs::Accumulate(), and main().

                                       {
   double tot_count = 0.0;
   const vector<int32> &gselect = frame_vars.gselect;
   // Intermediate variables
   Vector<BaseFloat> gammat(gselect.size()), // sum of gammas over mix-weight.
       a_is_part(gselect.size()); //
   Vector<BaseFloat> xt_jmi(feature_dim_), mu_jmi(feature_dim_),
       zt_jmi(spk_space_dim_);
 
   int32 j1 = model.Pdf2Group(j2);
   int32 num_substates = model.NumSubstatesForGroup(j1);
 
   for (int32 m = 0; m < num_substates; m++) {
     BaseFloat d_jms = model.GetDjms(j1, m, spk_vars);
     BaseFloat gammat_jm = 0.0;
     for (int32 ki = 0; ki < static_cast<int32>(gselect.size()); ki++) {
       int32 i = gselect[ki];
 
       // Eq. (39): gamma_{jmi}(t) = p (j, m, i|t)
       BaseFloat gammat_jmi = RandPrune(posteriors(ki, m), rand_prune_);
       if (gammat_jmi == 0.0) continue;
       gammat(ki) += gammat_jmi;
       if (gamma_s_.Dim() != 0)
         gamma_s_(i) += gammat_jmi;
       gammat_jm += gammat_jmi;
 
       // Accumulate statistics for non-zero gaussian posteriors
       tot_count += gammat_jmi;
       if (!gamma_.empty()) {
         // Eq. (40): gamma_{jmi} = \sum_t gamma_{jmi}(t)
         gamma_[j1](m, i) += gammat_jmi;
       }
       if (!y_.empty()) {
         // Eq. (41): y_{jm} = \sum_{t, i} \gamma_{jmi}(t) z_{i}(t)
         // Suggestion:  move this out of the loop over m
         y_[j1].Row(m).AddVec(gammat_jmi, frame_vars.zti.Row(ki));
       }
       if (!Y_.empty()) {
         // Eq. (42): Y_{i} = \sum_{t, j, m} \gamma_{jmi}(t) x_{i}(t) v_{jm}^T
         Y_[i].AddVecVec(gammat_jmi, frame_vars.xti.Row(ki),
                         model.v_[j1].Row(m));
       }
       // Accumulate for speaker projections
       if (!Z_.empty()) {
         KALDI_ASSERT(spk_space_dim_ > 0);
         // Eq. (43): x_{jmi}(t) = x_k(t) - M_{i} v_{jm}
         model.GetSubstateMean(j1, m, i, &mu_jmi);
         xt_jmi.CopyFromVec(frame_vars.xt);
         xt_jmi.AddVec(-1.0, mu_jmi);
         // Eq. (44): Z_{i} = \sum_{t, j, m} \gamma_{jmi}(t) x_{jmi}(t) v^{s}'
         if (spk_vars->v_s.Dim() != 0)  // interpret empty v_s as zero.
           Z_[i].AddVecVec(gammat_jmi, xt_jmi, spk_vars->v_s);
         // Eq. (49): \gamma_{i}^{(s)} = \sum_{t\in\Tau(s), j, m} gamma_{jmi}
         // Will be used when you call CommitStatsForSpk(), to update R_.
       }
     } // loop over selected Gaussians
     if (gammat_jm != 0.0) {
       if (!a_.empty()) { // SSGMM code.
         KALDI_ASSERT(d_jms > 0);
         // below is eq. 40 in the MSR techreport.  Caution: there
         // was an error in the original techreport.  The index i
         // in the summation and the quantity \gamma_{jmi}^{(t)}
         // should be differently named, e.g. i'.
         a_[j1].Row(m).AddVec(gammat_jm / d_jms, spk_vars->b_is);
       }
       if (a_s_.Dim() != 0) { // [SSGMM]
         KALDI_ASSERT(d_jms > 0);
         KALDI_ASSERT(!model.w_jmi_.empty());
         a_s_.AddVec(gammat_jm / d_jms, model.w_jmi_[j1].Row(m));
       }
       if (!gamma_c_.empty())
         gamma_c_[j2](m) += gammat_jm;
     }
   } // loop over substates
 
   if (!S_.empty()) {
     for (int32 ki = 0; ki < static_cast<int32>(gselect.size()); ki++) {
       // Eq. (47): S_{i} = \sum_{t, j, m} \gamma_{jmi}(t) x_{i}(t) x_{i}(t)^T
       if (gammat(ki) != 0.0) {
         int32 i = gselect[ki];
         S_[i].AddVec2(gammat(ki), frame_vars.xti.Row(ki));
       }
     }
   }
   total_frames_ += tot_count;
   return tot_count;
 }
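
The returned count can differ from posteriors.Sum() because each posterior passes through RandPrune() before it is accumulated. The sketch below shows one way such randomized pruning can work; it is an assumption about the behaviour, not a copy of kaldi::RandPrune, and `ToyRandPrune` and its `std::mt19937` argument are hypothetical. Values at or above the threshold pass through unchanged, and smaller values are replaced by either zero or the threshold itself, with probabilities chosen so that the expected value is preserved.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical stand-in for randomized pruning of a posterior.
// - post == 0 or |post| >= prune_thresh: returned unchanged.
// - otherwise: returns sign(post) * prune_thresh with probability
//   |post| / prune_thresh, and 0 otherwise, so the expectation equals post.
inline double ToyRandPrune(double post, double prune_thresh,
                           std::mt19937 *rng) {
  if (post == 0.0 || std::fabs(post) >= prune_thresh) return post;
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  const double sign = (post >= 0.0) ? 1.0 : -1.0;
  const bool keep = uniform(*rng) <= std::fabs(post) / prune_thresh;
  return keep ? sign * prune_thresh : 0.0;
}
```

Under this scheme most very small posteriors become exactly zero, which lets the inner loop skip them through the `gammat_jmi == 0.0` test.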

◆ Check()

void Check	(	const AmSgmm2 &	model,
		bool	show_properties = true
	)		const

Checks the various accumulators for correct sizes given a model.

If any accumulator has the wrong size, an assertion fails. When the show_properties argument is true, the dimensions and the presence or absence of each accumulator are logged. Intended for use after accumulators have been read from a file.

Definition at line 213 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, kaldi::ApproxEqual(), VectorBase< Real >::Dim(), MleAmSgmm2Accs::feature_dim_, AmSgmm2::FeatureDim(), MatrixBase< Real >::FrobeniusNorm(), MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, rnnlm::i, VectorBase< Real >::IsZero(), KALDI_ASSERT, KALDI_ERR, KALDI_LOG, KALDI_WARN, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, MatrixBase< Real >::NumCols(), AmSgmm2::NumGauss(), AmSgmm2::NumGroups(), AmSgmm2::NumPdfs(), MatrixBase< Real >::NumRows(), AmSgmm2::NumSubstatesForGroup(), AmSgmm2::NumSubstatesForPdf(), MleAmSgmm2Accs::phn_space_dim_, AmSgmm2::PhoneSpaceDim(), MleAmSgmm2Accs::R_, MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, AmSgmm2::SpkSpaceDim(), MleAmSgmm2Accs::t_, MleAmSgmm2Accs::U_, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main(), and TestSgmm2AccsIO().

                                                       {
   if (show_properties)
     KALDI_LOG << "Sgmm2PdfModel: J1 = " << num_groups_ << ", J2 = "
               << num_pdfs_ << ", D = " << feature_dim_ << ", S = "
               << phn_space_dim_ << ", T = " << spk_space_dim_ << ", I = "
               << num_gaussians_;
 
   KALDI_ASSERT(num_pdfs_ == model.NumPdfs() && num_pdfs_ > 0);
   KALDI_ASSERT(num_groups_ == model.NumGroups() && num_groups_ > 0);
   KALDI_ASSERT(num_gaussians_ == model.NumGauss() && num_gaussians_ > 0);
   KALDI_ASSERT(feature_dim_ == model.FeatureDim() && feature_dim_ > 0);
   KALDI_ASSERT(phn_space_dim_ == model.PhoneSpaceDim() && phn_space_dim_ > 0);
   KALDI_ASSERT(spk_space_dim_ == model.SpkSpaceDim());
 
   std::ostringstream debug_str;
 
   if (Y_.size() == 0) {
     debug_str << "Y: no.  ";
   } else {
     KALDI_ASSERT(gamma_.size() != 0);
     KALDI_ASSERT(Y_.size() == static_cast<size_t>(num_gaussians_));
     bool nz = false;
     for (int32 i = 0; i < num_gaussians_; i++) {
       KALDI_ASSERT(Y_[i].NumRows() == feature_dim_ &&
                    Y_[i].NumCols() == phn_space_dim_);
       if (!nz && Y_[i](0, 0) != 0) { nz = true; }
     }
     debug_str << "Y: yes, " << string(nz ? "nonzero. " : "zero. ");
   }
 
   if (Z_.size() == 0) {
     KALDI_ASSERT(R_.size() == 0);
     debug_str << "Z, R: no.  ";
   } else {
     KALDI_ASSERT(gamma_s_.Dim() == num_gaussians_);
     KALDI_ASSERT(Z_.size() == static_cast<size_t>(num_gaussians_));
     KALDI_ASSERT(R_.size() == static_cast<size_t>(num_gaussians_));
     bool Z_nz = false, R_nz = false;
     for (int32 i = 0; i < num_gaussians_; i++) {
       KALDI_ASSERT(Z_[i].NumRows() == feature_dim_ &&
                    Z_[i].NumCols() == spk_space_dim_);
       KALDI_ASSERT(R_[i].NumRows() == spk_space_dim_);
       if (!Z_nz && Z_[i](0, 0) != 0) { Z_nz = true; }
       if (!R_nz && R_[i](0, 0) != 0) { R_nz = true; }
     }
     bool gamma_s_nz = !gamma_s_.IsZero();
     debug_str << "Z: yes, " << string(Z_nz ? "nonzero. " : "zero. ");
     debug_str << "R: yes, " << string(R_nz ? "nonzero. " : "zero. ");
     debug_str << "gamma_s: yes, " << string(gamma_s_nz ? "nonzero. " : "zero. ");
   }
 
   if (S_.size() == 0) {
     debug_str << "S: no.  ";
   } else {
     KALDI_ASSERT(gamma_.size() != 0);
     bool S_nz = false;
     KALDI_ASSERT(S_.size() == static_cast<size_t>(num_gaussians_));
     for (int32 i = 0; i < num_gaussians_; i++) {
       KALDI_ASSERT(S_[i].NumRows() == feature_dim_);
       if (!S_nz && S_[i](0, 0) != 0) { S_nz = true; }
     }
     debug_str << "S: yes, " << string(S_nz ? "nonzero. " : "zero. ");
   }
 
   if (y_.size() == 0) {
     debug_str << "y: no.  ";
   } else {
     KALDI_ASSERT(gamma_.size() != 0);
     bool nz = false;
     KALDI_ASSERT(y_.size() == static_cast<size_t>(num_groups_));
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       KALDI_ASSERT(y_[j1].NumRows() == model.NumSubstatesForGroup(j1));
       KALDI_ASSERT(y_[j1].NumCols() == phn_space_dim_);
       if (!nz && y_[j1](0, 0) != 0) { nz = true; }
     }
     debug_str << "y: yes, " << string(nz ? "nonzero. " : "zero. ");
   }
 
   if (a_.size() == 0) {
     debug_str << "a: no.  ";
   } else {
     debug_str << "a: yes.  ";
     bool nz = false;
     KALDI_ASSERT(a_.size() == static_cast<size_t>(num_groups_));
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       KALDI_ASSERT(a_[j1].NumRows() == model.NumSubstatesForGroup(j1) &&
                    a_[j1].NumCols() == num_gaussians_);
       if (!nz && a_[j1].Sum() != 0) nz = true;
     }
     debug_str << "a: yes, " << string(nz ? "nonzero. " : "zero. "); // TODO: take out "string"
   }
 
   double tot_gamma = 0.0;
   if (gamma_.size() == 0) {
     debug_str << "gamma: no.  ";
   } else {
     debug_str << "gamma: yes.  ";
     KALDI_ASSERT(gamma_.size() == static_cast<size_t>(num_groups_));
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       KALDI_ASSERT(gamma_[j1].NumRows() == model.NumSubstatesForGroup(j1) &&
                    gamma_[j1].NumCols() == num_gaussians_);
       tot_gamma += gamma_[j1].Sum();
     }
     bool nz = (tot_gamma != 0.0);
     KALDI_ASSERT(gamma_c_.size() == num_pdfs_ && "gamma_ set up but not gamma_c_.");
     debug_str << "gamma: yes, " << string(nz ? "nonzero. " : "zero. ");
   }
 
   if (gamma_c_.size() == 0) {
     KALDI_ERR << "gamma_c_ not set up."; // required for all accs.
   } else {
     KALDI_ASSERT(gamma_c_.size() == num_pdfs_);
     double tot_gamma_c = 0.0;
     for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
       KALDI_ASSERT(gamma_c_[j2].Dim() == model.NumSubstatesForPdf(j2));
       tot_gamma_c += gamma_c_[j2].Sum();
     }
     bool nz = (tot_gamma_c != 0.0);
     debug_str << "gamma_c: yes, " << string(nz ? "nonzero. " : "zero. ");
     if (!gamma_.empty() && !ApproxEqual(tot_gamma_c, tot_gamma))
       KALDI_WARN << "Counts from gamma and gamma_c differ "
                  << tot_gamma << " vs. " << tot_gamma_c;
   }
 
   if (t_.NumRows() == 0) {
     debug_str << "t: no.  ";
   } else {
     KALDI_ASSERT(t_.NumRows() == num_gaussians_ &&
                  t_.NumCols() == spk_space_dim_);
     KALDI_ASSERT(!U_.empty()); // t and U are used together.
     bool nz = (t_.FrobeniusNorm() != 0);
     debug_str << "t: yes, " << string(nz ? "nonzero. " : "zero. ");
   }
 
   if (U_.size() == 0) {
     debug_str << "U: no.  ";
   } else {
     bool nz = false;
     KALDI_ASSERT(U_.size() == num_gaussians_);
     for (int32 i = 0; i < num_gaussians_; i++) {
       if (!nz && U_[i].FrobeniusNorm() != 0) nz = true;
       KALDI_ASSERT(U_[i].NumRows() == spk_space_dim_);
     }
     KALDI_ASSERT(t_.NumRows() != 0); // t and U are used together.
     // Note: the label below reads "t:", although this branch reports on U_.
     debug_str << "t: yes, " << string(nz ? "nonzero. " : "zero. ");
   }
 
   if (show_properties)
     KALDI_LOG << "Subspace GMM model properties: " << debug_str.str();
 }
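
The last consistency test in the listing compares the total count held in gamma_ with the total held in gamma_c_ and only warns if they disagree. A standalone sketch of that comparison follows. `ToyApproxEqual` and `ToyTotal` are hypothetical helpers, the relative tolerance of 0.001 is an assumed default, and the nested vectors stand in for the per-group and per-pdf Kaldi containers.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Relative-tolerance comparison of two totals; rel_tol is an assumed value.
inline bool ToyApproxEqual(double a, double b, double rel_tol = 0.001) {
  if (a == b) return true;
  const double diff = std::fabs(a - b);
  if (std::isnan(diff) || std::isinf(diff)) return false;
  return diff <= rel_tol * (std::fabs(a) + std::fabs(b));
}

// Sums every entry of a ragged two-level container of counts.
inline double ToyTotal(const std::vector<std::vector<double>> &stats) {
  double total = 0.0;
  for (const auto &row : stats)
    for (double value : row) total += value;
  return total;
}
```

A relative rather than exact comparison fits here because the two totals are accumulated along different paths and can differ by rounding.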

◆ CommitStatsForSpk()

void CommitStatsForSpk	(	const AmSgmm2 &	model,
		const Sgmm2PerSpkDerivedVars &	spk_vars
	)

Accumulates global stats for the current speaker (if applicable).

If flags contains kSgmmSpeakerProjections (N) or kSgmmSpeakerWeightProjections (u), this must be called after finishing each speaker's data.

Definition at line 580 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_s_, VectorBase< Real >::AddVecVec(), MatrixBase< Real >::AddVecVec(), Sgmm2PerSpkDerivedVars::b_is, VectorBase< Real >::Dim(), MleAmSgmm2Accs::gamma_s_, rnnlm::i, VectorBase< Real >::IsZero(), MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::R_, VectorBase< Real >::SetZero(), MleAmSgmm2Accs::t_, MleAmSgmm2Accs::U_, and Sgmm2PerSpkDerivedVars::v_s.

Referenced by main().

                                                                                {
   const VectorBase<BaseFloat> &v_s = spk_vars.v_s;
   if (v_s.Dim() != 0 && !v_s.IsZero() && !R_.empty()) {
     for (int32 i = 0; i < num_gaussians_; i++)
       // Accumulate Statistics R_{ki}
       if (gamma_s_(i) != 0.0)
         R_[i].AddVec2(gamma_s_(i),
                       Vector<double>(v_s));
   }
   if (a_s_.Dim() != 0) {
     Vector<BaseFloat> tmp(gamma_s_);
     // tmp(i) = gamma_s^{(i)} - a_i^{(s)} b_i^{(s)}.
     tmp.AddVecVec(-1.0, Vector<BaseFloat>(a_s_), spk_vars.b_is, 1.0);
     t_.AddVecVec(1.0, tmp, v_s); // eq. 53 of techreport.
     for (int32 i = 0; i < num_gaussians_; i++) {
       U_[i].AddVec2(a_s_(i) * spk_vars.b_is(i),
                     Vector<double>(v_s)); // eq. 54 of techreport.
     }
   }
   gamma_s_.SetZero();
   a_s_.SetZero();
 }
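
The first half of the listing folds the per-speaker counts into the global R_{i} statistics as a rank-one update, R_{i} += gamma_{i}^{(s)} v^{(s)} v^{(s)T}, and then clears the counts for the next speaker. The sketch below reproduces only that part under simplifying assumptions: matrices are stored in full rather than as packed symmetric matrices, the empty or zero v_s case is not handled, the SSGMM branch is omitted, and `Mat`, `AddOuter` and `ToyCommitSpeaker` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;

// R += alpha * v v^T, using full storage for simplicity.
inline void AddOuter(double alpha, const std::vector<double> &v, Mat *R) {
  for (std::size_t r = 0; r < v.size(); ++r)
    for (std::size_t c = 0; c < v.size(); ++c)
      (*R)[r][c] += alpha * v[r] * v[c];
}

// Folds the per-speaker counts gamma_s into the global R[i], then zeroes
// gamma_s so that the next speaker starts from a clean state.
inline void ToyCommitSpeaker(const std::vector<double> &v_s,
                             std::vector<double> *gamma_s,
                             std::vector<Mat> *R) {
  for (std::size_t i = 0; i < gamma_s->size(); ++i) {
    if ((*gamma_s)[i] != 0.0) AddOuter((*gamma_s)[i], v_s, &(*R)[i]);
    (*gamma_s)[i] = 0.0;
  }
}
```

Because the counts are zeroed at the end, a nonzero gamma_s at destruction time signals that the commit step was skipped for the last speaker.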

◆ FeatureDim()

int32 FeatureDim ( ) const

inline

Definition at line 173 of file estimate-am-sgmm2.h.

{ return feature_dim_; }

◆ GetStateOccupancies()

void GetStateOccupancies ( Vector< BaseFloat > * occs ) const

Accessors. Fills occs with the total occupancy of each pdf, summed over its substates.

Definition at line 604 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::gamma_c_, and Vector< Real >::Resize().

Referenced by main().

                                                                       {
   int32 J2 = gamma_c_.size();
   occs->Resize(J2);
   for (int32 j2 = 0; j2 < J2; j2++) {
     (*occs)(j2) = gamma_c_[j2].Sum();
   }
 }
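
The same reduction can be written against plain standard-library containers. In this sketch `ToyStateOccupancies` is a hypothetical name and the nested vector stands in for gamma_c_, with one inner vector of substate counts per pdf.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns one occupancy per pdf: the sum of that pdf's substate counts.
inline std::vector<double> ToyStateOccupancies(
    const std::vector<std::vector<double>> &gamma_c) {
  std::vector<double> occs(gamma_c.size());
  for (std::size_t j2 = 0; j2 < gamma_c.size(); ++j2)
    occs[j2] = std::accumulate(gamma_c[j2].begin(), gamma_c[j2].end(), 0.0);
  return occs;
}
```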

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( MleAmSgmm2Accs )

private

◆ NumGauss()

int32 NumGauss ( ) const

inline

Definition at line 177 of file estimate-am-sgmm2.h.

{ return num_gaussians_; }

◆ NumGroups()

int32 NumGroups ( ) const

inline

Definition at line 176 of file estimate-am-sgmm2.h.

{ return num_groups_; } // returns J1

◆ NumPdfs()

int32 NumPdfs ( ) const

inline

Definition at line 175 of file estimate-am-sgmm2.h.

{ return num_pdfs_; } // returns J2

◆ PhoneSpaceDim()

int32 PhoneSpaceDim ( ) const

inline

Definition at line 174 of file estimate-am-sgmm2.h.

{ return phn_space_dim_; }

◆ Read()

void Read	(	std::istream &	in_stream,
		bool	binary,
		bool	add
	)

Definition at line 122 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, VectorBase< Real >::Dim(), kaldi::ExpectToken(), MleAmSgmm2Accs::feature_dim_, MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, rnnlm::i, KALDI_ERR, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, MleAmSgmm2Accs::phn_space_dim_, MleAmSgmm2Accs::R_, Matrix< Real >::Read(), kaldi::ReadBasicType(), kaldi::ReadToken(), Vector< Real >::Resize(), MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, MleAmSgmm2Accs::t_, MleAmSgmm2Accs::total_frames_, MleAmSgmm2Accs::total_like_, MleAmSgmm2Accs::U_, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main(), and TestSgmm2AccsIO().

                                    {
   ExpectToken(in_stream, binary, "<SGMMACCS>");
   ExpectToken(in_stream, binary, "<NUMPDFS>");
   ReadBasicType(in_stream, binary, &num_pdfs_);
   ExpectToken(in_stream, binary, "<NUMGROUPS>");
   ReadBasicType(in_stream, binary, &num_groups_);
   ExpectToken(in_stream, binary, "<NUMGaussians>");
   ReadBasicType(in_stream, binary, &num_gaussians_);
   ExpectToken(in_stream, binary, "<FEATUREDIM>");
   ReadBasicType(in_stream, binary, &feature_dim_);
   ExpectToken(in_stream, binary, "<PHONESPACEDIM>");
   ReadBasicType(in_stream, binary, &phn_space_dim_);
   ExpectToken(in_stream, binary, "<SPKSPACEDIM>");
   ReadBasicType(in_stream, binary, &spk_space_dim_);
 
   string token;
   ReadToken(in_stream, binary, &token);
 
   while (token != "</SGMMACCS>") {
     if (token == "<Y>") {
       Y_.resize(num_gaussians_);
       for (size_t i = 0; i < Y_.size(); i++) {
         Y_[i].Read(in_stream, binary, add);
       }
     } else if (token == "<Z>") {
       Z_.resize(num_gaussians_);
       for (size_t i = 0; i < Z_.size(); i++) {
         Z_[i].Read(in_stream, binary, add);
       }
     } else if (token == "<R>") {
       R_.resize(num_gaussians_);
       if (gamma_s_.Dim() == 0) gamma_s_.Resize(num_gaussians_);
       for (size_t i = 0; i < R_.size(); i++) {
         R_[i].Read(in_stream, binary, add);
       }
     } else if (token == "<S>") {
       S_.resize(num_gaussians_);
       for (size_t i = 0; i < S_.size(); i++) {
         S_[i].Read(in_stream, binary, add);
       }
     } else if (token == "<y>") {
       y_.resize(num_groups_);
       for (int32 j1 = 0; j1 < num_groups_; j1++) {
         y_[j1].Read(in_stream, binary, add);
       }
     } else if (token == "<gamma>") {
       gamma_.resize(num_groups_);
       for (int32 j1 = 0; j1 < num_groups_; j1++) {
         gamma_[j1].Read(in_stream, binary, add);
       }
       // Don't read gamma_s, it's just a temporary variable and
       // not part of the permanent (non-speaker-specific) accs.
     } else if (token == "<a>") {
       a_.resize(num_groups_);
       for (int32 j1 = 0; j1 < num_groups_; j1++) {
         a_[j1].Read(in_stream, binary, add);
       }
     } else if (token == "<gamma_c>") {
       gamma_c_.resize(num_pdfs_);
       for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
         gamma_c_[j2].Read(in_stream, binary, add);
       }
     } else if (token == "<t>") {
       t_.Read(in_stream, binary, add);
     } else if (token == "<U>") {
       U_.resize(num_gaussians_);
       for (int32 i = 0; i < num_gaussians_; i++) {
         U_[i].Read(in_stream, binary, add);
       }
     } else if (token == "<total_like>") {
       double total_like;
       ReadBasicType(in_stream, binary, &total_like);
       if (add)
         total_like_ += total_like;
       else
         total_like_ = total_like;
     } else if (token == "<total_frames>") {
       double total_frames;
       ReadBasicType(in_stream, binary, &total_frames);
       if (add)
         total_frames_ += total_frames;
       else
         total_frames_ = total_frames;
     } else {
       KALDI_ERR << "Unexpected token '" << token << "' in model file ";
     }
     ReadToken(in_stream, binary, &token);
   }
 }
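
The add argument decides whether values read from the stream replace the current statistics or are summed into them, which is what allows accumulators written by several jobs to be merged by reading them one after another. The sketch below illustrates this for the two scalar totals only. It uses a toy whitespace-separated text format, not Kaldi's actual I/O routines, and `ToyReadTotals` and `ToyRead` are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

struct ToyReadTotals {
  double total_like = 0.0;
  double total_frames = 0.0;
};

// Reads "<token> value" pairs until the closing token.  With add == true the
// values are summed into *totals; otherwise they overwrite it.
inline void ToyRead(std::istream &in, bool add, ToyReadTotals *totals) {
  std::string token;
  while (in >> token && token != "</SGMMACCS>") {
    double *target = nullptr;
    if (token == "<total_like>") {
      target = &totals->total_like;
    } else if (token == "<total_frames>") {
      target = &totals->total_frames;
    } else {
      throw std::runtime_error("Unexpected token " + token);
    }
    double value;
    if (!(in >> value))
      throw std::runtime_error("Missing value after " + token);
    *target = add ? *target + value : value;
  }
}
```

Reading a first file with add set to false and every later file with add set to true leaves the sum of all files in the object.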

◆ ResizeAccumulators()

void ResizeAccumulators	(	const AmSgmm2 &	model,
		SgmmUpdateFlagsType	flags,
		bool	have_spk_vecs
	)

Resizes the accumulators to the correct sizes given the model.

The flags argument controls which accumulators to resize.

Definition at line 365 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, MleAmSgmm2Accs::a_s_, MleAmSgmm2Accs::feature_dim_, AmSgmm2::FeatureDim(), MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, MleAmSgmm2Accs::gamma_s_, AmSgmm2::HasSpeakerDependentWeights(), rnnlm::i, KALDI_ASSERT, KALDI_ERR, kaldi::kSgmmCovarianceMatrix, kaldi::kSgmmPhoneProjections, kaldi::kSgmmPhoneVectors, kaldi::kSgmmPhoneWeightProjections, kaldi::kSgmmSpeakerProjections, kaldi::kSgmmSpeakerWeightProjections, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, AmSgmm2::NumGauss(), AmSgmm2::NumGroups(), AmSgmm2::NumPdfs(), AmSgmm2::NumSubstatesForGroup(), AmSgmm2::NumSubstatesForPdf(), MleAmSgmm2Accs::phn_space_dim_, AmSgmm2::PhoneSpaceDim(), MleAmSgmm2Accs::R_, Vector< Real >::Resize(), Matrix< Real >::Resize(), MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, AmSgmm2::SpkSpaceDim(), MleAmSgmm2Accs::t_, MleAmSgmm2Accs::total_frames_, MleAmSgmm2Accs::total_like_, MleAmSgmm2Accs::U_, MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main().

                                                             {
   num_pdfs_ = model.NumPdfs();
   num_groups_ = model.NumGroups();
   num_gaussians_ = model.NumGauss();
   feature_dim_ = model.FeatureDim();
   phn_space_dim_ = model.PhoneSpaceDim();
   spk_space_dim_ = model.SpkSpaceDim();
   total_frames_ = total_like_ = 0;
 
   if (flags & (kSgmmPhoneProjections | kSgmmCovarianceMatrix)) {
     Y_.resize(num_gaussians_);
     for (int32 i = 0; i < num_gaussians_; i++) {
       Y_[i].Resize(feature_dim_, phn_space_dim_);
     }
   } else {
     Y_.clear();
   }
 
   if (flags & (kSgmmSpeakerProjections | kSgmmSpeakerWeightProjections)) {
     gamma_s_.Resize(num_gaussians_);
   } else {
     gamma_s_.Resize(0);
   }
 
   if (flags & kSgmmSpeakerProjections) {
     if (spk_space_dim_ == 0) {
       KALDI_ERR << "Cannot set up accumulators for speaker projections "
                 << "because speaker subspace has not been set up";
     }
     Z_.resize(num_gaussians_);
     R_.resize(num_gaussians_);
     for (int32 i = 0; i < num_gaussians_; i++) {
       Z_[i].Resize(feature_dim_, spk_space_dim_);
       R_[i].Resize(spk_space_dim_);
     }
   } else {
     Z_.clear();
     R_.clear();
   }
 
   if (flags & kSgmmCovarianceMatrix) {
     S_.resize(num_gaussians_);
     for (int32 i = 0; i < num_gaussians_; i++) {
       S_[i].Resize(feature_dim_);
     }
   } else {
     S_.clear();
   }
 
   if (flags & (kSgmmPhoneVectors | kSgmmPhoneWeightProjections |
                kSgmmCovarianceMatrix | kSgmmPhoneProjections)) {
     gamma_.resize(num_groups_);
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       gamma_[j1].Resize(model.NumSubstatesForGroup(j1), num_gaussians_);
     }
   } else {
     gamma_.clear();
   }
 
   if (flags & (kSgmmPhoneVectors | kSgmmPhoneWeightProjections)
       && model.HasSpeakerDependentWeights() && have_spk_vecs) { // SSGMM code.
     a_.resize(num_groups_);
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       a_[j1].Resize(model.NumSubstatesForGroup(j1),
                     num_gaussians_);
     }
   } else {
     a_.clear();
   }
 
   if (flags & kSgmmSpeakerWeightProjections) {
     KALDI_ASSERT(model.HasSpeakerDependentWeights() &&
                  "remove the flag \"u\" if you don't have u set up.");
     a_s_.Resize(num_gaussians_);
     t_.Resize(num_gaussians_, spk_space_dim_);
     U_.resize(num_gaussians_);
     for (int32 i = 0; i < num_gaussians_; i++)
       U_[i].Resize(spk_space_dim_);
   } else {
     a_s_.Resize(0);
     t_.Resize(0, 0);
     U_.resize(0);
   }
 
   if (true) { // always set up gamma_c_; it's nominally for
     // estimation of substate weights, but it's also required when
     // GetStateOccupancies() is called.
     gamma_c_.resize(num_pdfs_);
     for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
       gamma_c_[j2].Resize(model.NumSubstatesForPdf(j2));
     }
   }
 
 
   if (flags & kSgmmPhoneVectors) {
     y_.resize(num_groups_);
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       y_[j1].Resize(model.NumSubstatesForGroup(j1), phn_space_dim_);
     }
   } else {
     y_.clear();
   }
 }
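
The flag tests in the listing can be summarised as a mapping from update flags to the accumulators that get allocated. The sketch below restates that mapping in isolation. The numeric flag values are illustrative assumptions (the real constants are defined in the Kaldi headers), `ToyAllocation` and `ToyPlan` are hypothetical names, and a_ is omitted because it also depends on the model and on have_spk_vecs. gamma_c_ is always allocated and is therefore not listed.

```cpp
#include <cassert>

// Illustrative flag values; the real constants may differ.
enum ToyUpdateFlags : unsigned {
  kToyPhoneVectors = 0x01,
  kToyPhoneProjections = 0x02,
  kToyPhoneWeightProjections = 0x04,
  kToyCovarianceMatrix = 0x08,
  kToySpeakerProjections = 0x20,
  kToySpeakerWeightProjections = 0x80
};

// Which accumulators would be non-empty after resizing.
struct ToyAllocation {
  bool Y, Z_R, S, y, gamma, gamma_s, t_U;
};

inline ToyAllocation ToyPlan(unsigned flags) {
  ToyAllocation a;
  a.Y = (flags & (kToyPhoneProjections | kToyCovarianceMatrix)) != 0;
  a.gamma_s =
      (flags & (kToySpeakerProjections | kToySpeakerWeightProjections)) != 0;
  a.Z_R = (flags & kToySpeakerProjections) != 0;
  a.S = (flags & kToyCovarianceMatrix) != 0;
  a.gamma = (flags & (kToyPhoneVectors | kToyPhoneWeightProjections |
                      kToyCovarianceMatrix | kToyPhoneProjections)) != 0;
  a.y = (flags & kToyPhoneVectors) != 0;
  a.t_U = (flags & kToySpeakerWeightProjections) != 0;
  return a;
}
```

For example, requesting only the covariance update also allocates Y_ and gamma_, since both conditions in the listing include kSgmmCovarianceMatrix.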

◆ Write()

void Write	(	std::ostream &	out_stream,
		bool	binary
	)		const

Definition at line 34 of file estimate-am-sgmm2.cc.

References MleAmSgmm2Accs::a_, MleAmSgmm2Accs::feature_dim_, MleAmSgmm2Accs::gamma_, MleAmSgmm2Accs::gamma_c_, rnnlm::i, KALDI_ASSERT, MleAmSgmm2Accs::num_gaussians_, MleAmSgmm2Accs::num_groups_, MleAmSgmm2Accs::num_pdfs_, MatrixBase< Real >::NumRows(), MleAmSgmm2Accs::phn_space_dim_, MleAmSgmm2Accs::R_, MleAmSgmm2Accs::S_, MleAmSgmm2Accs::spk_space_dim_, MleAmSgmm2Accs::t_, MleAmSgmm2Accs::total_frames_, MleAmSgmm2Accs::total_like_, MleAmSgmm2Accs::U_, MatrixBase< Real >::Write(), kaldi::WriteBasicType(), kaldi::WriteToken(), MleAmSgmm2Accs::Y_, MleAmSgmm2Accs::y_, and MleAmSgmm2Accs::Z_.

Referenced by main(), and TestSgmm2AccsIO().

                                                                     {
 
   WriteToken(out_stream, binary, "<SGMMACCS>");
   WriteToken(out_stream, binary, "<NUMPDFS>");
   WriteBasicType(out_stream, binary, num_pdfs_);
   WriteToken(out_stream, binary, "<NUMGROUPS>");
   WriteBasicType(out_stream, binary, num_groups_);
   WriteToken(out_stream, binary, "<NUMGaussians>");
   WriteBasicType(out_stream, binary, num_gaussians_);
   WriteToken(out_stream, binary, "<FEATUREDIM>");
   WriteBasicType(out_stream, binary, feature_dim_);
   WriteToken(out_stream, binary, "<PHONESPACEDIM>");
   WriteBasicType(out_stream, binary, phn_space_dim_);
   WriteToken(out_stream, binary, "<SPKSPACEDIM>");
   WriteBasicType(out_stream, binary, spk_space_dim_);
   if (!binary) out_stream << "\n";
 
   if (Y_.size() != 0) {
     KALDI_ASSERT(gamma_.size() != 0);
     WriteToken(out_stream, binary, "<Y>");
     for (int32 i = 0; i < num_gaussians_; i++) {
       Matrix<BaseFloat>(Y_[i]).Write(out_stream, binary);
     }
   }
   if (Z_.size() != 0) {
     KALDI_ASSERT(R_.size() != 0);
     WriteToken(out_stream, binary, "<Z>");
     for (int32 i = 0; i < num_gaussians_; i++) {
       Matrix<BaseFloat>(Z_[i]).Write(out_stream, binary);
     }
     WriteToken(out_stream, binary, "<R>");
     for (int32 i = 0; i < num_gaussians_; i++) {
       SpMatrix<BaseFloat>(R_[i]).Write(out_stream, binary);
     }
   }
   if (S_.size() != 0) {
     KALDI_ASSERT(gamma_.size() != 0);
     WriteToken(out_stream, binary, "<S>");
     for (int32 i = 0; i < num_gaussians_; i++) {
       SpMatrix<BaseFloat>(S_[i]).Write(out_stream, binary);
     }
   }
   if (y_.size() != 0) {
     KALDI_ASSERT(gamma_.size() != 0);
     WriteToken(out_stream, binary, "<y>");
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       Matrix<BaseFloat>(y_[j1]).Write(out_stream, binary);
     }
   }
   if (gamma_.size() != 0) { // These stats are large
     // -> write as single precision.
     WriteToken(out_stream, binary, "<gamma>");
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       Matrix<BaseFloat> gamma_j1(gamma_[j1]);
       gamma_j1.Write(out_stream, binary);
     }
   }
   if (t_.NumRows() != 0) {
     WriteToken(out_stream, binary, "<t>");
     Matrix<BaseFloat>(t_).Write(out_stream, binary);
   }
   if (U_.size() != 0) {
     WriteToken(out_stream, binary, "<U>");
     for (int32 i = 0; i < num_gaussians_; i++) {
       SpMatrix<BaseFloat>(U_[i]).Write(out_stream, binary);
     }
   }
   if (gamma_c_.size() != 0) {
     WriteToken(out_stream, binary, "<gamma_c>");
     for (int32 j2 = 0; j2 < num_pdfs_; j2++) {
       Vector<BaseFloat>(gamma_c_[j2]).Write(out_stream, binary);
     }
   }
   if (a_.size() != 0) {
     WriteToken(out_stream, binary, "<a>");
     for (int32 j1 = 0; j1 < num_groups_; j1++) {
       Matrix<BaseFloat>(a_[j1]).Write(out_stream, binary);
     }
   }
   WriteToken(out_stream, binary, "<total_like>");
   WriteBasicType(out_stream, binary, total_like_);
 
   WriteToken(out_stream, binary, "<total_frames>");
   WriteBasicType(out_stream, binary, total_frames_);
 
   WriteToken(out_stream, binary, "</SGMMACCS>");
 }
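
The listing above follows one pattern throughout: a header with the dimensions, then each statistic written behind its own token only when it holds data, then a closing token. Statistics accumulated in double precision are narrowed to BaseFloat on the way out. The following is a minimal sketch of that pattern using only the standard library. ToyAccs, WriteToyAccs and ToyAccsToString are hypothetical names, not Kaldi types, and the sketch assumes BaseFloat is float (the usual build).

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an accumulator set; not a Kaldi type.
struct ToyAccs {
  int num_gauss = 0;
  std::vector<std::vector<double>> Y;  // optional: empty means "not accumulated"
  std::vector<std::vector<double>> S;  // optional: empty means "not accumulated"
  double total_like = 0.0;
};

// Writes one optional section: its token, then one bracketed row per Gaussian.
inline void WriteToySection(std::ostream &os, const std::string &token,
                            const std::vector<std::vector<double>> &stats,
                            int num_gauss) {
  if (stats.empty()) return;  // Section was never accumulated: write nothing.
  assert(static_cast<int>(stats.size()) == num_gauss);
  os << token;
  for (const auto &row : stats) {
    os << " [";
    // Narrow double to float, as Matrix<BaseFloat>(Y_[i]) does in the listing.
    for (double v : row) os << ' ' << static_cast<float>(v);
    os << " ]";
  }
  os << "\n";
}

// Text-mode writer: header, optional sections, mandatory totals, footer.
inline void WriteToyAccs(std::ostream &os, const ToyAccs &accs) {
  os << "<TOYACCS> <NUMGAUSS> " << accs.num_gauss << "\n";
  WriteToySection(os, "<Y>", accs.Y, accs.num_gauss);
  WriteToySection(os, "<S>", accs.S, accs.num_gauss);
  os << "<total_like> " << accs.total_like << " </TOYACCS>\n";
}

// Convenience wrapper that returns the serialized text.
inline std::string ToyAccsToString(const ToyAccs &accs) {
  std::ostringstream os;
  WriteToyAccs(os, accs);
  return os.str();
}
```

Because absent sections emit no token at all, a reader can tell from the next token which statistics are present. The real Read() has to handle that case for every optional block.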

Friends And Related Function Documentation

◆ EbwAmSgmm2Updater

friend class EbwAmSgmm2Updater

friend

Definition at line 240 of file estimate-am-sgmm2.h.

◆ MleAmSgmm2Updater

friend class MleAmSgmm2Updater

friend

Definition at line 239 of file estimate-am-sgmm2.h.

Member Data Documentation

◆ a_

std::vector< Matrix<double> > a_

private

[SSGMM] These a_{jmi} quantities are dimensionally the same as the gamma quantities.

They're needed to estimate the v_{jm} and w_i quantities in the symmetric SGMM. Dimension is [J1][#mix][I], the same as gamma_.

Definition at line 200 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Updater::ComputeLogA(), EbwAmSgmm2Updater::ComputePhoneVecStats(), MleAmSgmm2Accs::Read(), MleAmSgmm2Accs::ResizeAccumulators(), MleAmSgmm2Updater::Update(), and MleAmSgmm2Accs::Write().

◆ a_s_

Vector<double> a_s_

private

[SSGMM] This is a per-speaker variable storing the a_i^{(s)} quantities that we will use to compute the non-speaker-specific quantities [see eqs. 53 and 54 in techreport].

Note: there is a separate variable a_s_ in class MleSgmm2SpeakerAccs, which is the same thing but for purposes of computing the speaker-vector v^{(s)}.

Definition at line 213 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::CommitStatsForSpk(), and MleAmSgmm2Accs::ResizeAccumulators().

◆ feature_dim_

int32 feature_dim_

private

Dimensionality of various subspaces.

Definition at line 233 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Updater::ComputeSMeans(), MleAmSgmm2Updater::MapUpdateM(), MleAmSgmm2Accs::Read(), MleAmSgmm2Updater::RenormalizeN(), MleAmSgmm2Updater::RenormalizeV(), MleAmSgmm2Accs::ResizeAccumulators(), MleAmSgmm2Updater::UpdateM(), EbwAmSgmm2Updater::UpdateN(), MleAmSgmm2Updater::UpdateVars(), and MleAmSgmm2Accs::Write().

◆ gamma_

std::vector< Matrix<double> > gamma_

private

Gaussian occupancies gamma_{jmi} for each substate and Gaussian index, pooled over groups.

Dim is [J1][#mix][I].

Definition at line 195 of file estimate-am-sgmm2.h.

◆ gamma_c_

std::vector< Vector<double> > gamma_c_

private

Sub-state occupancies gamma_{jm}^{(c)} for each sub-state.

In the SCTM version of the SGMM, for compactness we store two separate sets of gamma statistics, one to estimate the v_{jm} quantities and one to estimate the sub-state weights c_{jm}.

Definition at line 223 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Accs::GetStateOccupancies(), MleAmSgmm2Accs::Read(), MleAmSgmm2Accs::ResizeAccumulators(), EbwAmSgmm2Updater::UpdateSubstateWeights(), MleAmSgmm2Updater::UpdateSubstateWeights(), and MleAmSgmm2Accs::Write().
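
Since gamma_c_ holds one vector of sub-state occupancies per pdf, a per-pdf occupancy can be obtained by summing each vector. The sketch below illustrates that reduction with standard containers. ToyStateOccupancies is a hypothetical helper, and that GetStateOccupancies() performs exactly this sum is an assumption, not something this page states.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of Kaldi.
// gamma_c[j2][m] is the occupancy of sub-state m of pdf j2.
inline std::vector<double> ToyStateOccupancies(
    const std::vector<std::vector<double>> &gamma_c) {
  std::vector<double> occs;
  occs.reserve(gamma_c.size());
  for (const auto &substates : gamma_c) {
    // Total occupancy of a pdf = sum over its sub-states.
    occs.push_back(std::accumulate(substates.begin(), substates.end(), 0.0));
  }
  assert(occs.size() == gamma_c.size());  // One entry per pdf.
  return occs;
}
```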

◆ gamma_s_

Vector<double> gamma_s_

private

gamma_{i}^{(s)}.

Per-speaker counts for each Gaussian. Dimension is [I]. Needed for the R_ stats. This can be viewed as a temporary variable; it does not form part of the stats that we eventually dump to disk.

Definition at line 228 of file estimate-am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Accs::CommitStatsForSpk(), MleAmSgmm2Accs::Read(), and MleAmSgmm2Accs::ResizeAccumulators().
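
The role of gamma_s_ as a temporary can be illustrated with a small sketch: counts are gathered per speaker, folded into the global quadratic stats R_{i} once the speaker is finished, and then cleared. The update used here, R_i += gamma_i^{(s)} v^{(s)} v^{(s)T}, is assumed from the usual SGMM formulation rather than taken from this page. ToySpkStats and its members are hypothetical names that use plain standard-library containers instead of Kaldi matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using ToyMat = std::vector<std::vector<double>>;

// Hypothetical stand-in for the speaker-related part of the accumulators.
struct ToySpkStats {
  std::vector<double> gamma_s;  // Per-speaker counts, dim [I]; temporary.
  std::vector<ToyMat> R;        // Global quadratic stats, dim [I][T][T].

  ToySpkStats(std::size_t num_gauss, std::size_t spk_dim)
      : gamma_s(num_gauss, 0.0),
        R(num_gauss, ToyMat(spk_dim, std::vector<double>(spk_dim, 0.0))) {}

  // Called per frame: add posterior mass for Gaussian i of the current speaker.
  void AddCount(std::size_t i, double count) { gamma_s[i] += count; }

  // Called once per speaker: fold the counts into R, then reset them.
  void CommitSpeaker(const std::vector<double> &v_s) {
    for (std::size_t i = 0; i < gamma_s.size(); ++i) {
      if (gamma_s[i] == 0.0) continue;  // Gaussian unseen for this speaker.
      assert(v_s.size() == R[i].size());
      for (std::size_t r = 0; r < v_s.size(); ++r)
        for (std::size_t c = 0; c < v_s.size(); ++c)
          R[i][r][c] += gamma_s[i] * v_s[r] * v_s[c];
    }
    // gamma_s is only meaningful within one speaker, so clear it.
    gamma_s.assign(gamma_s.size(), 0.0);
  }
};
```

Only R would be serialized in this sketch. The per-speaker counts are back at zero after every commit, which is why writing them to disk would carry no information.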