Class defining the subspace GMM (SGMM) acoustic model. More...

#include <am-sgmm2.h>

Public Member Functions
	AmSgmm2 ()

void	Read (std::istream &is, bool binary)

void	Write (std::ostream &os, bool binary, SgmmWriteFlagsType write_params) const

void	Check (bool show_properties=true)
	Checks the various components for correct sizes. More...

void	InitializeFromFullGmm (const FullGmm &gmm, const std::vector< int32 > &pdf2group, int32 phn_subspace_dim, int32 spk_subspace_dim, bool speaker_dependent_weights, BaseFloat self_weight)
	Initializes the SGMM parameters from a full-covariance UBM. More...

void	CopyGlobalsInitVecs (const AmSgmm2 &other, const std::vector< int32 > &pdf2group, BaseFloat self_weight)
	Copies the global parameters from the supplied model, but sets the state vectors to zero. More...

void	CopyFromSgmm2 (const AmSgmm2 &other, bool copy_normalizers, bool copy_weights)
	Used to copy models (useful in update) More...

BaseFloat	GaussianSelection (const Sgmm2GselectConfig &config, const VectorBase< BaseFloat > &data, std::vector< int32 > *gselect) const
	Computes the top-scoring Gaussian indices (used for pruning of later stages of computation). More...

void	ComputePerFrameVars (const VectorBase< BaseFloat > &data, const std::vector< int32 > &gselect, const Sgmm2PerSpkDerivedVars &spk_vars, Sgmm2PerFrameDerivedVars *per_frame_vars) const
	This needs to be called with each new frame of data, prior to accumulation or likelihood evaluation: it computes various pre-computed quantities. More...

void	ComputePerSpkDerivedVars (Sgmm2PerSpkDerivedVars *vars) const
	Computes the per-speaker derived vars; assumes vars->v_s is already set up. More...

BaseFloat	LogLikelihood (const Sgmm2PerFrameDerivedVars &per_frame_vars, int32 j2, Sgmm2LikelihoodCache *cache, Sgmm2PerSpkDerivedVars *spk_vars, BaseFloat log_prune=0.0) const
	This does a likelihood computation for a given state using the pre-selected Gaussian components (in per_frame_vars). More...

BaseFloat	ComponentPosteriors (const Sgmm2PerFrameDerivedVars &per_frame_vars, int32 j2, Sgmm2PerSpkDerivedVars *spk_vars, Matrix< BaseFloat > *post) const
	Similar to LogLikelihood() function above, but also computes the posterior probabilities for the pre-selected Gaussian components and all substates. More...

void	SplitSubstates (const Vector< BaseFloat > &state_occupancies, const Sgmm2SplitSubstatesConfig &config)
	Increases the total number of substates based on the state occupancies. More...

void	IncreasePhoneSpaceDim (int32 target_dim, const Matrix< BaseFloat > &norm_xform)
	Functions for increasing the phonetic and speaker space dimensions. More...

void	IncreaseSpkSpaceDim (int32 target_dim, const Matrix< BaseFloat > &norm_xform, bool speaker_dependent_weights)
	Increase the subspace dimension for speakers. More...

void	ComputeDerivedVars ()
	Computes (and initializes if necessary) derived vars... More...

void	ComputeNormalizers ()
	Computes the data-independent terms in the log-likelihood computation for each Gaussian component and all substates. More...

void	ComputeWeights ()
	Computes the weights w_jmi_, which are needed for likelihood evaluation with SSGMMs. More...

void	ComputeFmllrPreXform (const Vector< BaseFloat > &pdf_occs, Matrix< BaseFloat > *xform, Matrix< BaseFloat > *inv_xform, Vector< BaseFloat > *diag_mean_scatter) const
	Computes the LDA-like pre-transform and its inverse as well as the eigenvalues of the scatter of the means used in FMLLR estimation. More...

int32	NumPdfs () const
	Various model dimensions. More...

int32	NumGroups () const

int32	Pdf2Group (int32 j2) const

int32	NumSubstatesForPdf (int32 j2) const

int32	NumSubstatesForGroup (int32 j1) const

int32	NumGauss () const

int32	PhoneSpaceDim () const

int32	SpkSpaceDim () const

int32	FeatureDim () const

bool	HasSpeakerDependentWeights () const
	True if doing SSGMM. More...

bool	HasSpeakerSpace () const

void	RemoveSpeakerSpace ()

BaseFloat	GetDjms (int32 j1, int32 m, Sgmm2PerSpkDerivedVars *spk_vars) const

const FullGmm &	full_ubm () const
	Accessors. More...

const DiagGmm &	diag_ubm () const

template<typename Real >
void	GetInvCovars (int32 gauss_index, SpMatrix< Real > *out) const
	Templated accessors (used to accumulate in different precision) More...

template<typename Real >
void	GetSubstateMean (int32 j1, int32 m, int32 i, VectorBase< Real > *mean_out) const

template<typename Real >
void	GetNtransSigmaInv (std::vector< Matrix< Real > > *out) const

template<typename Real >
void	GetSubstateSpeakerMean (int32 j1, int32 substate, int32 gauss, const Sgmm2PerSpkDerivedVars &spk, VectorBase< Real > *mean_out) const

template<typename Real >
void	GetVarScaledSubstateSpeakerMean (int32 j1, int32 substate, int32 gauss, const Sgmm2PerSpkDerivedVars &spk, VectorBase< Real > *mean_out) const

template<class Real >
void	ComputeH (std::vector< SpMatrix< Real > > *H_i) const
	Computes quantities H_i = M_i^T Sigma_i^{-1} M_i. More...

Protected Attributes
std::vector< int32 >	pdf2group_

std::vector< std::vector< int32 > >	group2pdf_

DiagGmm	diag_ubm_
	These contain the "background" model associated with the subspace GMM. More...

FullGmm	full_ubm_

std::vector< SpMatrix< BaseFloat > >	SigmaInv_
	Globally shared parameters of the subspace GMM. More...

std::vector< Matrix< BaseFloat > >	M_
	Phonetic-subspace projections. Dimension is [I][D][S]. More...

std::vector< Matrix< BaseFloat > >	N_
	Speaker-subspace projections. Dimension is [I][D][T]. More...

Matrix< BaseFloat >	w_
	Phonetic-subspace weight projection vectors. Dimension is [I][S]. More...

Matrix< BaseFloat >	u_
	[SSGMM] Speaker-subspace weight projection vectors. Dimension is [I][T] More...

std::vector< Matrix< BaseFloat > >	v_
	The parameters in a particular SGMM state. More...

std::vector< Vector< BaseFloat > >	c_
	c_{jm}, mixture weights. Dimension is [J2][#mix] More...

std::vector< Matrix< BaseFloat > >	n_
	n_{jim}, per-Gaussian normalizer. Dimension is [J1][I][#mix] More...

std::vector< Matrix< BaseFloat > >	w_jmi_
	[SSGMM] w_{jmi}, dimension is [J1][#mix][I]. Computed from w_ and v_. More...

std::vector< Matrix< BaseFloat > >	M_prior_

SpMatrix< BaseFloat >	row_cov_inv_

SpMatrix< BaseFloat >	col_cov_inv_

Private Member Functions
void	ComputeGammaI (const Vector< BaseFloat > &state_occupancies, Vector< BaseFloat > *gamma_i) const
	Computes quasi-occupancies gamma_i from the state-level occupancies, assuming model correctness. More...

void	SplitSubstatesInGroup (const Vector< BaseFloat > &pdf_occupancies, const Sgmm2SplitSubstatesConfig &opts, const SpMatrix< BaseFloat > &sqrt_H_sm, int32 j1, int32 M)
	Called inside SplitSubstates(); splits substates of one group. More...

void	ComputeNormalizersInternal (int32 num_threads, int32 thread, int32 *entropy_count, double *entropy_sum)
	Compute a subset of normalizers; used in multi-threaded implementation. More...

void	ComponentLogLikes (const Sgmm2PerFrameDerivedVars &per_frame_vars, int32 j1, Sgmm2PerSpkDerivedVars *spk_vars, Matrix< BaseFloat > *loglikes) const
	The code below is called internally from LogLikelihood() and ComponentPosteriors(). More...

void	InitializeMw (int32 phn_subspace_dim, const Matrix< BaseFloat > &norm_xform)
	Initializes the matrices M_ and w_. More...

void	InitializeNu (int32 spk_subspace_dim, const Matrix< BaseFloat > &norm_xform, bool speaker_dependent_weights)
	Initializes the matrices N_ and [if speaker_dependent_weights==true] u_. More...

void	InitializeVecsAndSubstateWeights (BaseFloat self_weight)

void	InitializeCovars ()
	initializes the within-class covariances. More...

void	ComputeHsmFromModel (const std::vector< SpMatrix< BaseFloat > > &H, const Vector< BaseFloat > &state_occupancies, SpMatrix< BaseFloat > *H_sm, BaseFloat max_cond) const

void	ComputePdfMappings ()

	KALDI_DISALLOW_COPY_AND_ASSIGN (AmSgmm2)
	maps from each pdf (index j2) to the corresponding group of pdfs (index j1) for SCTM. More...

Friends
class	ComputeNormalizersClass

class	Sgmm2Project

class	EbwAmSgmm2Updater

class	MleAmSgmm2Accs

class	MleAmSgmm2Updater

class	MleSgmm2SpeakerAccs

class	AmSgmm2Functions

class	Sgmm2Feature

Detailed Description

Class defining the subspace GMM (SGMM) acoustic model.

Definition at line 231 of file am-sgmm2.h.

Constructor & Destructor Documentation

◆ AmSgmm2()

AmSgmm2 ( )

inline

Definition at line 233 of file am-sgmm2.h.

 {}

Member Function Documentation

◆ Check()

void Check ( bool show_properties = true )

Checks the various components for correct sizes.

An assertion fails if any size is wrong. When show_properties is true, the dimensions of the various components are logged.

Definition at line 276 of file am-sgmm2.cc.

References rnnlm::i, KALDI_ASSERT, and KALDI_LOG.

Referenced by TestSgmm2IO(), and TestSgmm2Substates().

                                         {
   int32 J1 = NumGroups(),
       J2 = NumPdfs(),
       num_gauss = NumGauss(),
       feat_dim = FeatureDim(),
       phn_dim = PhoneSpaceDim(),
       spk_dim = SpkSpaceDim();
 
   if (show_properties)
     KALDI_LOG << "AmSgmm2: #pdfs = " << J2 << ", #pdf-groups = "
               << J1 << ", #Gaussians = "
               << num_gauss << ", feature dim = " << feat_dim
               << ", phone-space dim =" << phn_dim
               << ", speaker-space dim =" << spk_dim;
   KALDI_ASSERT(J1 > 0 && num_gauss > 0 && feat_dim > 0 && phn_dim > 0
                && J2 > 0 && J2 >= J1);
 
   std::ostringstream debug_str;
 
   // First check the diagonal-covariance UBM.
   KALDI_ASSERT(diag_ubm_.NumGauss() == num_gauss);
   KALDI_ASSERT(diag_ubm_.Dim() == feat_dim);
 
   // Check the full-covariance UBM.
   KALDI_ASSERT(full_ubm_.NumGauss() == num_gauss);
   KALDI_ASSERT(full_ubm_.Dim() == feat_dim);
 
   // Check the globally-shared covariance matrices.
   KALDI_ASSERT(SigmaInv_.size() == static_cast<size_t>(num_gauss));
   for (int32 i = 0; i < num_gauss; i++) {
     KALDI_ASSERT(SigmaInv_[i].NumRows() == feat_dim &&
                  SigmaInv_[i](0, 0) > 0.0);  // or it wouldn't be +ve definite.
   }
 
   if (spk_dim != 0) {
     KALDI_ASSERT(N_.size() == static_cast<size_t>(num_gauss));
     for (int32 i = 0; i < num_gauss; i++)
       KALDI_ASSERT(N_[i].NumRows() == feat_dim && N_[i].NumCols() == spk_dim);
     if (u_.NumRows() == 0) {
       debug_str << "Speaker-weight projections: no.";
     } else {
       KALDI_ASSERT(u_.NumRows() == num_gauss && u_.NumCols() == spk_dim);
       debug_str << "Speaker-weight projections: yes.";
     }
   } else {
     KALDI_ASSERT(N_.size() == 0 && u_.NumRows() == 0);
   }
 
   KALDI_ASSERT(M_.size() == static_cast<size_t>(num_gauss));
   for (int32 i = 0; i < num_gauss; i++) {
     KALDI_ASSERT(M_[i].NumRows() == feat_dim && M_[i].NumCols() == phn_dim);
   }
 
   KALDI_ASSERT(w_.NumRows() == num_gauss && w_.NumCols() == phn_dim);
 
   {  // check v, c.
     KALDI_ASSERT(v_.size() == static_cast<size_t>(J1) &&
                  c_.size() == static_cast<size_t>(J2));
     int32 nSubstatesTot = 0;
     for (int32 j1 = 0; j1 < J1; j1++) {
       int32 M_j = NumSubstatesForGroup(j1);
       nSubstatesTot += M_j;
       KALDI_ASSERT(M_j > 0 && v_[j1].NumRows() == M_j &&
                    v_[j1].NumCols() == phn_dim);
     }
     debug_str << "Substates: "<< (nSubstatesTot) << ".  ";
     int32 nSubstateWeights = 0;
     for (int32 j2 = 0; j2 < J2; j2++) {
       int32 j1 = Pdf2Group(j2);
       int32 M = NumSubstatesForPdf(j2);
       KALDI_ASSERT(M == NumSubstatesForGroup(j1));
       nSubstateWeights += M;
     }
     KALDI_ASSERT(nSubstateWeights >= nSubstatesTot);
     debug_str << "SubstateWeights: "<< (nSubstateWeights) << ".  ";
   }
 
   // check normalizers.
   if (n_.size() == 0) {
     debug_str << "Normalizers: no.  ";
   } else {
     debug_str << "Normalizers: yes.  ";
     KALDI_ASSERT(n_.size() == static_cast<size_t>(J1));
     for (int32 j1 = 0; j1 < J1; j1++) {
       KALDI_ASSERT(n_[j1].NumRows() == num_gauss &&
                    n_[j1].NumCols() == NumSubstatesForGroup(j1));
     }
   }
 
   // check w_jmi_.
   if (w_jmi_.size() == 0) {
     debug_str << "Computed weights: no.  ";
   } else {
     debug_str << "Computed weights: yes.  ";
     KALDI_ASSERT(w_jmi_.size() == static_cast<size_t>(J1));
     for (int32 j1 = 0; j1 < J1; j1++) {
       KALDI_ASSERT(w_jmi_[j1].NumRows() == NumSubstatesForGroup(j1) &&
                    w_jmi_[j1].NumCols() == num_gauss);
     }
   }
 
   if (show_properties)
     KALDI_LOG << "Subspace GMM model properties: " << debug_str.str();
 }

◆ ComponentLogLikes()

void ComponentLogLikes	(	const Sgmm2PerFrameDerivedVars &	per_frame_vars,
		int32	j1,
		Sgmm2PerSpkDerivedVars *	spk_vars,
		Matrix< BaseFloat > *	loglikes
	)		const

inlineprivate

The code below is called internally from LogLikelihood() and ComponentPosteriors().

It computes the per-Gaussian log-likelihoods given each sub-state of the state. Note: the mixture weights are not included at this point.

Definition at line 476 of file am-sgmm2.cc.

References VectorBase< Real >::Add(), VectorBase< Real >::AddMatVec(), VectorBase< Real >::AddVec(), MatrixBase< Real >::AddVecToRows(), VectorBase< Real >::ApplyLog(), Sgmm2PerSpkDerivedVars::b_is, VectorBase< Real >::Dim(), Sgmm2PerFrameDerivedVars::gselect, rnnlm::i, KALDI_ASSERT, kaldi::kNoTrans, Sgmm2PerSpkDerivedVars::log_d_jms, Sgmm2PerFrameDerivedVars::nti, Vector< Real >::Resize(), Matrix< Real >::Resize(), MatrixBase< Real >::Row(), Sgmm2PerSpkDerivedVars::v_s, and Sgmm2PerFrameDerivedVars::zti.

                                                                   {
   const vector<int32> &gselect = per_frame_vars.gselect;
   int32 num_gselect = gselect.size(), num_substates = v_[j1].NumRows();
 
   // Eq.(37): log p(x(t), m, i|j)  [indexed by j, ki]
   // Although the extra memory allocation of storing this as a
   // matrix might seem unnecessary, we save time in the LogSumExp()
   // via more effective pruning.
   loglikes->Resize(num_gselect, num_substates);
   bool speaker_dep_weights =
       (spk_vars->v_s.Dim() != 0 && HasSpeakerDependentWeights());
   if (speaker_dep_weights) {
     KALDI_ASSERT(static_cast<int32>(spk_vars->log_d_jms.size()) == NumGroups());
     // Use && so the message string does not make the assertion always true.
     KALDI_ASSERT(static_cast<int32>(w_jmi_.size()) == NumGroups() &&
                  "You need to call ComputeWeights().");
   }
   for (int32 ki = 0;  ki < num_gselect; ki++) {
     SubVector<BaseFloat> logp_xi(*loglikes, ki);
     int32 i = gselect[ki];
     // for all substates, compute z_{i}^T v_{jm}
     logp_xi.AddMatVec(1.0, v_[j1], kNoTrans, per_frame_vars.zti.Row(ki), 0.0);
     logp_xi.AddVec(1.0, n_[j1].Row(i));  // for all substates, add n_{jim}
     logp_xi.Add(per_frame_vars.nti(ki));  // for all substates, add n_{i}(t)
   }
   if (speaker_dep_weights) { // [SSGMM]
     Vector<BaseFloat> &log_d = spk_vars->log_d_jms[j1];
     if (log_d.Dim() == 0) { // have not yet cached this quantity.
       log_d.Resize(num_substates);
       log_d.AddMatVec(1.0, w_jmi_[j1], kNoTrans, spk_vars->b_is, 0.0);
       log_d.ApplyLog();
     }
     loglikes->AddVecToRows(-1.0, log_d); // [SSGMM] this is the term
     // - log d_{jm}^{(s)} in the likelihood function [eq. 25 in
     // the techreport]
   }
 }
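
The main loop above can be illustrated without Kaldi. The following is a minimal sketch, assuming plain `std::vector` containers in place of the Kaldi matrix classes; the name `ComponentLogLikesSketch` and the `Matrix` alias are hypothetical, and the speaker-dependent term `- log d_{jm}^{(s)}` is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix; a hypothetical stand-in for kaldi::Matrix<BaseFloat>.
using Matrix = std::vector<std::vector<float>>;

// loglikes[ki][m] = z_{i}(t)^T v_{jm} + n_{jim} + n_{i}(t), with i = gselect[ki].
// zti: [num_gselect][S], nti: [num_gselect], v: [num_substates][S],
// n: [num_gauss][num_substates]. Mixture weights are not included.
Matrix ComponentLogLikesSketch(const std::vector<int> &gselect,
                               const Matrix &zti,
                               const std::vector<float> &nti,
                               const Matrix &v,
                               const Matrix &n) {
  assert(zti.size() == gselect.size() && nti.size() == gselect.size());
  const std::size_t num_substates = v.size();
  Matrix loglikes(gselect.size(), std::vector<float>(num_substates, 0.0f));
  for (std::size_t ki = 0; ki < gselect.size(); ki++) {
    const int i = gselect[ki];  // index of the selected Gaussian
    for (std::size_t m = 0; m < num_substates; m++) {
      float dot = 0.0f;  // z_{i}(t)^T v_{jm}
      for (std::size_t s = 0; s < v[m].size(); s++) dot += zti[ki][s] * v[m][s];
      loglikes[ki][m] = dot + n[i][m] + nti[ki];
    }
  }
  return loglikes;
}
```

Rows are indexed by position in the Gaussian-selection list rather than by Gaussian index, which mirrors the `[gaussian-selection index][sub-state index]` layout of `loglikes` in the listing.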

◆ ComponentPosteriors()

BaseFloat ComponentPosteriors	(	const Sgmm2PerFrameDerivedVars &	per_frame_vars,
		int32	j2,
		Sgmm2PerSpkDerivedVars *	spk_vars,
		Matrix< BaseFloat > *	post
	)		const

Similar to LogLikelihood() function above, but also computes the posterior probabilities for the pre-selected Gaussian components and all substates.

Unlike LogLikelihood(), this function does not use caching to share computation across the groups of pdfs. Caching is less necessary here, because this function is mostly called from alignments or from lattices that are quite sparse, so sharing would save little.

Definition at line 574 of file am-sgmm2.cc.

References MatrixBase< Real >::Add(), MatrixBase< Real >::ApplyExp(), KALDI_ASSERT, kaldi::Log(), MatrixBase< Real >::Max(), MatrixBase< Real >::MulColsVec(), MatrixBase< Real >::Scale(), and MatrixBase< Real >::Sum().

Referenced by FmllrSgmm2Accs::Accumulate(), MleAmSgmm2Accs::Accumulate(), MleSgmm2SpeakerAccs::Accumulate(), kaldi::AccumulateForUtterance(), and main().

                                                            {
   KALDI_ASSERT(j2 < NumPdfs() && post != NULL);
   int32 j1 = pdf2group_[j2];
   ComponentLogLikes(per_frame_vars, j1, spk_vars, post); // now
   // post is a matrix of log-likelihoods indexed by [gaussian-selection index]
   // [sub-state index].  It doesn't include the sub-state weights,
   // though.
   BaseFloat loglike = post->Max();
   post->Add(-loglike); // get it to nicer numeric range.
   post->ApplyExp(); // so we're dealing with likelihoods (with an arbitrary offset
   // "loglike" removed to make it in a nice numeric range)
   post->MulColsVec(c_[j2]); // include the sub-state weights.
 
   BaseFloat tot_like = post->Sum();
   KALDI_ASSERT(tot_like != 0.0); // note: not valid to have zero weights.
   loglike += Log(tot_like);
   post->Scale(1.0 / tot_like); // so "post" now sums to one, and "loglike"
   // contains the correct log-likelihood of the data given the pdf.
 
   return loglike;
 }
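
The normalization in this function is a standard log-domain trick: subtract the maximum before exponentiating, apply the sub-state weights, then rescale so the matrix sums to one. A minimal standalone sketch follows, assuming `std::vector` containers; `PosteriorsFromLogLikes` and the `Matrix` alias are hypothetical names, not part of Kaldi.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major matrix; a hypothetical stand-in for kaldi::Matrix<BaseFloat>.
using Matrix = std::vector<std::vector<double>>;

// On input, (*post)[ki][m] holds log-likelihoods without sub-state weights.
// On output, *post holds posteriors that sum to one. The return value is
// log sum_{ki,m} c[m] * exp(input[ki][m]).
double PosteriorsFromLogLikes(const std::vector<double> &c, Matrix *post) {
  assert(post != nullptr && !post->empty() && !(*post)[0].empty());
  double max_loglike = (*post)[0][0];
  for (const auto &row : *post)
    for (double x : row) max_loglike = std::max(max_loglike, x);

  double tot_like = 0.0;
  for (auto &row : *post) {
    assert(row.size() == c.size());
    for (std::size_t m = 0; m < row.size(); m++) {
      // Offset by the maximum so exp() stays in a safe numeric range,
      // then include the sub-state weight for column m.
      row[m] = std::exp(row[m] - max_loglike) * c[m];
      tot_like += row[m];
    }
  }
  assert(tot_like > 0.0);  // zero weights are not valid
  for (auto &row : *post)
    for (double &x : row) x /= tot_like;
  return max_loglike + std::log(tot_like);
}
```

With inputs near -1000, a direct `exp()` would underflow to zero; with the offset, the largest entry becomes `exp(0) = 1` and the total stays positive.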

◆ ComputeDerivedVars()

void ComputeDerivedVars ( )

Computes (and initializes if necessary) derived vars...

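For now these are just the normalizers n_ and the diagonal UBM and, if the u_ matrix is set up, the w_jmi_ quantities. Each is computed only when it is missing (or, for the diagonal UBM, when its size disagrees with the full UBM).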

Definition at line 810 of file am-sgmm2.cc.

Referenced by main(), TestSgmm2AccsIO(), and UnitTestEstimateSgmm2().

                                  {
   if (n_.empty()) ComputeNormalizers();
   if (diag_ubm_.NumGauss() != full_ubm_.NumGauss()
       || diag_ubm_.Dim() != full_ubm_.Dim()) {
     diag_ubm_.CopyFromFullGmm(full_ubm_);
   }
   if (w_jmi_.empty() && HasSpeakerDependentWeights())
     ComputeWeights();
 }

◆ ComputeFmllrPreXform()

void ComputeFmllrPreXform	(	const Vector< BaseFloat > &	pdf_occs,
		Matrix< BaseFloat > *	xform,
		Matrix< BaseFloat > *	inv_xform,
		Vector< BaseFloat > *	diag_mean_scatter
	)		const

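Computes the LDA-like pre-transform and its inverse as well as the eigenvalues of the scatter of the means used in FMLLR estimation. Both xform and inv_xform are resized to dim x (dim + 1), where dim is the feature dimension. In the definition below, the first argument is named state_occs rather than pdf_occs.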

Definition at line 965 of file am-sgmm2.cc.

References SpMatrix< Real >::AddMat2Sp(), MatrixBase< Real >::AddMatMat(), VectorBase< Real >::AddMatVec(), VectorBase< Real >::AddVec(), SpMatrix< Real >::AddVec2(), VectorBase< Real >::ApplyFloor(), VectorBase< Real >::ApplySoftMax(), TpMatrix< Real >::Cholesky(), MatrixBase< Real >::CopyFromMat(), MatrixBase< Real >::CopyFromTp(), VectorBase< Real >::Dim(), SpMatrix< Real >::Eig(), rnnlm::i, TpMatrix< Real >::InvertDouble(), SpMatrix< Real >::InvertDouble(), SpMatrix< Real >::IsDiagonal(), SpMatrix< Real >::IsUnit(), KALDI_ASSERT, KALDI_WARN, kaldi::kNoTrans, kaldi::kSetZero, kaldi::kTrans, kaldi::kUndefined, rnnlm::n, MatrixBase< Real >::Range(), Vector< Real >::Resize(), Matrix< Real >::Resize(), MatrixBase< Real >::Row(), VectorBase< Real >::Scale(), MatrixBase< Real >::SetUnit(), and VectorBase< Real >::Sum().

Referenced by Sgmm2FmllrGlobalParams::Init(), main(), TestSgmm2FmllrAccsIO(), TestSgmm2FmllrSubspace(), and TestSgmm2PreXform().

                                                                               {
   int32 num_pdfs = NumPdfs(),
       num_gauss = NumGauss(),
       dim = FeatureDim();
   KALDI_ASSERT(state_occs.Dim() == num_pdfs);
 
   BaseFloat total_occ = state_occs.Sum();
 
   // Degenerate case: unlikely to ever happen.
   if (total_occ == 0) {
     KALDI_WARN << "Zero probability (computing transform). Using unit "
                << "pre-transform";
     xform->Resize(dim, dim + 1, kUndefined);
     xform->SetUnit();
     inv_xform->Resize(dim, dim + 1, kUndefined);
     inv_xform->SetUnit();
     diag_mean_scatter->Resize(dim, kSetZero);
     return;
   }
 
   // Convert state occupancies to posteriors; Eq. (B.1)
   Vector<BaseFloat> state_posteriors(state_occs);
   state_posteriors.Scale(1/total_occ);
 
   Vector<BaseFloat> mu_jmi(dim), global_mean(dim);
   SpMatrix<BaseFloat> within_class_covar(dim), between_class_covar(dim);
   Vector<BaseFloat> gauss_weight(num_gauss);  // weights for within-class vars.
   Vector<BaseFloat> w_jm(num_gauss);
   for (int32 j1 = 0; j1 < NumGroups(); j1++) {
     const std::vector<int32> &pdfs = group2pdf_[j1];
     int32 M = NumSubstatesForGroup(j1);
     Vector<BaseFloat> substate_weight(M); // total weight for each substate.
     for (size_t i = 0; i < pdfs.size(); i++) {
       int32 j2 = pdfs[i];
       substate_weight.AddVec(state_posteriors(j2), c_[j2]);
     }
     for (int32 m = 0; m < M; m++) {
       BaseFloat this_substate_weight = substate_weight(m);
       // Eq. (7): w_jm = softmax([w_{1}^T ... w_{D}^T] * v_{jm})
       w_jm.AddMatVec(1.0, w_, kNoTrans, v_[j1].Row(m), 0.0);
       w_jm.ApplySoftMax();
 
       for (int32 i = 0; i < num_gauss; i++) {
         BaseFloat weight = this_substate_weight * w_jm(i);
         mu_jmi.AddMatVec(1.0, M_[i], kNoTrans, v_[j1].Row(m), 0.0);  // Eq. (6)
         // Eq. (B.3): \mu_avg = \sum_{jmi} p(j) c_{jm} w_{jmi} \mu_{jmi}
         global_mean.AddVec(weight, mu_jmi);
         // \Sigma_B = \sum_{jmi} p(j) c_{jm} w_{jmi} \mu_{jmi} \mu_{jmi}^T
         between_class_covar.AddVec2(weight, mu_jmi);  // Eq. (B.4)
         gauss_weight(i) += weight;
       }
     }
   }
   between_class_covar.AddVec2(-1.0, global_mean);  // Eq. (B.4)
 
   for (int32 i = 0; i < num_gauss; i++) {
     SpMatrix<BaseFloat> Sigma(SigmaInv_[i]);
     Sigma.InvertDouble();
     // Eq. (B.2): \Sigma_W = \sum_{jmi} p(j) c_{jm} w_{jmi} \Sigma_i
     within_class_covar.AddSp(gauss_weight(i), Sigma);
   }
 
   TpMatrix<BaseFloat> tmpL(dim);
   Matrix<BaseFloat> tmpLInvFull(dim, dim);
   tmpL.Cholesky(within_class_covar);  // \Sigma_W = L L^T
   tmpL.InvertDouble();  // L^{-1}
   tmpLInvFull.CopyFromTp(tmpL);  // get as full matrix.
 
   // B := L^{-1} * \Sigma_B * L^{-T}
   SpMatrix<BaseFloat> tmpB(dim);
   tmpB.AddMat2Sp(1.0, tmpLInvFull, kNoTrans, between_class_covar, 0.0);
 
   Matrix<BaseFloat> U(dim, dim);
   diag_mean_scatter->Resize(dim);
   xform->Resize(dim, dim + 1);
   inv_xform->Resize(dim, dim + 1);
 
   tmpB.Eig(diag_mean_scatter, &U);  // Eq. (B.5): B = U D U^T (B is symmetric)
 
   int32 n;
   diag_mean_scatter->ApplyFloor(1.0e-04, &n);
   if (n != 0)
     KALDI_WARN << "Floored " << n << " elements of the mean-scatter matrix.";
 
   // Eq. (B.6): A_{pre} = U^T * L^{-1}
   SubMatrix<BaseFloat> Apre(*xform, 0, dim, 0, dim);
   Apre.AddMatMat(1.0, U, kTrans, tmpLInvFull, kNoTrans, 0.0);
 
 #ifdef KALDI_PARANOID
   {
     SpMatrix<BaseFloat> tmp(dim);
     tmp.AddMat2Sp(1.0, Apre, kNoTrans, within_class_covar, 0.0);
     KALDI_ASSERT(tmp.IsUnit(0.01));
   }
   {
     SpMatrix<BaseFloat> tmp(dim);
     tmp.AddMat2Sp(1.0, Apre, kNoTrans, between_class_covar, 0.0);
     KALDI_ASSERT(tmp.IsDiagonal(0.01));
   }
 #endif
 
   // Eq. (B.7): b_{pre} = - A_{pre} \mu_{avg}
   Vector<BaseFloat> b_pre(dim);
   b_pre.AddMatVec(-1.0, Apre, kNoTrans, global_mean, 0.0);
   for (int32 r = 0; r < dim; r++) {
     xform->Row(r)(dim) = b_pre(r);  // W_{pre} = [ A_{pre}, b_{pre} ]
   }
 
   // Eq. (B.8) & (B.9): W_{inv} = [ A_{pre}^{-1}, \mu_{avg} ]
   inv_xform->CopyFromMat(*xform);
   inv_xform->Range(0, dim, 0, dim).InvertDouble();
   for (int32 r = 0; r < dim; r++)
     inv_xform->Row(r)(dim) = global_mean(r);
 }  // End of ComputeFmllrPreXform()
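
The equations referenced in the comments of this listing can be collected in one place. This is a summary reconstructed from those comments, not a quotation of the original report; here p(j) is the normalized occupancy from Eq. (B.1), \mu_{jmi} = M_i v_{jm} as in Eq. (6), and the symmetric eigendecomposition is written as U D U^T.

```latex
\begin{align*}
\Sigma_W &= \sum_{j,m,i} p(j)\, c_{jm}\, w_{jmi}\, \Sigma_i && \text{(B.2)} \\
\mu_{avg} &= \sum_{j,m,i} p(j)\, c_{jm}\, w_{jmi}\, \mu_{jmi} && \text{(B.3)} \\
\Sigma_B &= \sum_{j,m,i} p(j)\, c_{jm}\, w_{jmi}\, \mu_{jmi}\mu_{jmi}^T
            - \mu_{avg}\mu_{avg}^T && \text{(B.4)} \\
\Sigma_W &= L L^T, \qquad B = L^{-1} \Sigma_B L^{-T} = U D U^T && \text{(B.5)} \\
A_{pre} &= U^T L^{-1} && \text{(B.6)} \\
b_{pre} &= -A_{pre}\, \mu_{avg} && \text{(B.7)} \\
W_{pre} &= [\, A_{pre},\ b_{pre} \,], \qquad
W_{inv} = [\, A_{pre}^{-1},\ \mu_{avg} \,] && \text{(B.8, B.9)}
\end{align*}
```

Two identities follow directly and correspond to the KALDI_PARANOID checks in the listing: A_{pre} \Sigma_W A_{pre}^T = U^T L^{-1} L L^T L^{-T} U = I, and A_{pre} \Sigma_B A_{pre}^T = U^T B U = D, which is diagonal.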

◆ ComputeGammaI()

void ComputeGammaI	(	const Vector< BaseFloat > &	state_occupancies,
		Vector< BaseFloat > *	gamma_i
	)		const

private

Computes quasi-occupancies gamma_i from the state-level occupancies, assuming model correctness.

Definition at line 50 of file am-sgmm2.cc.

References VectorBase< Real >::AddVec(), VectorBase< Real >::Dim(), rnnlm::i, KALDI_ASSERT, kaldi::kNoTrans, and Vector< Real >::Resize().

                                                               {
   KALDI_ASSERT(state_occupancies.Dim() == NumPdfs());
   Vector<BaseFloat> w_jm(NumGauss());
   gamma_i->Resize(NumGauss());
   for (int32 j1 = 0; j1 < NumGroups(); j1++) {
     int32 M = NumSubstatesForGroup(j1);
     const std::vector<int32> &pdfs = group2pdf_[j1];
     Vector<BaseFloat> substate_weight(M); // total weight for each substate.
     for (size_t i = 0; i < pdfs.size(); i++) {
       int32 j2 = pdfs[i];
       substate_weight.AddVec(state_occupancies(j2), c_[j2]);
     }
     for (int32 m = 0; m < M; m++) {
       w_jm.AddMatVec(1.0, w_, kNoTrans, v_[j1].Row(m), 0.0);
       w_jm.ApplySoftMax();
       gamma_i->AddVec(substate_weight(m), w_jm);
     }
   }
 }
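
The quasi-occupancy computed here is gamma_i = sum over groups j1 and substates m of (sum over pdfs j2 in the group of occ(j2) c_{j2,m}) times w_{j1,m,i}, where w_{j1,m} = softmax(w v_{j1,m}). A minimal standalone sketch follows, assuming `std::vector` containers; `SoftmaxOfProjection`, `GammaISketch`, and the aliases are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major; hypothetical stand-in for kaldi::Matrix

// softmax(w * v): w is [I][S], v is [S]; returns a length-I distribution.
Vec SoftmaxOfProjection(const Mat &w, const Vec &v) {
  assert(!w.empty());
  Vec out(w.size(), 0.0);
  for (std::size_t i = 0; i < w.size(); i++) {
    assert(w[i].size() == v.size());
    for (std::size_t s = 0; s < v.size(); s++) out[i] += w[i][s] * v[s];
  }
  const double max_logit = *std::max_element(out.begin(), out.end());
  double sum = 0.0;
  for (double &x : out) {
    x = std::exp(x - max_logit);
    sum += x;
  }
  for (double &x : out) x /= sum;
  return out;
}

// v[j1] is [M][S] for group j1; c[j2] has length M for the group of pdf j2.
Vec GammaISketch(const Vec &pdf_occs, const std::vector<int> &pdf2group,
                 const std::vector<Vec> &c, const std::vector<Mat> &v,
                 const Mat &w) {
  assert(pdf_occs.size() == pdf2group.size() && c.size() == pdf2group.size());
  Vec gamma(w.size(), 0.0);
  for (std::size_t j1 = 0; j1 < v.size(); j1++) {
    const std::size_t M = v[j1].size();
    Vec substate_weight(M, 0.0);  // total weight for each substate
    for (std::size_t j2 = 0; j2 < pdf2group.size(); j2++) {
      if (pdf2group[j2] != static_cast<int>(j1)) continue;
      assert(c[j2].size() == M);
      for (std::size_t m = 0; m < M; m++)
        substate_weight[m] += pdf_occs[j2] * c[j2][m];
    }
    for (std::size_t m = 0; m < M; m++) {
      const Vec w_jm = SoftmaxOfProjection(w, v[j1][m]);
      for (std::size_t i = 0; i < gamma.size(); i++)
        gamma[i] += substate_weight[m] * w_jm[i];
    }
  }
  return gamma;
}
```

When each c_{j2} sums to one, the entries of gamma sum to the total occupancy, since every softmax also sums to one. This sketch scans pdf2group for each group instead of using a precomputed group2pdf_ map, which is simpler but slower.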

◆ ComputeH()

template<class Real > void ComputeH ( std::vector< SpMatrix< Real > > * H_i ) const

Computes quantities H_i = M_i^T Sigma_i^{-1} M_i. Each H_i is a symmetric matrix whose dimension is the phone-space dimension.

Definition at line 1107 of file am-sgmm2.cc.

References SpMatrix< Real >::AddMat2Sp(), rnnlm::i, KALDI_ASSERT, and kaldi::kTrans.

Referenced by EbwAmSgmm2Updater::Update(), and MleAmSgmm2Updater::Update().

                                                              {
   KALDI_ASSERT(NumGauss() != 0);
   (*H_i).resize(NumGauss());
   SpMatrix<BaseFloat> H_i_tmp(PhoneSpaceDim());
   for (int32 i = 0; i < NumGauss(); i++) {
     (*H_i)[i].Resize(PhoneSpaceDim());
     H_i_tmp.AddMat2Sp(1.0, M_[i], kTrans, SigmaInv_[i], 0.0);
     (*H_i)[i].CopyFromSp(H_i_tmp);
   }
 }
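
For a single Gaussian, the listing forms the product of the transposed projection, the inverse covariance, and the projection, which yields a matrix of phone-space dimension. A minimal sketch with plain loops follows, assuming `std::vector` containers; `ComputeHSketch` and the `Mat` alias are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix; a hypothetical stand-in for the Kaldi matrix classes.
using Mat = std::vector<std::vector<double>>;

// Returns H = M^T * sigma_inv * M.
// M is [D][S] (feature dim by phone-space dim), sigma_inv is [D][D].
Mat ComputeHSketch(const Mat &M, const Mat &sigma_inv) {
  const std::size_t D = M.size();
  assert(D > 0 && sigma_inv.size() == D);
  const std::size_t S = M[0].size();
  Mat H(S, std::vector<double>(S, 0.0));
  for (std::size_t a = 0; a < S; a++)
    for (std::size_t b = 0; b < S; b++)
      for (std::size_t r = 0; r < D; r++)
        for (std::size_t col = 0; col < D; col++)
          H[a][b] += M[r][a] * sigma_inv[r][col] * M[col][b];
  return H;
}
```

The quadruple loop is written for readability only; the listing relies on the library's matrix routine for the same product.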

◆ ComputeHsmFromModel()

void ComputeHsmFromModel	(	const std::vector< SpMatrix< BaseFloat > > &	H,
		const Vector< BaseFloat > &	state_occupancies,
		SpMatrix< BaseFloat > *	H_sm,
		BaseFloat	max_cond
	)		const

private

Definition at line 1260 of file am-sgmm2.cc.

References SpMatrix< Real >::AddSp(), VectorBase< Real >::Dim(), rnnlm::i, KALDI_ASSERT, KALDI_LOG, KALDI_WARN, SpMatrix< Real >::LimitCondDouble(), SpMatrix< Real >::Resize(), PackedMatrix< Real >::Scale(), PackedMatrix< Real >::SetUnit(), and PackedMatrix< Real >::SetZero().

                               {
   int32 num_gauss = NumGauss();
   BaseFloat tot_sum = 0.0;
   KALDI_ASSERT(state_occupancies.Dim() == NumPdfs());
   Vector<BaseFloat> w_jm(num_gauss);
   H_sm->Resize(PhoneSpaceDim());
   H_sm->SetZero();
   Vector<BaseFloat> gamma_i;
   ComputeGammaI(state_occupancies, &gamma_i);
 
   BaseFloat sum = 0.0;
   for (int32 i = 0; i < num_gauss; i++) {
     if (gamma_i(i) > 0) {
       H_sm->AddSp(gamma_i(i), H[i]);
       sum += gamma_i(i);
     }
   }
   if (sum == 0.0) {
     KALDI_WARN << "Sum of counts is zero. ";
     // set to unit matrix--arbitrary non-singular matrix.. won't ever matter.
     H_sm->SetUnit();
   } else {
     H_sm->Scale(1.0 / sum);
     int32 tmp = H_sm->LimitCondDouble(max_cond);
     if (tmp > 0) {
       KALDI_WARN << "Limited " << (tmp) << " eigenvalues of H_sm";
     }
   }
   tot_sum += sum;
 
   KALDI_LOG << "total count is " << tot_sum;
 }
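
The core of this function is an occupancy-weighted average of the per-Gaussian H_i matrices, with a fallback to the unit matrix when all counts are zero. A minimal sketch follows, assuming `std::vector` containers; `AverageHSketch` is a hypothetical name, and the condition-number limiting done by LimitCondDouble() is not reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix; a hypothetical stand-in for kaldi::SpMatrix<BaseFloat>.
using Mat = std::vector<std::vector<double>>;

// H_sm = (sum_i gamma_i H_i) / (sum_i gamma_i), over Gaussians with gamma_i > 0.
// Falls back to the identity when the total count is zero.
Mat AverageHSketch(const std::vector<Mat> &H, const std::vector<double> &gamma) {
  assert(!H.empty() && H.size() == gamma.size());
  const std::size_t S = H[0].size();
  Mat H_sm(S, std::vector<double>(S, 0.0));
  double sum = 0.0;
  for (std::size_t i = 0; i < H.size(); i++) {
    if (gamma[i] <= 0.0) continue;
    assert(H[i].size() == S);
    for (std::size_t a = 0; a < S; a++)
      for (std::size_t b = 0; b < S; b++) H_sm[a][b] += gamma[i] * H[i][a][b];
    sum += gamma[i];
  }
  for (std::size_t a = 0; a < S; a++) {
    for (std::size_t b = 0; b < S; b++) {
      if (sum == 0.0) {
        H_sm[a][b] = (a == b) ? 1.0 : 0.0;  // arbitrary non-singular matrix
      } else {
        H_sm[a][b] /= sum;
      }
    }
  }
  return H_sm;
}
```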

◆ ComputeNormalizers()

void ComputeNormalizers ( )

Computes the data-independent terms in the log-likelihood computation for each Gaussian component and all substates.

Eq. (31)

Definition at line 857 of file am-sgmm2.cc.

References kaldi::Exp(), KALDI_LOG, and kaldi::RunMultiThreaded().

Referenced by main(), TestSgmm2Fmllr(), TestSgmm2Init(), TestSgmm2Substates(), UnitTestEstimateSgmm2(), UnitTestSgmm2(), and EbwAmSgmm2Updater::Update().

                                  {
   KALDI_LOG << "Computing normalizers";
   n_.resize(NumGroups());  // one normalizer matrix per pdf group (index j1), as Check() expects
   int32 entropy_count = 0;
   double entropy_sum = 0.0;
   ComputeNormalizersClass c(this, &entropy_count, &entropy_sum);
   RunMultiThreaded(c);
 
   KALDI_LOG << "Entropy of weights in substates is "
             << (entropy_sum / entropy_count) << " over " << entropy_count
             << " substates, equivalent to perplexity of "
             << (Exp(entropy_sum /entropy_count));
   KALDI_LOG << "Done computing normalizers";
 }
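
The diagnostic logged at the end of this function is the average entropy of the per-substate Gaussian weight distributions, together with the equivalent perplexity exp(average entropy). A minimal sketch of that calculation follows, assuming the log-weights are already normalized per row; `EntropyStats` and `WeightEntropySketch` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct EntropyStats {
  double avg_entropy;  // average entropy per substate, in nats
  double perplexity;   // exp(avg_entropy)
};

// log_w[m][i] is the log of the weight of Gaussian i in substate m;
// each row is assumed to be a normalized distribution in the log domain.
EntropyStats WeightEntropySketch(const std::vector<std::vector<double>> &log_w) {
  assert(!log_w.empty());
  double entropy_sum = 0.0;
  for (const auto &row : log_w)
    for (double lw : row) entropy_sum -= lw * std::exp(lw);  // -sum p log p
  const double avg = entropy_sum / static_cast<double>(log_w.size());
  return {avg, std::exp(avg)};
}
```

A perplexity close to the number of Gaussians means the weights are nearly uniform; a value close to one means each substate concentrates its weight on very few Gaussians.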

◆ ComputeNormalizersInternal()

void ComputeNormalizersInternal	(	int32	num_threads,
		int32	thread,
		int32 *	entropy_count,
		double *	entropy_sum
	)

private

Compute a subset of normalizers; used in multi-threaded implementation.

Definition at line 873 of file am-sgmm2.cc.

References MatrixBase< Real >::AddMatMat(), MatrixBase< Real >::AddMatSp(), kaldi::Exp(), rnnlm::i, KALDI_ISFINITE, KALDI_LOG, KALDI_WARN, kaldi::kNoTrans, kaldi::kTrans, kaldi::Log(), M_PI, MatrixBase< Real >::Row(), and kaldi::VecVec().

                                                               {
 
   BaseFloat DLog2pi = FeatureDim() * Log(2 * M_PI);
   Vector<BaseFloat> log_det_Sigma(NumGauss());
 
   for (int32 i = 0; i < NumGauss(); i++) {
     try {
       log_det_Sigma(i) = - SigmaInv_[i].LogPosDefDet();
     } catch(...) {
       if (thread == 0) // just for one thread, print errors [else, duplicates]
         KALDI_WARN << "Covariance is not positive definite, setting to unit";
       SigmaInv_[i].SetUnit();
       log_det_Sigma(i) = 0.0;
     }
   }
 
   int32 J1 = NumGroups();
 
   int block_size = (NumPdfs() + num_threads-1) / num_threads;
   int j_start = thread * block_size, j_end = std::min(J1, j_start + block_size);
 
   int32 I = NumGauss();
   for (int32 j1 = j_start; j1 < j_end; j1++) {
     int32 M = NumSubstatesForGroup(j1);
     Matrix<BaseFloat> log_w_jm(M, I);
     n_[j1].Resize(I, M);
     Matrix<BaseFloat> mu_jmi(M, FeatureDim());
     Matrix<BaseFloat> SigmaInv_mu(M, FeatureDim());
 
     // (in logs): w_jm = softmax([w_{k1}^T ... w_{kD}^T] * v_{jkm}) eq.(7)
     log_w_jm.AddMatMat(1.0, v_[j1], kNoTrans, w_, kTrans, 0.0);
     for (int32 m = 0; m < M; m++) {
       log_w_jm.Row(m).Add(-1.0 * log_w_jm.Row(m).LogSumExp());
       {  // DIAGNOSTIC CODE
         (*entropy_count)++;
         for (int32 i = 0; i < NumGauss(); i++) {
           (*entropy_sum) -= log_w_jm(m, i) * Exp(log_w_jm(m, i));
         }
       }
     }
 
     for (int32 i = 0; i < I; i++) {
       // mu_jmi = M_{i} * v_{jm}
       mu_jmi.AddMatMat(1.0, v_[j1], kNoTrans, M_[i], kTrans, 0.0);
       SigmaInv_mu.AddMatSp(1.0, mu_jmi, kNoTrans, SigmaInv_[i], 0.0);
 
       for (int32 m = 0; m < M; m++) {
         // mu_{jmi} * \Sigma_{i}^{-1} * mu_{jmi}
         BaseFloat mu_SigmaInv_mu = VecVec(mu_jmi.Row(m), SigmaInv_mu.Row(m));
         // Previously had:
         // BaseFloat logc = log(c_[j](m));
         // but because of SCTM aspect, we can't include the sub-state mixture weights
         // at this point [included later on.]
 
         // eq.(31)
         n_[j1](i, m) = log_w_jm(m, i) - 0.5 * (log_det_Sigma(i) + DLog2pi
             + mu_SigmaInv_mu);
         {  // Mainly diagnostic code.  Not necessary.
           BaseFloat tmp = n_[j1](i, m);
           if (!KALDI_ISFINITE(tmp)) {  // NaN or inf
             KALDI_LOG << "Warning: normalizer for j1 = " << j1 << ", m = " << m
                       << ", i = " << i << " is infinite or NaN " << tmp << "= "
                       << log_w_jm(m, i) << "+"
                       << (-0.5 * log_det_Sigma(i)) << "+" << (-0.5 * DLog2pi)
                       << "+" << (-0.5 * mu_SigmaInv_mu) << ", setting to finite.";
             n_[j1](i, m) = -1.0e+40;  // future work(arnab): get rid of magic number
           }
         }
       }
     }
   }
 }
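
Each thread in this function handles one contiguous block of group indices, using a ceiling division to size the blocks. A minimal sketch of that partitioning follows. It assumes a single item count for both the block size and the upper limit (the listing sizes blocks from NumPdfs() but caps the range at the number of groups), and it clamps the start index as well; `BlockRange` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Returns the half-open range [start, end) of items handled by `thread`
// when `num_items` items are split into contiguous blocks over `num_threads`.
std::pair<int, int> BlockRange(int num_items, int num_threads, int thread) {
  assert(num_items >= 0 && num_threads > 0);
  assert(thread >= 0 && thread < num_threads);
  const int block_size = (num_items + num_threads - 1) / num_threads;  // ceil
  const int start = std::min(num_items, thread * block_size);
  const int end = std::min(num_items, start + block_size);
  return std::make_pair(start, end);
}
```

With more threads than items, the trailing threads receive empty ranges, so they do no work.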

◆ ComputePdfMappings()

void ComputePdfMappings ( )

private

Definition at line 72 of file am-sgmm2.cc.

References KALDI_ASSERT, and KALDI_WARN.

                                  {
   if (pdf2group_.empty()) {
     KALDI_WARN << "ComputePdfMappings(): no pdf2group_ map, assuming you "
         "are reading in old model.";
     KALDI_ASSERT(v_.size() != 0);
     pdf2group_.resize(v_.size());
     for (int32 j2 = 0; j2 < static_cast<int32>(pdf2group_.size()); j2++)
       pdf2group_[j2] = j2;
   }
   group2pdf_.clear();
   for (int32 j2 = 0; j2 < static_cast<int32>(pdf2group_.size()); j2++) {
     int32 j1 = pdf2group_[j2];
     if (group2pdf_.size() <= j1) group2pdf_.resize(j1+1);
     group2pdf_[j1].push_back(j2);
   }
 }
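
The second half of this function inverts the pdf-to-group map into a group-to-pdfs map. A minimal standalone sketch follows, assuming non-negative group indices; `InvertPdf2Group` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given pdf2group[j2] = j1, returns group2pdf where group2pdf[j1] lists,
// in increasing order, every pdf index j2 that maps to group j1.
std::vector<std::vector<int>> InvertPdf2Group(const std::vector<int> &pdf2group) {
  std::vector<std::vector<int>> group2pdf;
  for (std::size_t j2 = 0; j2 < pdf2group.size(); j2++) {
    const int j1 = pdf2group[j2];
    assert(j1 >= 0);
    // Grow the outer vector on demand so that index j1 is valid.
    if (group2pdf.size() <= static_cast<std::size_t>(j1))
      group2pdf.resize(static_cast<std::size_t>(j1) + 1);
    group2pdf[j1].push_back(static_cast<int>(j2));
  }
  return group2pdf;
}
```

A group index that never appears in pdf2group but is smaller than the largest one ends up with an empty list.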

◆ ComputePerFrameVars()

void ComputePerFrameVars	(	const VectorBase< BaseFloat > &	data,
		const std::vector< int32 > &	gselect,
		const Sgmm2PerSpkDerivedVars &	spk_vars,
		Sgmm2PerFrameDerivedVars *	per_frame_vars
	)		const

This needs to be called with each new frame of data, prior to accumulation or likelihood evaluation: it computes various pre-computed quantities.

Definition at line 442 of file am-sgmm2.cc.

References VectorBase< Real >::AddSpVec(), Sgmm2PerFrameDerivedVars::gselect, rnnlm::i, KALDI_ASSERT, kaldi::kTrans, Sgmm2PerSpkDerivedVars::log_b_is, Sgmm2PerFrameDerivedVars::nti, Sgmm2PerSpkDerivedVars::o_s, Sgmm2PerFrameDerivedVars::Resize(), MatrixBase< Real >::Row(), Sgmm2PerSpkDerivedVars::v_s, kaldi::VecVec(), Sgmm2PerFrameDerivedVars::xt, Sgmm2PerFrameDerivedVars::xti, and Sgmm2PerFrameDerivedVars::zti.

Referenced by kaldi::AccumulateForUtterance(), DecodableAmSgmm2::LogLikelihoodForPdf(), main(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2FmllrSubspace(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), and TestSgmm2Substates().

                                                                                  {
   KALDI_ASSERT(!n_.empty() && "ComputeNormalizers() must be called.");
 
   per_frame_vars->Resize(gselect.size(), FeatureDim(), PhoneSpaceDim());
 
   per_frame_vars->gselect = gselect;
   per_frame_vars->xt.CopyFromVec(data);
 
   for (int32 ki = 0, last = gselect.size(); ki < last; ki++) {
     int32 i = gselect[ki];
     per_frame_vars->xti.Row(ki).CopyFromVec(per_frame_vars->xt);
     if (spk_vars.v_s.Dim() != 0)
       per_frame_vars->xti.Row(ki).AddVec(-1.0, spk_vars.o_s.Row(i));
   }
   Vector<BaseFloat> SigmaInv_xt(FeatureDim());
 
   bool speaker_dep_weights =
       (spk_vars.v_s.Dim() != 0 && HasSpeakerDependentWeights());
   for (int32 ki = 0, last = gselect.size(); ki < last; ki++) {
     int32 i = gselect[ki];
     BaseFloat ssgmm_term = (speaker_dep_weights ? spk_vars.log_b_is(i) : 0.0);
     SigmaInv_xt.AddSpVec(1.0, SigmaInv_[i], per_frame_vars->xti.Row(ki), 0.0);
     // Eq (35): z_{i}(t) = M_{i}^{T} \Sigma_{i}^{-1} x_{i}(t)
     per_frame_vars->zti.Row(ki).AddMatVec(1.0, M_[i], kTrans, SigmaInv_xt, 0.0);
     // Eq.(36): n_{i}(t) = -0.5 x_{i}(t)^{T} \Sigma_{i}^{-1} x_{i}(t), plus log b_{i}^{(s)} for SSGMM
     per_frame_vars->nti(ki) = -0.5 * VecVec(per_frame_vars->xti.Row(ki),
                                             SigmaInv_xt) + ssgmm_term;
   }
 }
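
The per-Gaussian quantities of Eq. (35) and Eq. (36) can be sketched without the Kaldi matrix classes. The code below is a standalone illustration under stated assumptions: dense row-major std::vector matrices stand in for Kaldi's Matrix and SpMatrix, and the names PerGaussFrameVars and ComputeFrameVarsForGauss are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;  // row-major dense matrix

struct PerGaussFrameVars {
  Vec z;     // z_i(t) = M_i^T Sigma_i^{-1} x_i(t), length S
  double n;  // n_i(t) = -0.5 x_i(t)^T Sigma_i^{-1} x_i(t) + log_b
};

// Hypothetical helper (not Kaldi API). sigma_inv is D x D, M is D x S,
// x and o are length D; log_b is the SSGMM term (0.0 if unused).
PerGaussFrameVars ComputeFrameVarsForGauss(const Vec &x, const Vec &o,
                                           const Mat &sigma_inv, const Mat &M,
                                           double log_b) {
  const std::size_t D = x.size();
  assert(D > 0 && o.size() == D && sigma_inv.size() == D && M.size() == D);
  const std::size_t S = M[0].size();

  Vec xi(D), sigma_inv_xi(D, 0.0);
  for (std::size_t d = 0; d < D; d++) xi[d] = x[d] - o[d];  // speaker offset
  for (std::size_t r = 0; r < D; r++)
    for (std::size_t c = 0; c < D; c++)
      sigma_inv_xi[r] += sigma_inv[r][c] * xi[c];

  PerGaussFrameVars out;
  out.z.assign(S, 0.0);
  out.n = log_b;
  for (std::size_t d = 0; d < D; d++) {
    out.n -= 0.5 * xi[d] * sigma_inv_xi[d];
    for (std::size_t s = 0; s < S; s++)
      out.z[s] += M[d][s] * sigma_inv_xi[d];  // M^T times Sigma^{-1} x_i
  }
  return out;
}
```

The product Sigma_i^{-1} x_i(t) is computed once and reused for both z_i(t) and n_i(t), which is the same reuse the listing achieves with SigmaInv_xt.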

◆ ComputePerSpkDerivedVars()

void ComputePerSpkDerivedVars ( Sgmm2PerSpkDerivedVars * vars ) const

Computes the per-speaker derived vars; assumes vars->v_s is already set up.

Definition at line 1369 of file am-sgmm2.cc.

References Sgmm2PerSpkDerivedVars::b_is, Sgmm2PerSpkDerivedVars::Clear(), rnnlm::i, KALDI_ASSERT, KALDI_WARN, kaldi::kNoTrans, Sgmm2PerSpkDerivedVars::log_b_is, Sgmm2PerSpkDerivedVars::log_d_jms, Sgmm2PerSpkDerivedVars::o_s, Matrix< Real >::Resize(), MatrixBase< Real >::Row(), and Sgmm2PerSpkDerivedVars::v_s.

Referenced by main(), and kaldi::ProcessUtterance().

                                                                          {
   KALDI_ASSERT(vars != NULL);
   if (vars->v_s.Dim() != 0) {
     KALDI_ASSERT(vars->v_s.Dim() == SpkSpaceDim());
     vars->o_s.Resize(NumGauss(), FeatureDim());
     int32 num_gauss = NumGauss();
     // first compute the o_i^{(s)} quantities.
     for (int32 i = 0; i < num_gauss; i++) {
        // Eqn. (32): o_i^{(s)} = N_i v^{(s)}
       vars->o_s.Row(i).AddMatVec(1.0, N_[i], kNoTrans, vars->v_s, 0.0);
     }
     // the rest relates to the SSGMM.  We only need to do this
     // if we're using speaker-dependent weights.
     if (HasSpeakerDependentWeights()) {
       vars->log_d_jms.clear();
       vars->log_d_jms.resize(NumGroups());
       vars->log_b_is.Resize(NumGauss());
       vars->log_b_is.AddMatVec(1.0, u_, kNoTrans, vars->v_s, 0.0);
       vars->b_is.Resize(NumGauss());
       vars->b_is.CopyFromVec(vars->log_b_is);
       vars->b_is.ApplyExp();
       for (int32 i = 0; i < vars->b_is.Dim(); i++) {
         if (vars->b_is(i) - vars->b_is(i) != 0.0) { // NaN.
           vars->b_is(i) = 1.0;
           KALDI_WARN << "Set NaN in b_is to 1.0";
         }
       }
     } else {
       vars->b_is.Resize(0);
       vars->log_b_is.Resize(0);
       vars->log_d_jms.resize(0);
     }
   } else {
     vars->Clear(); // make sure everything is cleared.
   }
 }

◆ ComputeWeights()

void ComputeWeights ( )

Computes the weights w_jmi_, which are needed for likelihood evaluation with SSGMMs.

Definition at line 796 of file am-sgmm2.cc.

References rnnlm::i, kaldi::kNoTrans, and kaldi::kTrans.

Referenced by TestSgmm2Init(), and TestSgmm2Substates().

                              {
   int32 J1 = NumGroups();
   w_jmi_.resize(J1);
   int32 i = NumGauss();
   for (int32 j1 = 0; j1 < J1; j1++) {
     int32 M = NumSubstatesForGroup(j1);
     w_jmi_[j1].Resize(M, i);
     w_jmi_[j1].AddMatMat(1.0, v_[j1], kNoTrans, w_, kTrans, 0.0);
     // now w_jmi_ contains un-normalized log weights.
     for (int32 m = 0; m < M; m++)
       w_jmi_[j1].Row(m).ApplySoftMax(); // get the actual weights.
   }
 }
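
The weight computation amounts to a row-wise softmax over the dot products between sub-state vectors and weight projections. The following standalone sketch illustrates that idea; it is not the Kaldi code. The function name SubstateWeights is hypothetical, and subtracting the row maximum before exponentiating is a common stability measure assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Kaldi API): v is M x S (sub-state vectors of one
// group), w is I x S (weight projections). Returns an M x I matrix whose
// row m holds softmax_i(w_i . v_m).
std::vector<std::vector<double> > SubstateWeights(
    const std::vector<std::vector<double> > &v,
    const std::vector<std::vector<double> > &w) {
  assert(!w.empty());
  std::vector<std::vector<double> > out(v.size(),
                                        std::vector<double>(w.size(), 0.0));
  for (std::size_t m = 0; m < v.size(); m++) {
    for (std::size_t i = 0; i < w.size(); i++) {
      assert(w[i].size() == v[m].size());
      for (std::size_t s = 0; s < v[m].size(); s++)
        out[m][i] += w[i][s] * v[m][s];  // un-normalized log weight
    }
    // Softmax with the row maximum subtracted for numerical stability.
    double max_logit = *std::max_element(out[m].begin(), out[m].end());
    double sum = 0.0;
    for (std::size_t i = 0; i < w.size(); i++) {
      out[m][i] = std::exp(out[m][i] - max_logit);
      sum += out[m][i];
    }
    for (std::size_t i = 0; i < w.size(); i++) out[m][i] /= sum;
  }
  return out;
}
```

Each output row sums to one, so it can be used directly as a set of mixture weights over the I Gaussians.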

◆ CopyFromSgmm2()

void CopyFromSgmm2	(	const AmSgmm2 &	other,
		bool	copy_normalizers,
		bool	copy_weights
	)

Used to copy models (useful in update)

Definition at line 415 of file am-sgmm2.cc.

References AmSgmm2::c_, AmSgmm2::diag_ubm_, AmSgmm2::full_ubm_, AmSgmm2::group2pdf_, KALDI_LOG, AmSgmm2::M_, AmSgmm2::N_, AmSgmm2::n_, AmSgmm2::pdf2group_, AmSgmm2::SigmaInv_, AmSgmm2::u_, AmSgmm2::v_, AmSgmm2::w_, and AmSgmm2::w_jmi_.

Referenced by TestSgmm2AccsIO(), TestSgmm2Init(), and TestSgmm2Substates().

                                              {
   KALDI_LOG << "Copying AmSgmm2";
   pdf2group_ = other.pdf2group_;
   group2pdf_ = other.group2pdf_;
 
   // Copy background GMMs
   diag_ubm_.CopyFromDiagGmm(other.diag_ubm_);
   full_ubm_.CopyFromFullGmm(other.full_ubm_);
 
   // Copy global params
   SigmaInv_ = other.SigmaInv_;
   M_ = other.M_;
   w_ = other.w_;
   N_ = other.N_;
   u_ = other.u_;
 
   // Copy state-specific params, but only copy normalizers if requested.
   v_ = other.v_;
   c_ = other.c_;
   if (copy_normalizers) n_ = other.n_;
   if (copy_weights) w_jmi_ = other.w_jmi_;
 
   KALDI_LOG << "Done.";
 }

◆ CopyGlobalsInitVecs()

void CopyGlobalsInitVecs	(	const AmSgmm2 &	other,
		const std::vector< int32 > &	pdf2group,
		BaseFloat	self_weight
	)

Copies the global parameters from the supplied model, but sets the state vectors to zero.

Definition at line 1183 of file am-sgmm2.cc.

References AmSgmm2::diag_ubm_, AmSgmm2::full_ubm_, KALDI_LOG, AmSgmm2::M_, AmSgmm2::N_, AmSgmm2::SigmaInv_, AmSgmm2::u_, and AmSgmm2::w_.

Referenced by main().

                                                          {
   KALDI_LOG << "Initializing model";
   pdf2group_ = pdf2group;
   ComputePdfMappings();
 
   // Copy background GMMs
   diag_ubm_.CopyFromDiagGmm(other.diag_ubm_);
   full_ubm_.CopyFromFullGmm(other.full_ubm_);
 
   // Copy global params
   SigmaInv_ = other.SigmaInv_;
 
   M_ = other.M_;
   w_ = other.w_;
   u_ = other.u_;
   N_ = other.N_;
 
   InitializeVecsAndSubstateWeights(self_weight);
 }

◆ diag_ubm()

const DiagGmm& diag_ubm ( ) const

inline

Definition at line 379 of file am-sgmm2.h.

References rnnlm::i.

{ return diag_ubm_; }

kaldi::AmSgmm2::diag_ubm_

DiagGmm diag_ubm_

These contain the "background" model associated with the subspace GMM.

Definition: am-sgmm2.h:413

◆ FeatureDim()

int32 FeatureDim ( ) const

inline

Definition at line 363 of file am-sgmm2.h.

Referenced by FmllrSgmm2Accs::AccumulateForFmllrSubspace(), MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), Sgmm2Project::ApplyProjection(), kaldi::CalcFmllrStepSize(), MleAmSgmm2Accs::Check(), MleAmSgmm2Updater::ComputeMPrior(), Sgmm2Project::ComputeProjection(), FmllrSgmm2Accs::FmllrObjGradient(), main(), MleAmSgmm2Updater::MapUpdateM(), MleAmSgmm2Accs::ResizeAccumulators(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2FmllrSubspace(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), TestSgmm2Substates(), and EbwAmSgmm2Updater::UpdateM().

{ return M_[0].NumRows(); }

kaldi::AmSgmm2::M_

std::vector< Matrix< BaseFloat > > M_

Phonetic-subspace projections. Dimension is [I][D][S].

Definition: am-sgmm2.h:425

◆ full_ubm()

const FullGmm& full_ubm ( ) const

inline

Accessors.

Definition at line 378 of file am-sgmm2.h.

Referenced by Sgmm2Project::ComputeProjection(), main(), TestSgmm2IncreaseDim(), and TestSgmm2Init().

{ return full_ubm_; }

kaldi::AmSgmm2::full_ubm_

FullGmm full_ubm_

Definition: am-sgmm2.h:414

◆ GaussianSelection()

BaseFloat GaussianSelection	(	const Sgmm2GselectConfig &	config,
		const VectorBase< BaseFloat > &	data,
		std::vector< int32 > *	gselect
	)		const

Computes the top-scoring Gaussian indices (used for pruning of later stages of computation).

Returns frame log-likelihood given selected Gaussians from full UBM.

Definition at line 1406 of file am-sgmm2.cc.

References VectorBase< Real >::Data(), Sgmm2GselectConfig::diag_gmm_nbest, VectorBase< Real >::Dim(), Sgmm2GselectConfig::full_gmm_nbest, rnnlm::i, and KALDI_ASSERT.

Referenced by main(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2FmllrSubspace(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), and TestSgmm2Substates().

                                                                      {
   KALDI_ASSERT(diag_ubm_.NumGauss() != 0 &&
                diag_ubm_.NumGauss() == full_ubm_.NumGauss() &&
                diag_ubm_.Dim() == data.Dim());
   KALDI_ASSERT(config.diag_gmm_nbest > 0 && config.full_gmm_nbest > 0 &&
                config.full_gmm_nbest < config.diag_gmm_nbest);
   int32 num_gauss = diag_ubm_.NumGauss();
 
   std::vector< std::pair<BaseFloat, int32> > pruned_pairs;
   if (config.diag_gmm_nbest < num_gauss) {
     Vector<BaseFloat> loglikes(num_gauss);
     diag_ubm_.LogLikelihoods(data, &loglikes);
     Vector<BaseFloat> loglikes_copy(loglikes);
     BaseFloat *ptr = loglikes_copy.Data();
     std::nth_element(ptr, ptr+num_gauss-config.diag_gmm_nbest, ptr+num_gauss);
     BaseFloat thresh = ptr[num_gauss-config.diag_gmm_nbest];
     for (int32 g = 0; g < num_gauss; g++)
       if (loglikes(g) >= thresh)  // met threshold for diagonal phase.
         pruned_pairs.push_back(
             std::make_pair(full_ubm_.ComponentLogLikelihood(data, g), g));
   } else {
     Vector<BaseFloat> loglikes(num_gauss);
     full_ubm_.LogLikelihoods(data, &loglikes);
     for (int32 g = 0; g < num_gauss; g++)
       pruned_pairs.push_back(std::make_pair(loglikes(g), g));
   }
   KALDI_ASSERT(!pruned_pairs.empty());
   if (pruned_pairs.size() > static_cast<size_t>(config.full_gmm_nbest)) {
     std::nth_element(pruned_pairs.begin(),
                      pruned_pairs.end() - config.full_gmm_nbest,
                      pruned_pairs.end());
     pruned_pairs.erase(pruned_pairs.begin(),
                        pruned_pairs.end() - config.full_gmm_nbest);
   }
   Vector<BaseFloat> loglikes_tmp(pruned_pairs.size());  // for return value.
   KALDI_ASSERT(gselect != NULL);
   gselect->resize(pruned_pairs.size());
   // Make sure pruned Gaussians appear from best to worst.
   std::sort(pruned_pairs.begin(), pruned_pairs.end(),
             std::greater< std::pair<BaseFloat, int32> >());
   for (size_t i = 0; i < pruned_pairs.size(); i++) {
     loglikes_tmp(i) = pruned_pairs[i].first;
     (*gselect)[i] = pruned_pairs[i].second;
   }
   return loglikes_tmp.LogSumExp();
 }
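
The two-stage pruning can be sketched on precomputed scores. The code below is a standalone illustration, not the Kaldi implementation: the names GselectResult and TwoStageSelect are hypothetical, and both score vectors are passed in directly, whereas the real method evaluates full-covariance scores only for Gaussians that survive the diagonal stage.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

struct GselectResult {
  std::vector<int> gselect;  // selected indices, best first
  double log_like;           // log-sum-exp of the selected full-cov scores
};

// Hypothetical helper (not Kaldi API) operating on precomputed scores.
GselectResult TwoStageSelect(const std::vector<double> &diag_loglikes,
                             const std::vector<double> &full_loglikes,
                             int diag_nbest, int full_nbest) {
  const int num_gauss = static_cast<int>(diag_loglikes.size());
  assert(num_gauss > 0 && full_loglikes.size() == diag_loglikes.size());
  assert(diag_nbest > 0 && full_nbest > 0);

  std::vector<std::pair<double, int> > pairs;
  if (diag_nbest < num_gauss) {
    // Stage 1: threshold at the diag_nbest-th largest diagonal score.
    std::vector<double> copy(diag_loglikes);
    std::nth_element(copy.begin(), copy.end() - diag_nbest, copy.end());
    double thresh = copy[num_gauss - diag_nbest];
    for (int g = 0; g < num_gauss; g++)
      if (diag_loglikes[g] >= thresh)
        pairs.push_back(std::make_pair(full_loglikes[g], g));
  } else {
    for (int g = 0; g < num_gauss; g++)
      pairs.push_back(std::make_pair(full_loglikes[g], g));
  }
  // Stage 2: keep the full_nbest best full-covariance scores.
  if (pairs.size() > static_cast<std::size_t>(full_nbest)) {
    std::nth_element(pairs.begin(), pairs.end() - full_nbest, pairs.end());
    pairs.erase(pairs.begin(), pairs.end() - full_nbest);
  }
  std::sort(pairs.begin(), pairs.end(),
            std::greater<std::pair<double, int> >());

  GselectResult result;
  double max_score = pairs.front().first, sum = 0.0;
  for (std::size_t k = 0; k < pairs.size(); k++) {
    result.gselect.push_back(pairs[k].second);
    sum += std::exp(pairs[k].first - max_score);
  }
  result.log_like = max_score + std::log(sum);
  return result;
}
```

A Gaussian with a high full-covariance score can still be dropped if its diagonal score falls below the stage-1 threshold, which is the price paid for the cheaper first pass.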

◆ GetDjms()

BaseFloat GetDjms	(	int32	j1,
		int32	m,
		Sgmm2PerSpkDerivedVars *	spk_vars
	)		const

Definition at line 948 of file am-sgmm2.cc.

References Sgmm2PerSpkDerivedVars::b_is, kaldi::Exp(), KALDI_ASSERT, kaldi::kNoTrans, and Sgmm2PerSpkDerivedVars::log_d_jms.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), and MleSgmm2SpeakerAccs::AccumulateFromPosteriors().

                                                                   {
   // This relates to SSGMMs (speaker-dependent weights).
   if (spk_vars->log_d_jms.empty()) return -1; // this would be
   // because we don't have speaker-dependent weights ("u" not set up).
 
   KALDI_ASSERT(!w_jmi_.empty() && "You need to call ComputeWeights() on SGMM.");
   Vector<BaseFloat> &log_d = spk_vars->log_d_jms[j1];
   if (log_d.Dim() == 0) {
     log_d.Resize(NumSubstatesForGroup(j1));
     log_d.AddMatVec(1.0, w_jmi_[j1], kNoTrans, spk_vars->b_is, 0.0);
     log_d.ApplyLog();
   }
   return Exp(log_d(m));
 }
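
The quantity returned by GetDjms() is, per the listing, the dot product of one row of w_jmi_ with the speaker factors b_is. A minimal standalone sketch of that sum follows; the function name SubstateSpeakerNormalizer is hypothetical and the lazy caching of log_d_jms is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Kaldi API): w_jm holds the weights w_{jmi} of one
// sub-state over all I Gaussians; b holds the speaker factors b_i^{(s)}.
// Returns d_{jm}^{(s)} = sum_i w_{jmi} b_i^{(s)}.
double SubstateSpeakerNormalizer(const std::vector<double> &w_jm,
                                 const std::vector<double> &b) {
  assert(w_jm.size() == b.size());
  double d = 0.0;
  for (std::size_t i = 0; i < b.size(); i++) d += w_jm[i] * b[i];
  return d;
}
```

When every b_i^{(s)} equals one (no speaker effect on the weights), the result is simply the sum of the weights, which is one for normalized weights.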

◆ GetInvCovars()

void GetInvCovars	(	int32	gauss_index,
		SpMatrix< Real > *	out
	)		const

inline

Templated accessors (used to accumulate in different precision)

Definition at line 511 of file am-sgmm2.h.

References SpMatrix< Real >::CopyFromSp(), kaldi::kUndefined, and SpMatrix< Real >::Resize().

Referenced by kaldi::CalcFmllrStepSize(), and FmllrSgmm2Accs::FmllrObjGradient().

                                                              {
   out->Resize(SigmaInv_[gauss_index].NumRows(), kUndefined);
   out->CopyFromSp(SigmaInv_[gauss_index]);
 }

◆ GetNtransSigmaInv()

template<typename Real> void GetNtransSigmaInv ( std::vector< Matrix< Real > > * out ) const

Definition at line 1084 of file am-sgmm2.cc.

References MatrixBase< Real >::CopyFromMat(), MatrixBase< Real >::CopyFromSp(), rnnlm::i, KALDI_ASSERT, kaldi::kNoTrans, and kaldi::kTrans.

Referenced by MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs().

                                                                  {
   KALDI_ASSERT(SpkSpaceDim() > 0 &&
       "Cannot compute N^{T} \\Sigma_{i}^{-1} without speaker projections.");
   out->resize(NumGauss());
   Matrix<Real> tmpcov(FeatureDim(), FeatureDim());
   Matrix<Real> tmp_n(FeatureDim(), SpkSpaceDim());
   for (int32 i = 0; i < NumGauss(); i++) {
     tmpcov.CopyFromSp(SigmaInv_[i]);
     tmp_n.CopyFromMat(N_[i]);
     (*out)[i].Resize(SpkSpaceDim(), FeatureDim());
     (*out)[i].AddMatMat(1.0, tmp_n, kTrans, tmpcov, kNoTrans, 0.0);
   }
 }

◆ GetSubstateMean()

void GetSubstateMean	(	int32	j1,
		int32	m,
		int32	i,
		VectorBase< Real > *	mean_out
	)		const

inline

Definition at line 519 of file am-sgmm2.h.

References VectorBase< Real >::CopyFromVec(), VectorBase< Real >::Dim(), KALDI_ASSERT, and kaldi::kNoTrans.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), and MleSgmm2SpeakerAccs::AccumulateFromPosteriors().

                                                                       {
   KALDI_ASSERT(mean_out != NULL);
   KALDI_ASSERT(j1 < NumGroups() && m < NumSubstatesForGroup(j1)
                && i < NumGauss());
   KALDI_ASSERT(mean_out->Dim() == FeatureDim());
   Vector<BaseFloat> mean_tmp(FeatureDim());
   mean_tmp.AddMatVec(1.0, M_[i], kNoTrans, v_[j1].Row(m), 0.0);
   mean_out->CopyFromVec(mean_tmp);
 }

◆ GetSubstateSpeakerMean()

void GetSubstateSpeakerMean	(	int32	j1,
		int32	substate,
		int32	gauss,
		const Sgmm2PerSpkDerivedVars &	spk,
		VectorBase< Real > *	mean_out
	)		const

inline

Definition at line 532 of file am-sgmm2.h.

References VectorBase< Real >::AddVec(), Sgmm2PerSpkDerivedVars::o_s, MatrixBase< Real >::Row(), and Sgmm2PerSpkDerivedVars::v_s.

                                                                              {
   GetSubstateMean(j1, m, i, mean_out);
   if (spk.v_s.Dim() != 0)  // have speaker adaptation...
     mean_out->AddVec(1.0, spk.o_s.Row(i));
 }

◆ GetVarScaledSubstateSpeakerMean()

void GetVarScaledSubstateSpeakerMean	(	int32	j1,
		int32	substate,
		int32	gauss,
		const Sgmm2PerSpkDerivedVars &	spk,
		VectorBase< Real > *	mean_out
	)		const

Definition at line 541 of file am-sgmm2.h.

References kaldi::ComputeFeatureNormalizingTransform(), VectorBase< Real >::CopyFromVec(), and VectorBase< Real >::Dim().

Referenced by FmllrSgmm2Accs::AccumulateFromPosteriors().

                                                                                {
   Vector<BaseFloat> tmp_mean(mean_out->Dim()), tmp_mean2(mean_out->Dim());
   GetSubstateSpeakerMean(j1, m, i, spk, &tmp_mean);
   tmp_mean2.AddSpVec(1.0, SigmaInv_[i], tmp_mean, 0.0);
   mean_out->CopyFromVec(tmp_mean2);
 }

◆ HasSpeakerDependentWeights()

bool HasSpeakerDependentWeights ( ) const

inline

True if doing SSGMM.

Definition at line 366 of file am-sgmm2.h.

Referenced by main(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleAmSgmm2Accs::ResizeAccumulators(), and EbwAmSgmm2Updater::UpdateW().

{ return (u_.NumRows() != 0); }

kaldi::AmSgmm2::u_

Matrix< BaseFloat > u_

[SSGMM] Speaker-subspace weight projection vectors. Dimension is [I][T]

Definition: am-sgmm2.h:431

kaldi::MatrixBase::NumRows

MatrixIndexT NumRows() const

Returns number of rows (or zero for empty matrix).

Definition: kaldi-matrix.h:64

◆ HasSpeakerSpace()

bool HasSpeakerSpace ( ) const

inline

Definition at line 368 of file am-sgmm2.h.

Referenced by main().

{ return (!N_.empty()); }

kaldi::AmSgmm2::N_

std::vector< Matrix< BaseFloat > > N_

Speaker-subspace projections. Dimension is [I][D][T].

Definition: am-sgmm2.h:427

◆ IncreasePhoneSpaceDim()

void IncreasePhoneSpaceDim	(	int32	target_dim,
		const Matrix< BaseFloat > &	norm_xform
	)

Functions for increasing the phonetic and speaker space dimensions.

The argument norm_xform is an LDA-like feature normalizing transform, computed by the ComputeFeatureNormalizingTransform function.

Definition at line 699 of file am-sgmm2.cc.

References MatrixBase< Real >::CopyFromMat(), rnnlm::i, KALDI_ASSERT, KALDI_ERR, KALDI_LOG, KALDI_WARN, MatrixBase< Real >::NumCols(), MatrixBase< Real >::NumRows(), MatrixBase< Real >::Range(), and Matrix< Real >::Resize().

Referenced by main().

                                                                         {
   KALDI_ASSERT(!M_.empty());
   int32 initial_dim = PhoneSpaceDim(),
       feat_dim = FeatureDim();
   KALDI_ASSERT(norm_xform.NumRows() == feat_dim);
 
   if (target_dim < initial_dim)
     KALDI_ERR << "You asked to increase phn dim to a value lower than the "
               << " current dimension, " << target_dim << " < " << initial_dim;
 
   if (target_dim > initial_dim + feat_dim) {
     KALDI_WARN << "Cannot increase phone subspace dimensionality from "
                << initial_dim << " to " << target_dim << ", increasing to "
                << initial_dim + feat_dim;
     target_dim = initial_dim + feat_dim;
   }
 
   if (initial_dim < target_dim) {
     Matrix<BaseFloat> tmp_M(feat_dim, initial_dim);
     for (int32 i = 0; i < NumGauss(); i++) {
       tmp_M.CopyFromMat(M_[i]);
       M_[i].Resize(feat_dim, target_dim);
       M_[i].Range(0, feat_dim, 0, tmp_M.NumCols()).CopyFromMat(tmp_M);
       M_[i].Range(0, feat_dim, tmp_M.NumCols(),
           target_dim - tmp_M.NumCols()).CopyFromMat(norm_xform.Range(0,
               feat_dim, 0, target_dim-tmp_M.NumCols()));
     }
     Matrix<BaseFloat> tmp_w = w_;
     w_.Resize(tmp_w.NumRows(), target_dim);
     w_.Range(0, tmp_w.NumRows(), 0, tmp_w.NumCols()).CopyFromMat(tmp_w);
 
     for (int32 j1 = 0; j1 < NumGroups(); j1++) {
       // Resize phonetic-subspace vectors.
       Matrix<BaseFloat> tmp_v_j = v_[j1];
       v_[j1].Resize(tmp_v_j.NumRows(), target_dim);
       v_[j1].Range(0, tmp_v_j.NumRows(), 0, tmp_v_j.NumCols()).CopyFromMat(
           tmp_v_j);
     }
     KALDI_LOG << "Phone subspace dimensionality increased from " <<
         initial_dim << " to " << target_dim;
   } else {
     KALDI_LOG << "Phone subspace dimensionality unchanged, since target " <<
         "dimension (" << target_dim << ") <= initial dimension (" <<
         initial_dim << ")";
   }
 }
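
The column extension applied to each M_[i] can be sketched with plain vectors. This is a standalone illustration, not the Kaldi code: DenseMat and ExtendColumns are hypothetical names, and the clamp to S + D columns follows the warning branch in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > DenseMat;  // row-major

// Hypothetical helper (not Kaldi API): widens M (D x S) to target_dim columns,
// filling the new columns with the leading columns of norm_xform (D rows).
// The target is clamped to S + D columns.
DenseMat ExtendColumns(const DenseMat &M, const DenseMat &norm_xform,
                       std::size_t target_dim) {
  const std::size_t D = M.size();
  assert(D > 0 && norm_xform.size() == D);
  const std::size_t S = M[0].size();
  assert(target_dim >= S);
  target_dim = std::min(target_dim, S + D);
  assert(norm_xform[0].size() >= target_dim - S);

  DenseMat out(D, std::vector<double>(target_dim, 0.0));
  for (std::size_t d = 0; d < D; d++) {
    for (std::size_t s = 0; s < S; s++) out[d][s] = M[d][s];  // keep old part
    for (std::size_t s = S; s < target_dim; s++)
      out[d][s] = norm_xform[d][s - S];  // new directions from the transform
  }
  return out;
}
```

In the listing, w_ and the sub-state vectors v_[j1] are widened too, but their new columns are left at zero, so the extended model initially scores data the same way as before.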

◆ IncreaseSpkSpaceDim()

void IncreaseSpkSpaceDim	(	int32	target_dim,
		const Matrix< BaseFloat > &	norm_xform,
		bool	speaker_dependent_weights
	)

Increase the subspace dimension for speakers.

The boolean "speaker_dependent_weights" argument (for SSGMM) only makes a difference if increasing the subspace dimension from zero.

Definition at line 747 of file am-sgmm2.cc.

References MatrixBase< Real >::CopyFromMat(), rnnlm::i, KALDI_ASSERT, KALDI_ERR, KALDI_LOG, KALDI_WARN, kaldi::kCopyData, MatrixBase< Real >::NumCols(), MatrixBase< Real >::NumRows(), and MatrixBase< Real >::Range().

Referenced by main().

                                                                  {
   int32 initial_dim = SpkSpaceDim(),
       feat_dim = FeatureDim();
   KALDI_ASSERT(norm_xform.NumRows() == feat_dim);
 
   if (N_.size() == 0)
     N_.resize(NumGauss());
 
   if (target_dim < initial_dim)
     KALDI_ERR << "You asked to increase spk dim to a value lower than the "
               << " current dimension, " << target_dim << " < " << initial_dim;
 
   if (target_dim > initial_dim + feat_dim) {
     KALDI_WARN << "Cannot increase speaker subspace dimensionality from "
                << initial_dim << " to " << target_dim << ", increasing to "
                << initial_dim + feat_dim;
     target_dim = initial_dim + feat_dim;
   }
 
   if (initial_dim < target_dim) {
     int32 dim_change = target_dim - initial_dim;
     Matrix<BaseFloat> tmp_N((initial_dim != 0) ? feat_dim : 0,
                             initial_dim);
     for (int32 i = 0; i < NumGauss(); i++) {
       if (initial_dim != 0) tmp_N.CopyFromMat(N_[i]);
       N_[i].Resize(feat_dim, target_dim);
       if (initial_dim != 0) {
         N_[i].Range(0, feat_dim, 0, tmp_N.NumCols()).CopyFromMat(tmp_N);
       }
       N_[i].Range(0, feat_dim, tmp_N.NumCols(), dim_change).CopyFromMat(
           norm_xform.Range(0, feat_dim, 0, dim_change));
     }
     // if we already have speaker-dependent weights or we are increasing
     // spk-dim from zero and are asked to add them...
     if (u_.NumRows() != 0 || (initial_dim == 0 && speaker_dependent_weights))
       u_.Resize(NumGauss(), target_dim, kCopyData); // extend dim of u_i's
     KALDI_LOG << "Speaker subspace dimensionality increased from " <<
         initial_dim << " to " << target_dim;
     if (initial_dim == 0 && speaker_dependent_weights)
       KALDI_LOG << "Added parameters u for speaker-dependent weights.";
   } else {
     KALDI_LOG << "Speaker subspace dimensionality unchanged, since target " <<
         "dimension (" << target_dim << ") <= initial dimension (" <<
         initial_dim << ")";
   }
 }

◆ InitializeCovars()

void InitializeCovars ( )

private

Initializes the within-class covariances.

Definition at line 1248 of file am-sgmm2.cc.

References rnnlm::i.

                                {
   std::vector< SpMatrix<BaseFloat> > &inv_covars(full_ubm_.inv_covars());
   int32 num_gauss = full_ubm_.NumGauss();
   int32 dim = full_ubm_.Dim();
   SigmaInv_.resize(num_gauss);
   for (int32 i = 0; i < num_gauss; i++) {
     SigmaInv_[i].Resize(dim);
     SigmaInv_[i].CopyFromSp(inv_covars[i]);
   }
 }

◆ InitializeFromFullGmm()

void InitializeFromFullGmm	(	const FullGmm &	gmm,
		const std::vector< int32 > &	pdf2group,
		int32	phn_subspace_dim,
		int32	spk_subspace_dim,
		bool	speaker_dependent_weights,
		BaseFloat	self_weight
	)

Initializes the SGMM parameters from a full-covariance UBM.

The pdf2group vector maps from a pdf (state) to the corresponding cluster of states [i.e. j2 to j1]. For conventionally structured systems (no 2-level tree), this can just be [ 0 1 ... n-1 ].

Definition at line 381 of file am-sgmm2.cc.

References kaldi::ComputeFeatureNormalizingTransform(), FullGmm::Dim(), KALDI_ASSERT, KALDI_LOG, and KALDI_WARN.

Referenced by main(), TestSgmm2Fmllr(), TestSgmm2Init(), UnitTestEstimateSgmm2(), and UnitTestSgmm2().

                                                            {
   pdf2group_ = pdf2group;
   ComputePdfMappings();
   full_ubm_.CopyFromFullGmm(full_gmm);
   diag_ubm_.CopyFromFullGmm(full_gmm);
   if (phn_subspace_dim < 1 || phn_subspace_dim > full_gmm.Dim() + 1) {
     KALDI_WARN << "Initial phone-subspace dimension must be >= 1, value is "
                << phn_subspace_dim << "; setting to " << full_gmm.Dim() + 1;
     phn_subspace_dim = full_gmm.Dim() + 1;
   }
   KALDI_ASSERT(spk_subspace_dim >= 0);
 
   w_.Resize(0, 0);
   N_.clear();
   c_.clear();
   v_.clear();
   SigmaInv_.clear();
 
   KALDI_LOG << "Initializing model";
   Matrix<BaseFloat> norm_xform;
   ComputeFeatureNormalizingTransform(full_gmm, &norm_xform);
   InitializeMw(phn_subspace_dim, norm_xform);
   if (spk_subspace_dim > 0)
     InitializeNu(spk_subspace_dim, norm_xform, speaker_dependent_weights);
   InitializeVecsAndSubstateWeights(self_weight);
   KALDI_LOG << "Initializing variances";
   InitializeCovars();
 }

◆ InitializeMw()

void InitializeMw	(	int32	phn_subspace_dim,
		const Matrix< BaseFloat > &	norm_xform
	)

private

Initializes the matrices M_ and w_.

Definition at line 1126 of file am-sgmm2.cc.

References MatrixBase< Real >::CopyColFromVec(), rnnlm::i, KALDI_ASSERT, kaldi::kNoTrans, MatrixBase< Real >::NumCols(), MatrixBase< Real >::NumRows(), MatrixBase< Real >::Range(), and Matrix< Real >::Resize().

                                                                 {
   int32 ddim = full_ubm_.Dim();
   KALDI_ASSERT(phn_subspace_dim <= ddim + 1);
   KALDI_ASSERT(phn_subspace_dim <= norm_xform.NumCols() + 1);
   KALDI_ASSERT(ddim <= norm_xform.NumRows());
 
   Vector<BaseFloat> mean(ddim);
   int32 num_gauss = full_ubm_.NumGauss();
   w_.Resize(num_gauss, phn_subspace_dim);
   M_.resize(num_gauss);
   for (int32 i = 0; i < num_gauss; i++) {
     full_ubm_.GetComponentMean(i, &mean);
     Matrix<BaseFloat> &thisM(M_[i]);
     thisM.Resize(ddim, phn_subspace_dim);
     // Eq. (27): M_{i} = [ \bar{\mu}_{i} (J)_{1:D, 1:(S-1)}]
     thisM.CopyColFromVec(mean, 0);
     int32 nonrandom_dim = std::min(phn_subspace_dim - 1, ddim),
         random_dim = phn_subspace_dim - 1 - nonrandom_dim;
     thisM.Range(0, ddim, 1, nonrandom_dim).CopyFromMat(
         norm_xform.Range(0, ddim, 0, nonrandom_dim), kNoTrans);
     // The following extension to the original paper allows us to
     // initialize the model with a larger dimension of phone-subspace vector.
     if (random_dim > 0)
       thisM.Range(0, ddim, nonrandom_dim + 1, random_dim).SetRandn();
   }
 }

◆ InitializeNu()

void InitializeNu	(	int32	spk_subspace_dim,
		const Matrix< BaseFloat > &	norm_xform,
		bool	speaker_dependent_weights
	)

private

Initializes the matrices N_ and [if speaker_dependent_weights==true] u_.

Definition at line 1155 of file am-sgmm2.cc.

References rnnlm::i, kaldi::kNoTrans, and MatrixBase< Real >::Range().

                                                           {
   int32 ddim = full_ubm_.Dim();
 
   int32 num_gauss = full_ubm_.NumGauss();
   N_.resize(num_gauss);
   for (int32 i = 0; i < num_gauss; i++) {
     N_[i].Resize(ddim, spk_subspace_dim);
     // Eq. (28): N_{i} = [ (J)_{1:D, 1:T} ]
 
     int32 nonrandom_dim = std::min(spk_subspace_dim, ddim),
         random_dim = spk_subspace_dim - nonrandom_dim;
 
     N_[i].Range(0, ddim, 0, nonrandom_dim).
         CopyFromMat(norm_xform.Range(0, ddim, 0, nonrandom_dim), kNoTrans);
     // The following extension to the original paper allows us to
     // initialize the model with a larger dimension of speaker-subspace vector.
     if (random_dim > 0)
       N_[i].Range(0, ddim, nonrandom_dim, random_dim).SetRandn();
   }
   if (speaker_dependent_weights) {
     u_.Resize(num_gauss, spk_subspace_dim); // will set to zero.
   } else {
     u_.Resize(0, 0);
   }
 }

◆ InitializeVecsAndSubstateWeights()

void InitializeVecsAndSubstateWeights ( BaseFloat self_weight )

private

Definition at line 1207 of file am-sgmm2.cc.

References KALDI_ASSERT.

                                                                     {
   int32 J1 = NumGroups(), J2 = NumPdfs();
   KALDI_ASSERT(J1 > 0 && J2 >= J1);
   int32 phn_subspace_dim = PhoneSpaceDim();
   KALDI_ASSERT(phn_subspace_dim > 0 && "Initialize M and w first.");
 
   v_.resize(J1);
   if (self_weight == 1.0) {
     for (int32 j1 = 0; j1 < J1; j1++) {
       v_[j1].Resize(1, phn_subspace_dim);
       v_[j1](0, 0) = 1.0;  // Eq. (26): v_{j1} = [1 0 0 ... 0]
     }
     c_.resize(J2);
     for (int32 j2 = 0; j2 < J2; j2++) {
       c_[j2].Resize(1);
       c_[j2](0) = 1.0;    // Eq. (25): c_{j2} = 1.0
     }
   } else {
     for (int32 j1 = 0; j1 < J1; j1++) {
       int32 npdfs = group2pdf_[j1].size();
       v_[j1].Resize(npdfs, phn_subspace_dim);
       for (int32 m = 0; m < npdfs; m++)
         v_[j1](m, 0) = 1.0;  // Eq. (26): v_{j1} = [1 0 0 ... 0]
     }
     c_.resize(J2);
     for (int32 j2 = 0; j2 < J2; j2++) {
       int32 j1 = pdf2group_[j2], npdfs = group2pdf_[j1].size();
       c_[j2].Resize(npdfs);
       if (npdfs == 1) c_[j2].Set(1.0);
       else {
         // note: the std::max just avoids division by zero if npdfs == 1;
         // the value won't matter in that case.
         double other_weight = (1.0 - self_weight) / std::max((npdfs-1), 1);
         c_[j2].Set(other_weight);
         for (int32 k = 0; k < npdfs; k++)
           if(group2pdf_[j1][k] == j2) c_[j2](k) = self_weight;
       }
     }
   }
 }
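
The weight layout produced when self_weight is not 1.0 can be sketched as follows. This is a standalone illustration rather than the Kaldi code: the function name InitSubstateWeights is hypothetical, and it assumes the intent is that the remaining mass 1 - self_weight is divided evenly among the other npdfs - 1 pdfs of the group, so that each weight vector sums to one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Kaldi API): returns, for each pdf j2, a weight
// vector over the sub-states of its group, with one sub-state per pdf in
// the group. Assumes an even split of 1 - self_weight over the other pdfs.
std::vector<std::vector<double> > InitSubstateWeights(
    const std::vector<int> &pdf2group,
    const std::vector<std::vector<int> > &group2pdf, double self_weight) {
  assert(self_weight >= 0.0 && self_weight <= 1.0);
  std::vector<std::vector<double> > c(pdf2group.size());
  for (std::size_t j2 = 0; j2 < pdf2group.size(); j2++) {
    const std::vector<int> &members = group2pdf[pdf2group[j2]];
    const std::size_t npdfs = members.size();
    assert(npdfs >= 1);  // every pdf belongs to its own group
    if (npdfs == 1) {
      c[j2].assign(1, 1.0);  // a lone pdf owns its single sub-state
      continue;
    }
    double other_weight = (1.0 - self_weight) / static_cast<double>(npdfs - 1);
    c[j2].assign(npdfs, other_weight);
    for (std::size_t k = 0; k < npdfs; k++)
      if (members[k] == static_cast<int>(j2)) c[j2][k] = self_weight;
  }
  return c;
}
```

With three pdfs in a group and self_weight = 0.8, each pdf puts 0.8 on its own sub-state and 0.1 on each of the other two.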

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( AmSgmm2 )

private

maps from each pdf (index j2) to the corresponding group of pdfs (index j1) for SCTM.

◆ LogLikelihood()

BaseFloat LogLikelihood	(	const Sgmm2PerFrameDerivedVars &	per_frame_vars,
		int32	j2,
		Sgmm2LikelihoodCache *	cache,
		Sgmm2PerSpkDerivedVars *	spk_vars,
		BaseFloat	log_prune = 0.0
	)		const

This does a likelihood computation for a given state using the pre-selected Gaussian components (in per_frame_vars).

If the log_prune parameter is nonzero (e.g. 5.0), the LogSumExp() stage is pruned, which is a significant speedup... smaller values are faster. Note: you have to call cache->NextFrame() before calling this for a new frame of data.

Definition at line 517 of file am-sgmm2.cc.

Referenced by DecodableAmSgmm2::LogLikelihoodForPdf(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), and TestSgmm2Substates().

                                                            {
   int32 t = cache->t; // not a real time; used to uniquely identify frames.
   // Forgo asserts here, as this is frequently called.
   // We'll probably get a segfault if an error is made.
   Sgmm2LikelihoodCache::PdfCacheElement &pdf_cache =
       cache->pdf_cache[j2];
 #ifdef KALDI_PARANOID
   bool random_test = (Rand() % 1000 == 1); // to check that the user is
   // calling NextFrame() on the cache, as they should.
 #else
   bool random_test = false; // compiler will ignore test branches.
 #endif
   if (pdf_cache.t == t) {
     if (!random_test) return pdf_cache.log_like;
   } else {
     random_test = false;
   }
   // if random_test == true at this point, it was already cached, and we will
   // verify that we return the same value as the cached one.
   pdf_cache.t = t;
 
   int32 j1 = pdf2group_[j2];
   Sgmm2LikelihoodCache::SubstateCacheElement &substate_cache =
       cache->substate_cache[j1];
   if (substate_cache.t != t) { // Need to compute sub-state likelihoods.
     substate_cache.t = t;
     Matrix<BaseFloat> loglikes; // indexed [gselect-index][substate-index]
     ComponentLogLikes(per_frame_vars, j1, spk_vars, &loglikes);
     BaseFloat max = loglikes.Max(); // use this to keep things in good numerical range.
     loglikes.Add(-max);
     loglikes.ApplyExp();
     substate_cache.remaining_log_like = max;
     int32 num_substates = loglikes.NumCols();
     substate_cache.likes.Resize(num_substates); // zeroes it.
     substate_cache.likes.AddRowSumMat(1.0, loglikes); // add likelihoods [not in log!] for
     // each column [i.e. summing over the rows], so we get the sum for
     // each substate index.  You have to multiply by exp(remaining_log_like)
     // to get a real likelihood.
   }
 
   BaseFloat log_like = substate_cache.remaining_log_like
       + Log(VecVec(substate_cache.likes, c_[j2]));
 
   if (random_test)
     KALDI_ASSERT(ApproxEqual(pdf_cache.log_like, log_like));
 
   pdf_cache.log_like = log_like;
   KALDI_ASSERT(log_like == log_like && log_like - log_like == 0); // check
   // that it's not NaN or infinity.
   return log_like;
 }
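
The numerically stable summation at the core of LogLikelihood() can be sketched on its own. The code below is a standalone illustration, not the Kaldi implementation: the function name StateLogLike is hypothetical, and the caching and pruning logic is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Kaldi API). loglikes[k][m] is the log-likelihood
// of selected Gaussian k in sub-state m; c[m] is the sub-state weight.
double StateLogLike(const std::vector<std::vector<double> > &loglikes,
                    const std::vector<double> &c) {
  assert(!loglikes.empty() && !loglikes[0].empty());
  double max_ll = loglikes[0][0];
  for (std::size_t k = 0; k < loglikes.size(); k++) {
    assert(loglikes[k].size() == c.size());
    max_ll = std::max(
        max_ll, *std::max_element(loglikes[k].begin(), loglikes[k].end()));
  }
  // Sum exp(loglike - max) over Gaussians for each sub-state, then weight.
  double weighted = 0.0;
  for (std::size_t m = 0; m < c.size(); m++) {
    double like_m = 0.0;
    for (std::size_t k = 0; k < loglikes.size(); k++)
      like_m += std::exp(loglikes[k][m] - max_ll);
    weighted += c[m] * like_m;
  }
  return max_ll + std::log(weighted);  // add the factored-out maximum back
}
```

Subtracting the maximum first matters in practice: exponentiating values around -1000 directly would underflow to zero, and the final logarithm would then be minus infinity.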

◆ NumGauss()

int32 NumGauss ( ) const

inline

Definition at line 360 of file am-sgmm2.h.

Referenced by Sgmm2Project::ApplyProjection(), kaldi::CalcFmllrStepSize(), MleAmSgmm2Accs::Check(), MleAmSgmm2Updater::ComputeMPrior(), FmllrSgmm2Accs::FmllrObjGradient(), main(), MleAmSgmm2Updater::MapUpdateM(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleAmSgmm2Accs::ResizeAccumulators(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2FmllrSubspace(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), TestSgmm2Substates(), and EbwAmSgmm2Updater::UpdateM().

360 { return M_.size(); }

kaldi::AmSgmm2::M_

std::vector< Matrix< BaseFloat > > M_

Phonetic-subspace projections. Dimension is [I][D][S].

Definition: am-sgmm2.h:425

◆ NumGroups()

int32 NumGroups ( ) const

inline

Definition at line 351 of file am-sgmm2.h.

Referenced by MleAmSgmm2Accs::Check(), main(), MleAmSgmm2Accs::ResizeAccumulators(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), and TestSgmm2Substates().

351 { return group2pdf_.size(); } // relates to SCTM. # pdf groups,

kaldi::AmSgmm2::group2pdf_

std::vector< std::vector< int32 > > group2pdf_

Definition: am-sgmm2.h:410

◆ NumPdfs()

int32 NumPdfs ( ) const

inline

Various model dimensions.

Definition at line 350 of file am-sgmm2.h.

Referenced by MleAmSgmm2Accs::Check(), main(), MleAmSgmm2Accs::ResizeAccumulators(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2FmllrSubspace(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), TestSgmm2PreXform(), and TestSgmm2Substates().

350 { return pdf2group_.size(); }

kaldi::AmSgmm2::pdf2group_

std::vector< int32 > pdf2group_

Definition: am-sgmm2.h:409

◆ NumSubstatesForGroup()

int32 NumSubstatesForGroup ( int32 j1 ) const

inline

Definition at line 357 of file am-sgmm2.h.

References KALDI_ASSERT.

Referenced by FmllrSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), MleAmSgmm2Updater::ComputeQ(), MleAmSgmm2Updater::ComputeSMeans(), main(), MleAmSgmm2Updater::RenormalizeV(), MleAmSgmm2Accs::ResizeAccumulators(), EbwAmSgmm2Updater::UpdatePhoneVectorsInternal(), MleAmSgmm2Updater::UpdatePhoneVectorsInternal(), MleAmSgmm2Updater::UpdateW(), and MleAmSgmm2Updater::UpdateWGetStats().

   int32 NumSubstatesForGroup(int32 j1) const {
     KALDI_ASSERT(j1 < NumGroups()); return v_[j1].NumRows();
   }

◆ NumSubstatesForPdf()

int32 NumSubstatesForPdf ( int32 j2 ) const

inline

Definition at line 354 of file am-sgmm2.h.

References KALDI_ASSERT.

Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), main(), MleAmSgmm2Accs::ResizeAccumulators(), EbwAmSgmm2Updater::UpdateSubstateWeights(), and MleAmSgmm2Updater::UpdateSubstateWeights().

   int32 NumSubstatesForPdf(int32 j2) const {
     KALDI_ASSERT(j2 < NumPdfs()); return c_[j2].Dim();
   }

◆ Pdf2Group()

int32 Pdf2Group ( int32 j2 ) const

Definition at line 196 of file am-sgmm2.cc.

References KALDI_ASSERT.

Referenced by FmllrSgmm2Accs::AccumulateFromPosteriors(), MleAmSgmm2Accs::AccumulateFromPosteriors(), MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), and TestSgmm2Init().

 int32 AmSgmm2::Pdf2Group(int32 j2) const {
   KALDI_ASSERT(static_cast<size_t>(j2) < pdf2group_.size());
   int32 j1 = pdf2group_[j2];
   return j1;
 }
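
The relationship between `pdf2group_` (pdf index j2 to group index j1) and `group2pdf_` (group index j1 to its list of pdfs) can be illustrated with a small standalone sketch. `InvertPdf2Group` is a hypothetical helper written for this example; it is not the body of `ComputePdfMappings()`, which is not shown in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds the group -> pdfs table from the
// pdf -> group table. The number of groups is inferred from the
// largest group index that appears.
std::vector<std::vector<int32_t> > InvertPdf2Group(
    const std::vector<int32_t> &pdf2group) {
  std::vector<std::vector<int32_t> > group2pdf;
  for (std::size_t j2 = 0; j2 < pdf2group.size(); j2++) {
    const int32_t j1 = pdf2group[j2];
    assert(j1 >= 0);  // group indices must be non-negative
    if (static_cast<std::size_t>(j1) >= group2pdf.size())
      group2pdf.resize(j1 + 1);
    group2pdf[j1].push_back(static_cast<int32_t>(j2));
  }
  return group2pdf;
}
```

For example, the mapping `{0, 1, 0, 2, 1}` over five pdfs yields three groups containing pdfs `{0, 2}`, `{1, 4}` and `{3}`.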

◆ PhoneSpaceDim()

int32 PhoneSpaceDim ( ) const

inline

Definition at line 361 of file am-sgmm2.h.

Referenced by MleAmSgmm2Accs::Check(), MleAmSgmm2Updater::ComputeMPrior(), main(), MleAmSgmm2Updater::MapUpdateM(), MleAmSgmm2Accs::ResizeAccumulators(), TestSgmm2AccsIO(), TestSgmm2FmllrAccsIO(), TestSgmm2FmllrSubspace(), TestSgmm2IncreaseDim(), TestSgmm2Init(), TestSgmm2IO(), and EbwAmSgmm2Updater::UpdateM().

361 { return w_.NumCols(); }

kaldi::MatrixBase::NumCols

MatrixIndexT NumCols() const

Returns number of columns (or zero for empty matrix).

Definition: kaldi-matrix.h:67

kaldi::AmSgmm2::w_

Matrix< BaseFloat > w_

Phonetic-subspace weight projection vectors. Dimension is [I][S].

Definition: am-sgmm2.h:429

◆ Read()

void Read	(	std::istream &	is,
		bool	binary
	)

Definition at line 89 of file am-sgmm2.cc.

References kaldi::ExpectToken(), rnnlm::i, KALDI_ASSERT, KALDI_ERR, KALDI_WARN, kaldi::ReadBasicType(), kaldi::ReadIntegerVector(), and kaldi::ReadToken().

Referenced by main(), and TestSgmm2IO().

 void AmSgmm2::Read(std::istream &in_stream, bool binary) {
   { // We want this to work even if the object was previously
     // populated, so we clear the items that are more likely
     // to cause problems.
     pdf2group_.clear();
     group2pdf_.clear();
     u_.Resize(0,0);
     w_jmi_.clear();
     v_.clear();
   }
   // removing anything that was in the object before.
   int32 num_pdfs = -1, feat_dim, num_gauss;
   std::string token;
 
   ExpectToken(in_stream, binary, "<SGMM>");
   ExpectToken(in_stream, binary, "<NUMSTATES>");
   ReadBasicType(in_stream, binary, &num_pdfs);
   ExpectToken(in_stream, binary, "<DIMENSION>");
   ReadBasicType(in_stream, binary, &feat_dim);
   ExpectToken(in_stream, binary, "<NUMGAUSS>");
   ReadBasicType(in_stream, binary, &num_gauss);
 
   KALDI_ASSERT(num_pdfs > 0 && feat_dim > 0);
 
   ReadToken(in_stream, binary, &token);
 
   while (token != "</SGMM>") {
     if (token == "<PDF2GROUP>") {
       ReadIntegerVector(in_stream, binary, &pdf2group_);
       ComputePdfMappings();
     } else if (token == "<WEIGHTIDX2GAUSS>") {  // TEMP!   Will remove.
       std::vector<int32> garbage;
       ReadIntegerVector(in_stream, binary, &garbage);
     } else if (token == "<DIAG_UBM>") {
       diag_ubm_.Read(in_stream, binary);
     } else if (token == "<FULL_UBM>") {
       full_ubm_.Read(in_stream, binary);
     } else if (token == "<SigmaInv>") {
       SigmaInv_.resize(num_gauss);
       for (int32 i = 0; i < num_gauss; i++) {
         SigmaInv_[i].Read(in_stream, binary);
       }
     } else if (token == "<M>") {
       M_.resize(num_gauss);
       for (int32 i = 0; i < num_gauss; i++) {
         M_[i].Read(in_stream, binary);
       }
     } else if (token == "<N>") {
       N_.resize(num_gauss);
       for (int32 i = 0; i < num_gauss; i++) {
         N_[i].Read(in_stream, binary);
       }
     } else if (token == "<w>") {
       w_.Read(in_stream, binary);
     } else if (token == "<u>") {
       u_.Read(in_stream, binary);
     } else if (token == "<v>") {
       int32 num_groups = group2pdf_.size();
       if (num_groups == 0) {
         KALDI_WARN << "Reading old model with new code (should still work)";
         num_groups = num_pdfs;
       }
       v_.resize(num_groups);
       for (int32 j1 = 0; j1 < num_groups; j1++) {
         v_[j1].Read(in_stream, binary);
       }
     } else if (token == "<c>") {
       c_.resize(num_pdfs);
       for (int32 j2 = 0; j2 < num_pdfs; j2++) {
         c_[j2].Read(in_stream, binary);
       }
     } else if (token == "<n>") {
       int32 num_groups = group2pdf_.size();
       if (num_groups == 0) num_groups = num_pdfs;
       n_.resize(num_groups);
       for (int32 j1 = 0; j1 < num_groups; j1++) {
         n_[j1].Read(in_stream, binary);
       }
       // The following are the Gaussian prior parameters for MAP adaptation
       // of M. They may eventually be moved somewhere else.
     } else if (token == "<M_Prior>") {
       ExpectToken(in_stream, binary, "<NUMGaussians>");
       ReadBasicType(in_stream, binary, &num_gauss);
       M_prior_.resize(num_gauss);
       for (int32 i = 0; i < num_gauss; i++) {
         M_prior_[i].Read(in_stream, binary);
       }
     } else if (token == "<Row_Cov_Inv>") {
       row_cov_inv_.Read(in_stream, binary);
     } else if (token == "<Col_Cov_Inv>") {
       col_cov_inv_.Read(in_stream, binary);
     } else {
       KALDI_ERR << "Unexpected token '" << token << "' in model file ";
     }
     ReadToken(in_stream, binary, &token);
   }
 
   if (pdf2group_.empty())
     ComputePdfMappings(); // sets up group2pdf_, and pdf2group_ if reading
   // old model.
 
   if (n_.empty())
     ComputeNormalizers();
   if (HasSpeakerDependentWeights())
     ComputeWeights();
 }
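
The overall shape of the reader above (a fixed header, then optional sections in any order until a closing token, with an error on anything unrecognized) can be sketched without Kaldi. The example below is a simplified, text-mode-only illustration: `ToyModel`, `ReadToyModel` and the `<TOY>` format are hypothetical and do not correspond to the real SGMM file format.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the model; holds only a few scalar fields.
struct ToyModel {
  int num_pdfs = -1;
  double w = 0.0;
  double u = 0.0;
  bool has_u = false;
};

// Reads "<TOY> <NUMSTATES> n", then any of "<w> x" / "<u> x" in any
// order, until "</TOY>". Throws on an unexpected or missing token.
ToyModel ReadToyModel(std::istream &is) {
  ToyModel model;
  std::string token;
  if (!(is >> token) || token != "<TOY>")
    throw std::runtime_error("Expected <TOY>");
  if (!(is >> token) || token != "<NUMSTATES>")
    throw std::runtime_error("Expected <NUMSTATES>");
  if (!(is >> model.num_pdfs))
    throw std::runtime_error("Could not read number of pdfs");
  assert(model.num_pdfs > 0);  // mirrors the sanity check on the header

  bool closed = false;
  while (is >> token) {
    if (token == "</TOY>") { closed = true; break; }
    if (token == "<w>") {
      if (!(is >> model.w)) throw std::runtime_error("Bad <w> section");
    } else if (token == "<u>") {
      if (!(is >> model.u)) throw std::runtime_error("Bad <u> section");
      model.has_u = true;
    } else {
      throw std::runtime_error("Unexpected token '" + token + "'");
    }
  }
  if (!closed) throw std::runtime_error("Missing </TOY>");
  return model;
}
```

Because sections are dispatched by token rather than by position, a file that omits an optional section (here `<u>`) still parses, which is the same property that lets the real reader accept files written with only a subset of the parameters.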

◆ RemoveSpeakerSpace()

void RemoveSpeakerSpace ( )

inline

Definition at line 370 of file am-sgmm2.h.

Referenced by main().

370 { N_.clear(); u_.Resize(0, 0); w_jmi_.clear(); }

kaldi::AmSgmm2::u_

Matrix< BaseFloat > u_

[SSGMM] Speaker-subspace weight projection vectors. Dimension is [I][T]

Definition: am-sgmm2.h:431

kaldi::AmSgmm2::N_

std::vector< Matrix< BaseFloat > > N_

Speaker-subspace projections. Dimension is [I][D][T].

Definition: am-sgmm2.h:427

kaldi::Matrix::Resize

void Resize(const MatrixIndexT r, const MatrixIndexT c, MatrixResizeType resize_type=kSetZero, MatrixStrideType stride_type=kDefaultStride)

Sets matrix to a specified size (zero is OK as long as both r and c are zero).

Definition: kaldi-matrix.cc:819

kaldi::AmSgmm2::w_jmi_

std::vector< Matrix< BaseFloat > > w_jmi_

[SSGMM] w_{jmi}, dimension is [J1][#mix][I]. Computed from w_ and v_.

Definition: am-sgmm2.h:442

◆ SpkSpaceDim()

int32 SpkSpaceDim ( ) const

inline

Definition at line 362 of file am-sgmm2.h.

Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), MleAmSgmm2Accs::Check(), main(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleAmSgmm2Accs::ResizeAccumulators(), and TestSgmm2Init().

362 { return (N_.size() > 0) ? N_[0].NumCols() : 0; }

kaldi::AmSgmm2::N_

std::vector< Matrix< BaseFloat > > N_

Speaker-subspace projections. Dimension is [I][D][T].

Definition: am-sgmm2.h:427

◆ SplitSubstates()

void SplitSubstates	(	const Vector< BaseFloat > &	state_occupancies,
		const Sgmm2SplitSubstatesConfig &	config
	)

Increases the total number of substates based on the state occupancies.

Definition at line 657 of file am-sgmm2.cc.

References SpMatrix< Real >::ApplyPow(), VectorBase< Real >::Dim(), kaldi::GetSplitTargets(), KALDI_ASSERT, KALDI_LOG, Sgmm2SplitSubstatesConfig::max_cond, Sgmm2SplitSubstatesConfig::min_count, Sgmm2SplitSubstatesConfig::power, and Sgmm2SplitSubstatesConfig::split_substates.

Referenced by main(), and TestSgmm2Substates().

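 void AmSgmm2::SplitSubstates(const Vector<BaseFloat> &pdf_occupancies,
                              const Sgmm2SplitSubstatesConfig &opts) {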
   KALDI_ASSERT(pdf_occupancies.Dim() == NumPdfs());
   int32 J1 = NumGroups(), J2 = NumPdfs();
   Vector<BaseFloat> group_occupancies(J1);
   for (int32 j2 = 0; j2 < J2; j2++)
     group_occupancies(Pdf2Group(j2)) += pdf_occupancies(j2);
 
   vector<int32> tgt_num_substates;
 
   GetSplitTargets(group_occupancies, opts.split_substates,
                   opts.power, opts.min_count, &tgt_num_substates);
 
   int32 tot_num_substates_old = 0, tot_num_substates_new = 0;
   vector< SpMatrix<BaseFloat> > H_i;
   SpMatrix<BaseFloat> sqrt_H_sm;
 
   ComputeH(&H_i);  // set up that array.
   ComputeHsmFromModel(H_i, pdf_occupancies, &sqrt_H_sm, opts.max_cond);
   H_i.clear();
   sqrt_H_sm.ApplyPow(-0.5);
 
   for (int32 j1 = 0; j1 < J1; j1++) {
     int32 cur_M = NumSubstatesForGroup(j1),
         tgt_M = tgt_num_substates[j1];
     tot_num_substates_old += cur_M;
     tot_num_substates_new += std::max(cur_M, tgt_M);
     if (cur_M < tgt_M)
       SplitSubstatesInGroup(pdf_occupancies, opts, sqrt_H_sm, j1, tgt_M);
   }
   if (tot_num_substates_old == tot_num_substates_new) {
     KALDI_LOG << "Not splitting substates; current #substates is "
               << tot_num_substates_old << " and target is "
               << opts.split_substates;
   } else {
     KALDI_LOG << "Getting rid of normalizers as they will no longer be valid";
     n_.clear();
     KALDI_LOG << "Split " << tot_num_substates_old << " substates to "
               << tot_num_substates_new;
   }
 }
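
Two bookkeeping steps in the body above can be shown in isolation: pooling per-pdf occupancies into per-group occupancies, and computing the substate total that results when only groups below their target are split. The helpers below are hypothetical standalone sketches using `std::vector`. They do not reproduce `GetSplitTargets()`, whose allocation rule is not shown in this section.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sums the occupancy of every pdf into its group.
std::vector<double> GroupOccupancies(
    const std::vector<double> &pdf_occupancies,
    const std::vector<int32_t> &pdf2group, std::size_t num_groups) {
  assert(pdf_occupancies.size() == pdf2group.size());
  std::vector<double> group_occupancies(num_groups, 0.0);
  for (std::size_t j2 = 0; j2 < pdf2group.size(); j2++) {
    assert(pdf2group[j2] >= 0 &&
           static_cast<std::size_t>(pdf2group[j2]) < num_groups);
    group_occupancies[pdf2group[j2]] += pdf_occupancies[j2];
  }
  return group_occupancies;
}

// Hypothetical helper: a group is never shrunk, so the new total is the
// sum over groups of max(current, target).
int TotalSubstatesAfterSplit(const std::vector<int> &current,
                             const std::vector<int> &target) {
  assert(current.size() == target.size());
  int total = 0;
  for (std::size_t j1 = 0; j1 < current.size(); j1++)
    total += std::max(current[j1], target[j1]);
  return total;
}
```

If the total computed this way equals the current total, no group was below its target and nothing is split, which corresponds to the "Not splitting substates" log branch.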

◆ SplitSubstatesInGroup()

void SplitSubstatesInGroup	(	const Vector< BaseFloat > &	pdf_occupancies,
		const Sgmm2SplitSubstatesConfig &	opts,
		const SpMatrix< BaseFloat > &	sqrt_H_sm,
		int32	j1,
		int32	M
	)

private

Called inside SplitSubstates(); splits substates of one group.

Definition at line 599 of file am-sgmm2.cc.

References kaldi::_RandGauss(), VectorBase< Real >::AddRowSumMat(), VectorBase< Real >::AddSpVec(), VectorBase< Real >::Data(), rnnlm::i, KALDI_ASSERT, Sgmm2SplitSubstatesConfig::perturb_factor, and MatrixBase< Real >::Row().

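 void AmSgmm2::SplitSubstatesInGroup(const Vector<BaseFloat> &pdf_occupancies,
                                     const Sgmm2SplitSubstatesConfig &opts,
                                     const SpMatrix<BaseFloat> &sqrt_H_sm,
                                     int32 j1, int32 tgt_M) {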
   const std::vector<int32> &pdfs = group2pdf_[j1];
   int32 phn_dim = PhoneSpaceDim(), cur_M = NumSubstatesForGroup(j1),
       num_pdfs_for_group = pdfs.size();
   Vector<BaseFloat> rand_vec(phn_dim), v_shift(phn_dim);
 
   KALDI_ASSERT(tgt_M >= cur_M);
   if (cur_M == tgt_M) return;
   // Resize v[j1] to fit new substates
   {
     Matrix<BaseFloat> tmp_v_j(v_[j1]);
     v_[j1].Resize(tgt_M, phn_dim);
     v_[j1].Range(0, cur_M, 0, phn_dim).CopyFromMat(tmp_v_j);
   }
 
   // we'll use a temporary matrix for the c quantities.
   Matrix<BaseFloat> c_j(num_pdfs_for_group, tgt_M);
   for (int32 i = 0; i < num_pdfs_for_group; i++) {
     int32 j2 = pdfs[i];
     c_j.Row(i).Range(0, cur_M).CopyFromVec(c_[j2]);
   }
 
   // Keep splitting substates until obtaining the desired number
   for (; cur_M < tgt_M; cur_M++) {
     int32 split_m; // substate to split.
     {
       Vector<BaseFloat> substate_count(tgt_M);
       substate_count.AddRowSumMat(1.0, c_j);
       BaseFloat *data = substate_count.Data();
       split_m = std::max_element(data, data+cur_M) - data;
     }
     for (int32 i = 0; i < num_pdfs_for_group; i++) { // divide count of split
       // substate. [extended for SCTM]
       // c_{jkm} := c_{jkm'} := c_{jkm} / 2, where m is the substate being
       // split and m' is the newly created substate.
       c_j(i, split_m) = c_j(i, cur_M) = c_j(i, split_m) / 2;
     }
     // v_{jkm} := +/- split_perturb * H_k^{(sm)}^{-0.5} * rand_vec
     std::generate(rand_vec.Data(), rand_vec.Data() + rand_vec.Dim(),
                   _RandGauss);
     v_shift.AddSpVec(opts.perturb_factor, sqrt_H_sm, rand_vec, 0.0);
     v_[j1].Row(cur_M).CopyFromVec(v_[j1].Row(split_m));
     v_[j1].Row(cur_M).AddVec(1.0, v_shift);
     v_[j1].Row(split_m).AddVec(-1.0, v_shift);
   }
   // copy the temporary matrix for the c_ (sub-state weight)
   // quantities back to the place it belongs.
   for (int32 i = 0; i < num_pdfs_for_group; i++) {
     int32 j2 = pdfs[i];
     c_[j2].Resize(tgt_M);
     c_[j2].CopyFromVec(c_j.Row(i));
   }
 }
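
A single iteration of the splitting loop above can be sketched as follows. This is a simplified standalone illustration: `ToyGroup` and `SplitOnce` are hypothetical names, and the perturbation `shift` is passed in by the caller instead of being drawn at random and scaled by `perturb_factor` and `sqrt_H_sm` as in the code above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one group of pdfs.
struct ToyGroup {
  std::vector<std::vector<double> > v;  // substate vectors, [substate][S]
  std::vector<std::vector<double> > c;  // weights, [pdf-in-group][substate]
};

// Splits the substate with the largest total weight into two.
// Returns the index of the substate that was split.
std::size_t SplitOnce(ToyGroup *g, const std::vector<double> &shift) {
  assert(!g->v.empty() && !g->c.empty());
  const std::size_t cur_M = g->v.size();
  // Total weight per substate, summed over the pdfs in the group.
  std::vector<double> count(cur_M, 0.0);
  for (const auto &row : g->c) {
    assert(row.size() == cur_M);
    for (std::size_t m = 0; m < cur_M; m++) count[m] += row[m];
  }
  const std::size_t split_m = static_cast<std::size_t>(
      std::max_element(count.begin(), count.end()) - count.begin());
  // Each pdf gives half of the split substate's weight to the new one.
  for (auto &row : g->c) {
    row[split_m] /= 2;
    row.push_back(row[split_m]);
  }
  // New substate = old + shift; old substate = old - shift.
  assert(shift.size() == g->v[split_m].size());
  std::vector<double> plus = g->v[split_m];
  for (std::size_t s = 0; s < shift.size(); s++) {
    plus[s] += shift[s];
    g->v[split_m][s] -= shift[s];
  }
  g->v.push_back(plus);
  return split_m;
}
```

Halving the weight keeps each pdf's weights summing to the same total, and the symmetric plus/minus shift keeps the mean of the two resulting vectors at the original vector.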

◆ Write()

void Write	(	std::ostream &	os,
		bool	binary,
		SgmmWriteFlagsType	write_params
	)		const

Definition at line 203 of file am-sgmm2.cc.

References rnnlm::i, KALDI_WARN, kaldi::kSgmmBackgroundGmms, kaldi::kSgmmGlobalParams, kaldi::kSgmmNormalizers, kaldi::kSgmmStateParams, kaldi::WriteBasicType(), kaldi::WriteIntegerVector(), and kaldi::WriteToken().

Referenced by main(), TestSgmm2AccsIO(), and TestSgmm2IO().

 void AmSgmm2::Write(std::ostream &out_stream, bool binary,
                     SgmmWriteFlagsType write_params) const {
   int32 num_pdfs = NumPdfs(),
       feat_dim = FeatureDim(),
       num_gauss = NumGauss();
 
   WriteToken(out_stream, binary, "<SGMM>");
   if (!binary) out_stream << "\n";
   WriteToken(out_stream, binary, "<NUMSTATES>");
   WriteBasicType(out_stream, binary, num_pdfs);
   WriteToken(out_stream, binary, "<DIMENSION>");
   WriteBasicType(out_stream, binary, feat_dim);
   WriteToken(out_stream, binary, "<NUMGAUSS>");
   WriteBasicType(out_stream, binary, num_gauss);
   if (!binary) out_stream << "\n";
 
   if (write_params & kSgmmBackgroundGmms) {
     WriteToken(out_stream, binary, "<DIAG_UBM>");
     diag_ubm_.Write(out_stream, binary);
     WriteToken(out_stream, binary, "<FULL_UBM>");
     full_ubm_.Write(out_stream, binary);
   }
 
   if (write_params & kSgmmGlobalParams) {
     WriteToken(out_stream, binary, "<SigmaInv>");
     if (!binary) out_stream << "\n";
     for (int32 i = 0; i < num_gauss; i++) {
       SigmaInv_[i].Write(out_stream, binary);
     }
     WriteToken(out_stream, binary, "<M>");
     if (!binary) out_stream << "\n";
     for (int32 i = 0; i < num_gauss; i++) {
       M_[i].Write(out_stream, binary);
     }
     if (N_.size() != 0) {
       WriteToken(out_stream, binary, "<N>");
       if (!binary) out_stream << "\n";
       for (int32 i = 0; i < num_gauss; i++) {
         N_[i].Write(out_stream, binary);
       }
     }
     WriteToken(out_stream, binary, "<w>");
     w_.Write(out_stream, binary);
     WriteToken(out_stream, binary, "<u>");
     u_.Write(out_stream, binary);
   }
 
   if (write_params & kSgmmStateParams) {
     WriteToken(out_stream, binary, "<PDF2GROUP>");
     WriteIntegerVector(out_stream, binary, pdf2group_);
     WriteToken(out_stream, binary, "<v>");
     for (int32 j1 = 0; j1 < NumGroups(); j1++) {
       v_[j1].Write(out_stream, binary);
     }
     WriteToken(out_stream, binary, "<c>");
     for (int32 j2 = 0; j2 < num_pdfs; j2++) {
       c_[j2].Write(out_stream, binary);
     }
   }
 
   if (write_params & kSgmmNormalizers) {
     WriteToken(out_stream, binary, "<n>");
     if (n_.empty())
       KALDI_WARN << "Not writing normalizers since they are not present.";
     else
       for (int32 j1 = 0; j1 < NumGroups(); j1++)
         n_[j1].Write(out_stream, binary);
   }
   WriteToken(out_stream, binary, "</SGMM>");
 }
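
The way `write_params` selects which sections are emitted is an ordinary bitmask test. The sketch below illustrates the pattern with a hypothetical enum: the `kToy...` names and their numeric values are assumptions made for this example and should not be read as the actual definitions of `kSgmmGlobalParams` and the other flags. `SectionsToWrite` is likewise a hypothetical helper that only lists the section tokens in the order the code above writes them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative flags; names and values are assumptions for this sketch.
enum ToyWriteFlags {
  kToyGlobalParams = 0x1,
  kToyStateParams = 0x2,
  kToyNormalizers = 0x4,
  kToyBackgroundGmms = 0x8,
  kToyWriteAll = 0xF
};

// Returns the section tokens that would be written for the given flags.
std::vector<std::string> SectionsToWrite(unsigned write_params,
                                         bool has_speaker_space) {
  assert((write_params & ~static_cast<unsigned>(kToyWriteAll)) == 0);
  std::vector<std::string> out;
  if (write_params & kToyBackgroundGmms) {
    out.push_back("<DIAG_UBM>");
    out.push_back("<FULL_UBM>");
  }
  if (write_params & kToyGlobalParams) {
    out.push_back("<SigmaInv>");
    out.push_back("<M>");
    if (has_speaker_space) out.push_back("<N>");  // only if N_ is non-empty
    out.push_back("<w>");
    out.push_back("<u>");
  }
  if (write_params & kToyStateParams) {
    out.push_back("<PDF2GROUP>");
    out.push_back("<v>");
    out.push_back("<c>");
  }
  if (write_params & kToyNormalizers) out.push_back("<n>");
  return out;
}
```

Flags combine with bitwise OR, so a caller that wants only the state-level parameters and normalizers would pass `kToyStateParams | kToyNormalizers` in this sketch.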

Friends And Related Function Documentation

◆ AmSgmm2Functions

friend class AmSgmm2Functions

friend

Definition at line 506 of file am-sgmm2.h.

◆ ComputeNormalizersClass

friend class ComputeNormalizersClass

friend

Definition at line 500 of file am-sgmm2.h.

◆ EbwAmSgmm2Updater

friend class EbwAmSgmm2Updater

friend

Definition at line 502 of file am-sgmm2.h.

◆ MleAmSgmm2Accs

friend class MleAmSgmm2Accs

friend

Definition at line 503 of file am-sgmm2.h.

◆ MleAmSgmm2Updater

friend class MleAmSgmm2Updater

friend

Definition at line 504 of file am-sgmm2.h.

◆ MleSgmm2SpeakerAccs

friend class MleSgmm2SpeakerAccs

friend

Definition at line 505 of file am-sgmm2.h.

◆ Sgmm2Feature

friend class Sgmm2Feature

friend

Definition at line 507 of file am-sgmm2.h.

◆ Sgmm2Project

friend class Sgmm2Project

friend

Definition at line 501 of file am-sgmm2.h.

Member Data Documentation

◆ c_

std::vector< Vector<BaseFloat> > c_

protected

c_{jm}, mixture weights. Dimension is [J2][#mix]

Definition at line 438 of file am-sgmm2.h.

Referenced by AmSgmm2::CopyFromSgmm2(), EbwAmSgmm2Updater::UpdateSubstateWeights(), and MleAmSgmm2Updater::UpdateSubstateWeights().

◆ col_cov_inv_

SpMatrix<BaseFloat> col_cov_inv_

protected

Definition at line 451 of file am-sgmm2.h.

Referenced by MleAmSgmm2Updater::ComputeMPrior(), and MleAmSgmm2Updater::MapUpdateM().

◆ diag_ubm_

DiagGmm diag_ubm_

protected

These contain the "background" model associated with the subspace GMM.

Definition at line 413 of file am-sgmm2.h.

Referenced by Sgmm2Project::ApplyProjection(), AmSgmm2::CopyFromSgmm2(), and AmSgmm2::CopyGlobalsInitVecs().

◆ full_ubm_

FullGmm full_ubm_

protected

Definition at line 414 of file am-sgmm2.h.

Referenced by Sgmm2Project::ApplyProjection(), AmSgmm2::CopyFromSgmm2(), and AmSgmm2::CopyGlobalsInitVecs().

◆ group2pdf_

std::vector<std::vector<int32> > group2pdf_

protected

Definition at line 410 of file am-sgmm2.h.

Referenced by AmSgmm2::CopyFromSgmm2().

◆ M_

std::vector< Matrix<BaseFloat> > M_

protected

Phonetic-subspace projections. Dimension is [I][D][S].

Definition at line 425 of file am-sgmm2.h.

Referenced by Sgmm2Project::ApplyProjection(), MleAmSgmm2Updater::ComputeMPrior(), MleAmSgmm2Updater::ComputeSMeans(), AmSgmm2::CopyFromSgmm2(), AmSgmm2::CopyGlobalsInitVecs(), MleAmSgmm2Updater::MapUpdateM(), MleAmSgmm2Updater::RenormalizeV(), EbwAmSgmm2Updater::UpdateM(), and MleAmSgmm2Updater::UpdateM().

◆ M_prior_

std::vector< Matrix<BaseFloat> > M_prior_

protected

Definition at line 449 of file am-sgmm2.h.

Referenced by MleAmSgmm2Updater::ComputeMPrior(), and MleAmSgmm2Updater::MapUpdateM().

◆ N_

std::vector< Matrix<BaseFloat> > N_

protected

Speaker-subspace projections. Dimension is [I][D][T].

Definition at line 427 of file am-sgmm2.h.

Referenced by Sgmm2Project::ApplyProjection(), AmSgmm2::CopyFromSgmm2(), AmSgmm2::CopyGlobalsInitVecs(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleAmSgmm2Updater::RenormalizeN(), EbwAmSgmm2Updater::UpdateN(), and MleAmSgmm2Updater::UpdateN().

◆ n_

std::vector< Matrix<BaseFloat> > n_

protected

n_{jim}, per-Gaussian normalizer. Dimension is [J1][I][#mix]

Definition at line 440 of file am-sgmm2.h.

Referenced by Sgmm2Project::ApplyProjection(), AmSgmm2::CopyFromSgmm2(), and MleAmSgmm2Updater::Update().

◆ pdf2group_

std::vector<int32> pdf2group_

protected

Definition at line 409 of file am-sgmm2.h.

Referenced by AmSgmm2::CopyFromSgmm2().

◆ row_cov_inv_

SpMatrix<BaseFloat> row_cov_inv_

protected

Definition at line 450 of file am-sgmm2.h.

Referenced by MleAmSgmm2Updater::ComputeMPrior(), and MleAmSgmm2Updater::MapUpdateM().

◆ SigmaInv_

std::vector< SpMatrix<BaseFloat> > SigmaInv_

protected

Globally shared parameters of the subspace GMM.

The various quantities are: I = number of Gaussians; D = data dimension; S = phonetic subspace dimension; T = speaker subspace dimension; J2 = number of pdfs; J1 = number of groups of pdfs (for SCTM); #mix = number of substates (of state j2 or state-group j1, depending on context).

SigmaInv_ holds the inverse within-class (full) covariances; dimension is [I][D][D].

Definition at line 423 of file am-sgmm2.h.

Referenced by Sgmm2Project::ApplyProjection(), AmSgmm2::CopyFromSgmm2(), AmSgmm2::CopyGlobalsInitVecs(), MleAmSgmm2Updater::MapUpdateM(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), EbwAmSgmm2Updater::UpdateM(), MleAmSgmm2Updater::UpdateM(), EbwAmSgmm2Updater::UpdateN(), MleAmSgmm2Updater::UpdateN(), EbwAmSgmm2Updater::UpdateVars(), and MleAmSgmm2Updater::UpdateVars().

◆ u_

Matrix<BaseFloat> u_

protected

[SSGMM] Speaker-subspace weight projection vectors. Dimension is [I][T]

Definition at line 431 of file am-sgmm2.h.

Referenced by AmSgmm2::CopyFromSgmm2(), AmSgmm2::CopyGlobalsInitVecs(), EbwAmSgmm2Updater::UpdateU(), MleAmSgmm2Updater::UpdateU(), and MleSgmm2SpeakerAccs::UpdateWithU().

◆ v_

std::vector< Matrix<BaseFloat> > v_

protected

The parameters in a particular SGMM state.

v_{jm}, per-state phonetic-subspace vectors. Dimension is [J1][#mix][S].

Definition at line 436 of file am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), EbwAmSgmm2Updater::ComputePhoneVecStats(), MleAmSgmm2Updater::ComputeQ(), MleAmSgmm2Updater::ComputeSMeans(), AmSgmm2::CopyFromSgmm2(), MleAmSgmm2Updater::RenormalizeV(), EbwAmSgmm2Updater::UpdatePhoneVectorsInternal(), MleAmSgmm2Updater::UpdatePhoneVectorsInternal(), MleAmSgmm2Updater::UpdateW(), and MleAmSgmm2Updater::UpdateWGetStats().

◆ w_

Matrix<BaseFloat> w_

protected

Phonetic-subspace weight projection vectors. Dimension is [I][S].

Definition at line 429 of file am-sgmm2.h.

Referenced by EbwAmSgmm2Updater::ComputePhoneVecStats(), AmSgmm2::CopyFromSgmm2(), AmSgmm2::CopyGlobalsInitVecs(), MleAmSgmm2Updater::RenormalizeV(), EbwAmSgmm2Updater::UpdatePhoneVectorsInternal(), MleAmSgmm2Updater::UpdatePhoneVectorsInternal(), EbwAmSgmm2Updater::UpdateW(), and MleAmSgmm2Updater::UpdateW().

◆ w_jmi_

std::vector< Matrix<BaseFloat> > w_jmi_

protected

[SSGMM] w_{jmi}, dimension is [J1][#mix][I]. Computed from w_ and v_.

Definition at line 442 of file am-sgmm2.h.

Referenced by MleAmSgmm2Accs::AccumulateFromPosteriors(), MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), AmSgmm2::CopyFromSgmm2(), MleAmSgmm2Updater::Update(), and MleAmSgmm2Updater::UpdateW().

The documentation for this class was generated from the following files:

sgmm2/am-sgmm2.h
sgmm2/am-sgmm2.cc
