#include <ivector-extractor.h>

Public Member Functions
	IvectorExtractor ()

	IvectorExtractor (const IvectorExtractorOptions &opts, const FullGmm &fgmm)

void	GetIvectorDistribution (const IvectorExtractorUtteranceStats &utt_stats, VectorBase< double > *mean, SpMatrix< double > *var) const
	Gets the distribution over ivectors (or at least, a Gaussian approximation to it).

double	PriorOffset () const
	The distribution over iVectors, in our formulation, is not centered at zero; its first dimension has a nonzero offset.

double	GetAuxf (const IvectorExtractorUtteranceStats &utt_stats, const VectorBase< double > &mean, const SpMatrix< double > *var=NULL) const
	Returns the log-likelihood objective function, summed over frames, for this distribution of iVectors (a point distribution, if var == NULL).

double	GetAcousticAuxf (const IvectorExtractorUtteranceStats &utt_stats, const VectorBase< double > &mean, const SpMatrix< double > *var=NULL) const
	Returns the data-dependent part of the log-likelihood objective function, summed over frames.

double	GetPriorAuxf (const VectorBase< double > &mean, const SpMatrix< double > *var=NULL) const
	Returns the prior-related part of the log-likelihood objective function.

double	GetAcousticAuxfVariance (const IvectorExtractorUtteranceStats &utt_stats) const
	This returns just the part of the acoustic auxf that relates to the variance of the utt_stats (i.e. the part that would be zero if the utt_stats had zero variance).

double	GetAcousticAuxfMean (const IvectorExtractorUtteranceStats &utt_stats, const VectorBase< double > &mean, const SpMatrix< double > *var=NULL) const
	This returns just the part of the acoustic auxf that relates to the speaker-dependent means (and how they differ from the data means).

double	GetAcousticAuxfGconst (const IvectorExtractorUtteranceStats &utt_stats) const
	This returns the part of the acoustic auxf that relates to the gconsts of the Gaussians.

double	GetAcousticAuxfWeight (const IvectorExtractorUtteranceStats &utt_stats, const VectorBase< double > &mean, const SpMatrix< double > *var=NULL) const
	This returns the part of the acoustic auxf that relates to the Gaussian-specific weights.

void	GetIvectorDistMean (const IvectorExtractorUtteranceStats &utt_stats, VectorBase< double > *linear, SpMatrix< double > *quadratic) const
	Gets the linear and quadratic terms in the distribution over iVectors, but only the terms arising from the Gaussian means (i.e. not the weights or the priors).

void	GetIvectorDistPrior (const IvectorExtractorUtteranceStats &utt_stats, VectorBase< double > *linear, SpMatrix< double > *quadratic) const
	Gets the linear and quadratic terms in the distribution over iVectors, that arise from the prior.

void	GetIvectorDistWeight (const IvectorExtractorUtteranceStats &utt_stats, const VectorBase< double > &mean, VectorBase< double > *linear, SpMatrix< double > *quadratic) const
	Gets the linear and quadratic terms in the distribution over iVectors, that arise from the weights (if applicable).

int32	FeatDim () const

int32	IvectorDim () const

int32	NumGauss () const

bool	IvectorDependentWeights () const

void	Write (std::ostream &os, bool binary) const

void	Read (std::istream &is, bool binary)

Protected Member Functions
void	ComputeDerivedVars ()

void	ComputeDerivedVars (int32 i)

void	TransformIvectors (const MatrixBase< double > &T, double new_prior_offset)

Protected Attributes
Matrix< double >	w_
	Weight projection vectors, if used. Dimension is [I][S].

Vector< double >	w_vec_
	If we are not using weight-projection vectors, stores the Gaussian mixture weights from the UBM.

std::vector< Matrix< double > >	M_
	Ivector-subspace projection matrices, dimension is [I][D][S].

std::vector< SpMatrix< double > >	Sigma_inv_
	Inverse variances of speaker-adapted model, dimension [I][D][D].

double	prior_offset_
	The first dimension of the prior over the iVector has a nonzero offset, so the prior mean is not zero.

Vector< double >	gconsts_
	The constant term in the log-likelihood of each Gaussian (not counting any weight).

Matrix< double >	U_
	U_i = M_i^T \Sigma_i^{-1} M_i is a quantity that comes up in ivector estimation.

std::vector< Matrix< double > >	Sigma_inv_M_
	The product of Sigma_inv_[i] with M_[i].

Static Private Member Functions
static void	InvertWithFlooring (const SpMatrix< double > &quadratic_term, SpMatrix< double > *var)

Friends
class	IvectorExtractorStats

class	OnlineIvectorEstimationStats

class	IvectorExtractorComputeDerivedVarsClass

Detailed Description

Definition at line 136 of file ivector-extractor.h.

Constructor & Destructor Documentation

◆ IvectorExtractor() [1/2]

IvectorExtractor ( )

inline

Definition at line 141 of file ivector-extractor.h.

IvectorExtractor(): prior_offset_(0.0) { }

◆ IvectorExtractor() [2/2]

IvectorExtractor	(	const IvectorExtractorOptions &	opts,
		const FullGmm &	fgmm
	)

Definition at line 137 of file ivector-extractor.cc.

References IvectorExtractor::ComputeDerivedVars(), VectorBase< Real >::CopyFromVec(), FullGmm::GetMeans(), rnnlm::i, FullGmm::inv_covars(), IvectorExtractorOptions::ivector_dim, KALDI_ASSERT, IvectorExtractor::M_, FullGmm::NumGauss(), PackedMatrix< Real >::NumRows(), IvectorExtractor::prior_offset_, Vector< Real >::Resize(), Matrix< Real >::Resize(), MatrixBase< Real >::Row(), MatrixBase< Real >::Scale(), IvectorExtractor::Sigma_inv_, IvectorExtractorOptions::use_weights, IvectorExtractor::w_, IvectorExtractor::w_vec_, and FullGmm::weights().

                          {
   KALDI_ASSERT(opts.ivector_dim > 0);
   Sigma_inv_.resize(fgmm.NumGauss());
   for (int32 i = 0; i < fgmm.NumGauss(); i++) {
     const SpMatrix<BaseFloat> &inv_var = fgmm.inv_covars()[i];
     Sigma_inv_[i].Resize(inv_var.NumRows());
     Sigma_inv_[i].CopyFromSp(inv_var);
   }
   Matrix<double> gmm_means;
   fgmm.GetMeans(&gmm_means);
   KALDI_ASSERT(!Sigma_inv_.empty());
   int32 feature_dim = Sigma_inv_[0].NumRows(),
       num_gauss = Sigma_inv_.size();
 
   prior_offset_ = 100.0; // hardwired for now.  Must be nonzero.
   gmm_means.Scale(1.0 / prior_offset_);
 
   M_.resize(num_gauss);
   for (int32 i = 0; i < num_gauss; i++) {
     M_[i].Resize(feature_dim, opts.ivector_dim);
     M_[i].SetRandn();
     M_[i].CopyColFromVec(gmm_means.Row(i), 0);
   }
   if (opts.use_weights) { // will regress the log-weights on the iVector.
     w_.Resize(num_gauss, opts.ivector_dim);
   } else {
     w_vec_.Resize(fgmm.NumGauss());
     w_vec_.CopyFromVec(fgmm.weights());
   }
   ComputeDerivedVars();
 }
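
The effect of the first-column initialization can be seen in a minimal standalone sketch. It uses plain `std::vector` rather than Kaldi's matrix classes, and `InitProjection` and `AdaptedMean` are hypothetical helper names, not part of the library. The remaining columns are zeroed here for clarity (the constructor above fills them with random values); an iVector equal to the prior mean [prior_offset, 0, ..., 0] then maps back to the UBM mean, and the same would hold with random columns because they are multiplied by zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one projection matrix M_i (D rows, S columns).
using Mat = std::vector<std::vector<double>>;

// Builds a D x S matrix whose first column is ubm_mean / prior_offset.
// Other columns are left at zero in this sketch.
Mat InitProjection(const std::vector<double> &ubm_mean, int ivector_dim,
                   double prior_offset) {
  assert(ivector_dim > 0 && prior_offset != 0.0);
  Mat m(ubm_mean.size(), std::vector<double>(ivector_dim, 0.0));
  for (std::size_t d = 0; d < ubm_mean.size(); d++)
    m[d][0] = ubm_mean[d] / prior_offset;
  return m;
}

// Computes the adapted mean M * x for an iVector x.
std::vector<double> AdaptedMean(const Mat &m, const std::vector<double> &x) {
  std::vector<double> out(m.size(), 0.0);
  for (std::size_t d = 0; d < m.size(); d++) {
    assert(m[d].size() == x.size());
    for (std::size_t s = 0; s < x.size(); s++) out[d] += m[d][s] * x[s];
  }
  return out;
}
```

For example, `AdaptedMean(InitProjection({2.0, -4.0}, 3, 100.0), {100.0, 0.0, 0.0})` gives back approximately {2.0, -4.0}.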

Member Function Documentation

◆ ComputeDerivedVars() [1/2]

void ComputeDerivedVars ( )

protected

Definition at line 182 of file ivector-extractor.cc.

References IvectorExtractor::FeatDim(), kaldi::g_num_threads, IvectorExtractor::gconsts_, rnnlm::i, IvectorExtractor::IvectorDim(), KALDI_LOG, M_LOG_2PI, TaskSequencerConfig::num_threads, IvectorExtractor::NumGauss(), Vector< Real >::Resize(), Matrix< Real >::Resize(), TaskSequencer< C >::Run(), IvectorExtractor::Sigma_inv_, IvectorExtractor::Sigma_inv_M_, and IvectorExtractor::U_.

Referenced by IvectorExtractorStats::GetOrthogonalIvectorTransform(), IvectorExtractor::IvectorExtractor(), and IvectorExtractorStats::Update().

                                           {
   KALDI_LOG << "Computing derived variables for iVector extractor";
   gconsts_.Resize(NumGauss());
   for (int32 i = 0; i < NumGauss(); i++) {
     double var_logdet = -Sigma_inv_[i].LogPosDefDet();
     gconsts_(i) = -0.5 * (var_logdet + FeatDim() * M_LOG_2PI);
     // the gconsts don't contain any weight-related terms.
   }
   U_.Resize(NumGauss(), IvectorDim() * (IvectorDim() + 1) / 2);
   Sigma_inv_M_.resize(NumGauss());
 
   // Note, we could have used RunMultiThreaded for this and similar tasks we
   // have here, but we found that we don't get as complete CPU utilization as we
   // could because some tasks finish before others.
   {
     TaskSequencerConfig sequencer_opts;
     sequencer_opts.num_threads = g_num_threads;
     TaskSequencer<IvectorExtractorComputeDerivedVarsClass> sequencer(
         sequencer_opts);
     for (int32 i = 0; i < NumGauss(); i++)
       sequencer.Run(new IvectorExtractorComputeDerivedVarsClass(this, i));
   }
   KALDI_LOG << "Done.";
 }
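
As a rough illustration of the gconst formula above, the sketch below computes the same quantity for a Gaussian whose inverse covariance is diagonal. The diagonal restriction and the name `GaussianConst` are assumptions made to keep the example standalone; the code above works with full inverse covariances through `LogPosDefDet()`.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Log-likelihood constant of one Gaussian, given the diagonal of its
// inverse covariance: -0.5 * (logdet(Sigma) + D * log(2 * pi)).
double GaussianConst(const std::vector<double> &inv_var_diag) {
  const double kLog2Pi = 1.8378770664093453;  // log(2 * pi)
  double var_logdet = 0.0;
  for (double v : inv_var_diag) {
    assert(v > 0.0);
    var_logdet -= std::log(v);  // logdet(Sigma) = -logdet(Sigma^{-1})
  }
  return -0.5 * (var_logdet + inv_var_diag.size() * kLog2Pi);
}
```

As in the original, no weight-related term enters this constant.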

◆ ComputeDerivedVars() [2/2]

void ComputeDerivedVars ( int32 i )

protected

Definition at line 208 of file ivector-extractor.cc.

References SpMatrix< Real >::AddMat2Sp(), PackedMatrix< Real >::Data(), IvectorExtractor::FeatDim(), rnnlm::i, IvectorExtractor::IvectorDim(), kaldi::kNoTrans, kaldi::kTrans, IvectorExtractor::M_, MatrixBase< Real >::Row(), IvectorExtractor::Sigma_inv_, IvectorExtractor::Sigma_inv_M_, and IvectorExtractor::U_.

                                                  {
   SpMatrix<double> temp_U(IvectorDim());
   // temp_U = M_i^T Sigma_i^{-1} M_i
   temp_U.AddMat2Sp(1.0, M_[i], kTrans, Sigma_inv_[i], 0.0);
   SubVector<double> temp_U_vec(temp_U.Data(),
                                IvectorDim() * (IvectorDim() + 1) / 2);
   U_.Row(i).CopyFromVec(temp_U_vec);
 
   Sigma_inv_M_[i].Resize(FeatDim(), IvectorDim());
   Sigma_inv_M_[i].AddSpMat(1.0, Sigma_inv_[i], M_[i], kNoTrans, 0.0);
 }
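
Each row of U_ holds one symmetric S x S matrix flattened to S(S+1)/2 values. The sketch below shows one way such packing can work, assuming the lower triangle is stored row by row. That layout and the helper names `PackedIndex` and `PackLowerTriangle` are assumptions for illustration; the code does not call into Kaldi.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of element (r, c), with c <= r, in a row-by-row lower-triangle layout.
std::size_t PackedIndex(std::size_t r, std::size_t c) {
  assert(c <= r);
  return r * (r + 1) / 2 + c;
}

// Flattens the lower triangle of a square symmetric matrix.
std::vector<double> PackLowerTriangle(
    const std::vector<std::vector<double>> &sym) {
  std::size_t dim = sym.size();
  std::vector<double> packed(dim * (dim + 1) / 2);
  for (std::size_t r = 0; r < dim; r++) {
    assert(sym[r].size() == dim);
    for (std::size_t c = 0; c <= r; c++)
      packed[PackedIndex(r, c)] = sym[r][c];
  }
  return packed;
}
```

Storing the packed form is what lets the per-utterance quadratic term be accumulated with a single matrix-vector product over the Gaussian counts.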

◆ FeatDim()

int32 FeatDim ( ) const

Definition at line 28 of file ivector-extractor.cc.

References KALDI_ASSERT, and IvectorExtractor::M_.

Referenced by IvectorExtractorStats::AccStatsForUtterance(), OnlineIvectorExtractionInfo::Check(), IvectorExtractorStats::CheckDims(), IvectorExtractor::ComputeDerivedVars(), IvectorExtractor::GetAcousticAuxfMean(), IvectorExtractor::GetAcousticAuxfVariance(), IvectorExtractorStats::IvectorExtractorStats(), IvectorExtractorStats::IvectorVarianceDiagnostic(), IvectorExtractTask::operator()(), kaldi::RunPerSpeaker(), and IvectorExtractorStats::UpdateVariances().

                                       {
   KALDI_ASSERT(!M_.empty());
   return M_[0].NumRows();
 }

◆ GetAcousticAuxf()

double GetAcousticAuxf	(	const IvectorExtractorUtteranceStats &	utt_stats,
		const VectorBase< double > &	mean,
		const SpMatrix< double > *	var = `NULL`
	)		const

Returns the data-dependent part of the log-likelihood objective function, summed over frames.

If variance pointer is NULL, uses point value.

Definition at line 419 of file ivector-extractor.cc.

References IvectorExtractorUtteranceStats::gamma_, IvectorExtractor::GetAcousticAuxfGconst(), IvectorExtractor::GetAcousticAuxfMean(), IvectorExtractor::GetAcousticAuxfVariance(), IvectorExtractor::GetAcousticAuxfWeight(), KALDI_VLOG, and VectorBase< Real >::Sum().

Referenced by IvectorExtractor::GetAuxf().

                                        {
   double weight_auxf = GetAcousticAuxfWeight(utt_stats, mean, var),
       gconst_auxf = GetAcousticAuxfGconst(utt_stats),
       mean_auxf = GetAcousticAuxfMean(utt_stats, mean, var),
       var_auxf = GetAcousticAuxfVariance(utt_stats),
       T = utt_stats.gamma_.Sum();
   KALDI_VLOG(3) << "Per frame, auxf is: weight " << (weight_auxf/T) << ", gconst "
                 << (gconst_auxf/T) << ", mean " << (mean_auxf/T) << ", var "
                 << (var_auxf/T) << ", over " << T << " frames.";
   return weight_auxf + gconst_auxf + mean_auxf + var_auxf;
 }

◆ GetAcousticAuxfGconst()

double GetAcousticAuxfGconst ( const IvectorExtractorUtteranceStats & utt_stats ) const

This returns the part of the acoustic auxf that relates to the gconsts of the Gaussians.

Definition at line 490 of file ivector-extractor.cc.

References IvectorExtractorUtteranceStats::gamma_, IvectorExtractor::gconsts_, and kaldi::VecVec().

Referenced by IvectorExtractor::GetAcousticAuxf().

                                                            {
   return VecVec(Vector<double>(utt_stats.gamma_),
                 gconsts_);
 }

◆ GetAcousticAuxfMean()

double GetAcousticAuxfMean	(	const IvectorExtractorUtteranceStats &	utt_stats,
		const VectorBase< double > &	mean,
		const SpMatrix< double > *	var = `NULL`
	)		const

This returns just the part of the acoustic auxf that relates to the speaker-dependent means (and how they differ from the data means).

Definition at line 460 of file ivector-extractor.cc.

References VectorBase< Real >::AddMatVec(), VectorBase< Real >::AddSpVec(), PackedMatrix< Real >::Data(), IvectorExtractor::FeatDim(), IvectorExtractorUtteranceStats::gamma_, rnnlm::i, IvectorExtractor::IvectorDim(), kaldi::kTrans, IvectorExtractor::M_, IvectorExtractor::NumGauss(), MatrixBase< Real >::Row(), IvectorExtractor::Sigma_inv_, kaldi::TraceSpSp(), IvectorExtractor::U_, kaldi::VecSpVec(), kaldi::VecVec(), and IvectorExtractorUtteranceStats::X_.

Referenced by IvectorExtractor::GetAcousticAuxf().

                                        {
   double K = 0.0;
   Vector<double> a(IvectorDim()), temp(FeatDim());
 
   int32 I = NumGauss();
   for (int32 i = 0; i < I; i++) {
     double gamma = utt_stats.gamma_(i);
     if (gamma != 0.0) {
       Vector<double> x(utt_stats.X_.Row(i)); // == \gamma(i) \m_i
       temp.AddSpVec(1.0 / gamma, Sigma_inv_[i], x, 0.0);
       // now temp = Sigma_i^{-1} \m_i.
       // next line: K += -0.5 \gamma_i \m_i^T \Sigma_i^{-1} \m_i
       K += -0.5 * VecVec(x, temp);
       // next line: a += \gamma_i \M_i^T \Sigma_i^{-1} \m_i
       a.AddMatVec(gamma, M_[i], kTrans, temp, 1.0);
     }
   }
   SpMatrix<double> B(IvectorDim());
   SubVector<double> B_vec(B.Data(), IvectorDim()*(IvectorDim()+1)/2);
   B_vec.AddMatVec(1.0, U_, kTrans, Vector<double>(utt_stats.gamma_), 0.0);
 
   double ans = K + VecVec(mean, a) - 0.5 * VecSpVec(mean, B, mean);
   if (var != NULL)
     ans -= 0.5 * TraceSpSp(*var, B);
   return ans;
 }

◆ GetAcousticAuxfVariance()

double GetAcousticAuxfVariance ( const IvectorExtractorUtteranceStats & utt_stats ) const

This returns just the part of the acoustic auxf that relates to the variance of the utt_stats (i.e. the part that would be zero if the utt_stats had zero variance). This does not depend on the iVector; it is included as an aid to debugging. We can only compute it if the S statistics were stored. If not, we assume the variance is as predicted by the model.

Definition at line 497 of file ivector-extractor.cc.

References SpMatrix< Real >::AddVec2(), IvectorExtractor::FeatDim(), IvectorExtractorUtteranceStats::gamma_, rnnlm::i, IvectorExtractor::NumGauss(), MatrixBase< Real >::Row(), IvectorExtractorUtteranceStats::S_, PackedMatrix< Real >::Scale(), VectorBase< Real >::Scale(), IvectorExtractor::Sigma_inv_, VectorBase< Real >::Sum(), kaldi::TraceSpSp(), and IvectorExtractorUtteranceStats::X_.

Referenced by IvectorExtractor::GetAcousticAuxf().

                                                            {
   if (utt_stats.S_.empty()) {
     // we did not store the variance, so assume it's as predicted
     // by the model itself.
     // for each Gaussian i, we have a term -0.5 * gamma(i) * trace(Sigma[i] * Sigma[i]^{-1})
     //   = -0.5 * gamma(i) * FeatDim().
     return -0.5 * utt_stats.gamma_.Sum() * FeatDim();
   } else {
     int32 I = NumGauss();
     double ans = 0.0;
     for (int32 i = 0; i < I; i++) {
       double gamma = utt_stats.gamma_(i);
       if (gamma != 0.0) {
         SpMatrix<double> var(utt_stats.S_[i]);
         var.Scale(1.0 / gamma);
         Vector<double> mean(utt_stats.X_.Row(i));
         mean.Scale(1.0 / gamma);
         var.AddVec2(-1.0, mean); // get centered covariance..
         ans += -0.5 * gamma * TraceSpSp(var, Sigma_inv_[i]);
       }
     }
     return ans;
   }
 }

◆ GetAcousticAuxfWeight()

double GetAcousticAuxfWeight	(	const IvectorExtractorUtteranceStats &	utt_stats,
		const VectorBase< double > &	mean,
		const SpMatrix< double > *	var = `NULL`
	)		const

This returns the part of the acoustic auxf that relates to the Gaussian-specific weights (impacted by the iVector only if we are using w_).

Definition at line 285 of file ivector-extractor.cc.

References VectorBase< Real >::Add(), SpMatrix< Real >::AddMat2Vec(), VectorBase< Real >::AddMatVec(), MatrixBase< Real >::AddVecVec(), VectorBase< Real >::ApplyExp(), VectorBase< Real >::ApplyLog(), IvectorExtractorUtteranceStats::gamma_, IvectorExtractor::IvectorDependentWeights(), IvectorExtractor::IvectorDim(), kaldi::kNoTrans, kaldi::kTrans, VectorBase< Real >::LogSumExp(), IvectorExtractor::NumGauss(), kaldi::TraceSpSp(), kaldi::VecVec(), IvectorExtractor::w_, and IvectorExtractor::w_vec_.

Referenced by IvectorExtractor::GetAcousticAuxf().

                                        {
   if (!IvectorDependentWeights()) { // Not using the weight-projection matrices.
     Vector<double> log_w_vec(w_vec_);
     log_w_vec.ApplyLog();
     return VecVec(log_w_vec, utt_stats.gamma_);
   } else {
     Vector<double> w(NumGauss());
     w.AddMatVec(1.0, w_, kNoTrans, mean, 0.0);  // now w is unnormalized
     // log-weights.
 
     double lse = w.LogSumExp();
     w.Add(-lse); // Normalize so that the weights, exp(w), sum to one.
 
     // "ans" below is the point-value of the weight auxf, without
     // considering the variance.  At the moment, "w" contains
     // the normalized log weights.
     double ans = VecVec(w, utt_stats.gamma_);
 
     w.ApplyExp(); // now w is the weights.
 
     if (var == NULL) {
       return ans;
     } else {
       // Below, "Jacobian" will be the derivative d(log_w) / d(ivector)
       // = (I - w w^T) W, where W (w_ in the code) is the projection matrix
       // from iVector space to unnormalized log-weights, and w is the normalized
       // weight values at the current point.
       Matrix<double> Jacobian(w_);
       Vector<double> WTw(IvectorDim()); // W^T w
       WTw.AddMatVec(1.0, w_, kTrans, w, 0.0);
       Jacobian.AddVecVec(1.0, w, WTw); // Jacobian += (w (W^T w)^T = w^T w W)
 
       // the matrix S is the negated 2nd derivative of the objf w.r.t. the iVector \x.
       SpMatrix<double> S(IvectorDim());
       S.AddMat2Vec(1.0, Jacobian, kTrans, Vector<double>(utt_stats.gamma_), 0.0);
       ans += -0.5 * TraceSpSp(S, *var);
       return ans;
     }
   }
 }
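
The point-value branch above amounts to a log-softmax followed by a dot product with the counts. The following standalone sketch shows that step with `std::vector`; `LogSoftmax` and `WeightAuxfPoint` are hypothetical names, and the variance correction from the `var != NULL` branch is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Subtracts log-sum-exp so that the exponentiated entries sum to one.
std::vector<double> LogSoftmax(std::vector<double> logits) {
  assert(!logits.empty());
  double max_val = *std::max_element(logits.begin(), logits.end());
  double sum = 0.0;
  for (double v : logits) sum += std::exp(v - max_val);  // exponents <= 0
  double lse = max_val + std::log(sum);
  for (double &v : logits) v -= lse;
  return logits;
}

// Point value of the weight auxf: sum_i gamma_i * log w_i.
double WeightAuxfPoint(const std::vector<double> &log_w,
                       const std::vector<double> &gamma) {
  assert(log_w.size() == gamma.size());
  double ans = 0.0;
  for (std::size_t i = 0; i < gamma.size(); i++) ans += gamma[i] * log_w[i];
  return ans;
}
```

Subtracting the maximum before exponentiating is a common way to avoid overflow; whether `LogSumExp()` does exactly this internally is not stated on this page.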

◆ GetAuxf()

double GetAuxf	(	const IvectorExtractorUtteranceStats &	utt_stats,
		const VectorBase< double > &	mean,
		const SpMatrix< double > *	var = `NULL`
	)		const

Returns the log-likelihood objective function, summed over frames, for this distribution of iVectors (a point distribution, if var == NULL).

Definition at line 331 of file ivector-extractor.cc.

References IvectorExtractorUtteranceStats::gamma_, IvectorExtractor::GetAcousticAuxf(), IvectorExtractor::GetPriorAuxf(), KALDI_VLOG, and VectorBase< Real >::Sum().

Referenced by IvectorExtractorStats::CommitStatsForUtterance(), IvectorExtractor::GetIvectorDistribution(), IvectorExtractTask::operator()(), kaldi::RunPerSpeaker(), and kaldi::TestIvectorExtraction().

                                                                     {
 
   double acoustic_auxf = GetAcousticAuxf(utt_stats, mean, var),
       prior_auxf = GetPriorAuxf(mean, var), num_frames = utt_stats.gamma_.Sum();
   KALDI_VLOG(3) << "Acoustic auxf is " << (acoustic_auxf/num_frames) << "/frame over "
                 << num_frames << " frames, prior auxf is " << prior_auxf
                 << " = " << (prior_auxf/num_frames) << " per frame.";
   return acoustic_auxf + prior_auxf;
 }

◆ GetIvectorDistMean()

void GetIvectorDistMean	(	const IvectorExtractorUtteranceStats &	utt_stats,
		VectorBase< double > *	linear,
		SpMatrix< double > *	quadratic
	)		const

Gets the linear and quadratic terms in the distribution over iVectors, but only the terms arising from the Gaussian means (i.e. not the weights or the priors). The setup is log p(x) \propto x^T linear - 0.5 x^T quadratic x. This function *adds to* the output rather than setting it.

Definition at line 257 of file ivector-extractor.cc.

References VectorBase< Real >::AddMatVec(), PackedMatrix< Real >::Data(), IvectorExtractorUtteranceStats::gamma_, rnnlm::i, IvectorExtractor::IvectorDim(), kaldi::kTrans, IvectorExtractor::NumGauss(), IvectorExtractor::Sigma_inv_M_, IvectorExtractor::U_, and IvectorExtractorUtteranceStats::X_.

Referenced by IvectorExtractor::GetIvectorDistribution().

                                        {
   int32 I = NumGauss();
   for (int32 i = 0; i < I; i++) {
     double gamma = utt_stats.gamma_(i);
     if (gamma != 0.0) {
       SubVector<double> x(utt_stats.X_, i); // == \gamma(i) \m_i
       // next line: a += \gamma_i \M_i^T \Sigma_i^{-1} \m_i
       linear->AddMatVec(1.0, Sigma_inv_M_[i], kTrans, x, 1.0);
     }
   }
   SubVector<double> q_vec(quadratic->Data(), IvectorDim()*(IvectorDim()+1)/2);
   q_vec.AddMatVec(1.0, U_, kTrans, utt_stats.gamma_, 1.0);
 }

◆ GetIvectorDistPrior()

void GetIvectorDistPrior	(	const IvectorExtractorUtteranceStats &	utt_stats,
		VectorBase< double > *	linear,
		SpMatrix< double > *	quadratic
	)		const

Gets the linear and quadratic terms in the distribution over iVectors, that arise from the prior.

Adds to the outputs, rather than setting them.

The inverse-variance for the prior is the unit matrix.

Definition at line 274 of file ivector-extractor.cc.

References PackedMatrix< Real >::AddToDiag(), and IvectorExtractor::prior_offset_.

Referenced by IvectorExtractor::GetIvectorDistribution().

                                        {
 
   (*linear)(0) += prior_offset_; // the zero'th dimension has an offset mean.
   quadratic->AddToDiag(1.0);
 }
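
The two updates above follow from expanding the log of a unit-variance Gaussian prior whose mean is nonzero only in the first dimension. The derivation below is a sketch; the symbols $p$ (standing for prior_offset_) and $e_1$ (the first unit vector) are notation introduced here, not taken from the source.

```latex
\begin{aligned}
\log p(x) &= -\tfrac{1}{2}\,(x - p\,e_1)^T (x - p\,e_1) + \text{const} \\
          &= x^T (p\,e_1) \;-\; \tfrac{1}{2}\, x^T I\, x + \text{const}'
\end{aligned}
```

Matching this against x^T linear - 0.5 x^T quadratic x gives an increment of $p$ on the first element of linear and an increment of the identity on quadratic.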

◆ GetIvectorDistribution()

void GetIvectorDistribution	(	const IvectorExtractorUtteranceStats &	utt_stats,
		VectorBase< double > *	mean,
		SpMatrix< double > *	var
	)		const

Gets the distribution over ivectors (or at least, a Gaussian approximation to it).

The output "var" may be NULL if you don't need it. "mean", and "var", if present, must be the correct dimension (this->IvectorDim()). If you only need a point estimate of the iVector, get the mean only.

Definition at line 63 of file ivector-extractor.cc.

References VectorBase< Real >::AddSpVec(), VectorBase< Real >::AddVec(), SpMatrix< Real >::Cond(), SpMatrix< Real >::CopyFromSp(), VectorBase< Real >::CopyFromVec(), VectorBase< Real >::Dim(), IvectorExtractor::GetAuxf(), IvectorExtractor::GetIvectorDistMean(), IvectorExtractor::GetIvectorDistPrior(), IvectorExtractor::GetIvectorDistWeight(), kaldi::GetVerboseLevel(), SpMatrix< Real >::Invert(), IvectorExtractor::InvertWithFlooring(), IvectorExtractor::IvectorDependentWeights(), IvectorExtractor::IvectorDim(), KALDI_VLOG, VectorBase< Real >::Norm(), VectorBase< Real >::Range(), and SpMatrix< Real >::Trace().

Referenced by IvectorExtractorStats::CommitStatsForUtterance(), IvectorExtractTask::operator()(), kaldi::RunPerSpeaker(), and kaldi::TestIvectorExtraction().

                                  {
   if (!IvectorDependentWeights()) {
     Vector<double> linear(IvectorDim());
     SpMatrix<double> quadratic(IvectorDim());
     GetIvectorDistMean(utt_stats, &linear, &quadratic);
     GetIvectorDistPrior(utt_stats, &linear, &quadratic);
     if (var != NULL) {
       var->CopyFromSp(quadratic);
       var->Invert(); // now it's a variance.
 
       // mean of distribution = quadratic^{-1} * linear...
       mean->AddSpVec(1.0, *var, linear, 0.0);
     } else {
       quadratic.Invert();
       mean->AddSpVec(1.0, quadratic, linear, 0.0);
     }
   } else {
     Vector<double> linear(IvectorDim());
     SpMatrix<double> quadratic(IvectorDim());
     GetIvectorDistMean(utt_stats, &linear, &quadratic);
     GetIvectorDistPrior(utt_stats, &linear, &quadratic);
     // At this point, "linear" and "quadratic" contain
     // the mean and prior-related terms, and we avoid
     // recomputing those.
 
     Vector<double> cur_mean(IvectorDim());
 
     SpMatrix<double> quadratic_inv(IvectorDim());
     InvertWithFlooring(quadratic, &quadratic_inv);
     cur_mean.AddSpVec(1.0, quadratic_inv, linear, 0.0);
 
     KALDI_VLOG(3) << "Trace of quadratic is " << quadratic.Trace()
                   << ", condition is " << quadratic.Cond();
     KALDI_VLOG(3) << "Trace of quadratic_inv is " << quadratic_inv.Trace()
                   << ", condition is " << quadratic_inv.Cond();
 
     // The loop is finding successively better approximation points
     // for the quadratic expansion of the weights.
     int32 num_iters = 4;
     double change_threshold = 0.1; // If the iVector changes by less than
     // this (in 2-norm), we abort early.
     for (int32 iter = 0; iter < num_iters; iter++) {
       if (GetVerboseLevel() >= 3) {
         KALDI_VLOG(3) << "Auxf on iter " << iter << " is "
                       << GetAuxf(utt_stats, cur_mean, &quadratic_inv);
         int32 show_dim = 5;
         if (show_dim > cur_mean.Dim()) show_dim = cur_mean.Dim();
         KALDI_VLOG(3) << "Current distribution mean is "
                       << cur_mean.Range(0, show_dim) << "... "
                       << ", var trace is " << quadratic_inv.Trace();
       }
       Vector<double> this_linear(linear);
       SpMatrix<double> this_quadratic(quadratic);
       GetIvectorDistWeight(utt_stats, cur_mean,
                            &this_linear, &this_quadratic);
       InvertWithFlooring(this_quadratic, &quadratic_inv);
       Vector<double> mean_diff(cur_mean);
       cur_mean.AddSpVec(1.0, quadratic_inv, this_linear, 0.0);
       mean_diff.AddVec(-1.0, cur_mean);
       double change = mean_diff.Norm(2.0);
       KALDI_VLOG(2) << "On iter " << iter << ", iVector changed by " << change;
       if (change < change_threshold)
         break;
     }
     mean->CopyFromVec(cur_mean);
     if (var != NULL)
       var->CopyFromSp(quadratic_inv);
   }
 }
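
In the branch without iVector-dependent weights, the result is just the Gaussian with variance quadratic^{-1} and mean quadratic^{-1} * linear. The sketch below works this through for a 2-dimensional iVector; the `Sym2` type and the function names are hypothetical, and a plain closed-form inverse stands in for `SpMatrix::Invert()`.

```cpp
#include <array>
#include <cassert>

// Symmetric 2 x 2 matrix [[a, b], [b, c]].
struct Sym2 { double a, b, c; };

// Closed-form inverse; requires a positive determinant.
Sym2 Invert(const Sym2 &q) {
  double det = q.a * q.c - q.b * q.b;
  assert(det > 0.0);
  return Sym2{q.c / det, -q.b / det, q.a / det};
}

// mean = var * linear, where var is the inverse of the quadratic term.
std::array<double, 2> PosteriorMean(const Sym2 &var,
                                    const std::array<double, 2> &linear) {
  return {var.a * linear[0] + var.b * linear[1],
          var.b * linear[0] + var.c * linear[1]};
}
```

With quadratic = [[2, 0], [0, 4]] and linear = (2, 4), the variance is [[0.5, 0], [0, 0.25]] and the mean is (1, 1).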

◆ GetIvectorDistWeight()

void GetIvectorDistWeight	(	const IvectorExtractorUtteranceStats &	utt_stats,
		const VectorBase< double > &	mean,
		VectorBase< double > *	linear,
		SpMatrix< double > *	quadratic
	)		const

Gets the linear and quadratic terms in the distribution over iVectors, that arise from the weights (if applicable).

The "mean" parameter is the iVector point that we compute the expansion around. It is a quadratic approximation of a nonlinear function, but with a "safety factor" (the "max" stuff). Adds to the outputs, rather than setting them.

Definition at line 221 of file ivector-extractor.cc.

References SpMatrix< Real >::AddMat2Vec(), VectorBase< Real >::AddMatVec(), VectorBase< Real >::ApplySoftMax(), IvectorExtractorUtteranceStats::gamma_, rnnlm::i, IvectorExtractor::IvectorDependentWeights(), kaldi::kNoTrans, kaldi::kTrans, IvectorExtractor::NumGauss(), VectorBase< Real >::Sum(), and IvectorExtractor::w_.

Referenced by IvectorExtractor::GetIvectorDistribution().

                                        {
   // If there is no w_, then weights do not depend on the iVector
   // and the weights contribute nothing to the distribution.
   if (!IvectorDependentWeights())
     return;
 
   Vector<double> logw_unnorm(NumGauss());
   logw_unnorm.AddMatVec(1.0, w_, kNoTrans, mean, 0.0);
 
   Vector<double> w(logw_unnorm);
   w.ApplySoftMax(); // now w is the weights.
 
   // See eq.58 in SGMM paper
   // http://www.sciencedirect.com/science/article/pii/S088523081000063X
   // linear_coeff(i) = \gamma_{jmi} - \gamma_{jm} \hat{w}_{jmi} + \max(\gamma_{jmi}, \gamma_{jm} \hat{w}_{jmi}) \hat{\w}_i \v_{jm}
   // here \v_{jm} corresponds to the iVector.  Ignore the j,m indices.
   Vector<double> linear_coeff(NumGauss());
   Vector<double> quadratic_coeff(NumGauss());
   double gamma = utt_stats.gamma_.Sum();
   for (int32 i = 0; i < NumGauss(); i++) {
     double gamma_i = utt_stats.gamma_(i);
     double max_term = std::max(gamma_i, gamma * w(i));
     linear_coeff(i) = gamma_i - gamma * w(i) + max_term * logw_unnorm(i);
     quadratic_coeff(i) = max_term;
   }
   linear->AddMatVec(1.0, w_, kTrans, linear_coeff, 1.0);
 
   // *quadratic += \sum_i quadratic_coeff(i) w_i w_i^T, where w_i is
   //    i'th row of w_.
   quadratic->AddMat2Vec(1.0, w_, kTrans, quadratic_coeff, 1.0);
 }
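
The per-Gaussian coefficients computed in the loop above can be isolated in a small standalone sketch. `WeightCoeffs` and `ComputeWeightCoeffs` are hypothetical names; the inputs are the counts, the current normalized weights, and the unnormalized log-weights at the expansion point, all as plain `std::vector`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct WeightCoeffs {
  std::vector<double> linear;     // multiplies the rows of the projection
  std::vector<double> quadratic;  // scales each row's outer product
};

WeightCoeffs ComputeWeightCoeffs(const std::vector<double> &gamma,
                                 const std::vector<double> &w,
                                 const std::vector<double> &logw_unnorm) {
  assert(gamma.size() == w.size() && gamma.size() == logw_unnorm.size());
  double total = 0.0;  // total count over all Gaussians
  for (double g : gamma) total += g;
  WeightCoeffs out;
  for (std::size_t i = 0; i < gamma.size(); i++) {
    // The max() is the "safety factor" in the quadratic approximation.
    double max_term = std::max(gamma[i], total * w[i]);
    out.linear.push_back(gamma[i] - total * w[i] + max_term * logw_unnorm[i]);
    out.quadratic.push_back(max_term);
  }
  return out;
}
```

For counts (3, 1), weights (0.5, 0.5) and zero log-weights, this gives linear coefficients (1, -1) and quadratic coefficients (3, 2).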

◆ GetPriorAuxf()

double GetPriorAuxf	(	const VectorBase< double > &	mean,
		const SpMatrix< double > *	var = `NULL`
	)		const

Returns the prior-related part of the log-likelihood objective function.

Note: if var != NULL, this quantity is a *probability*, otherwise it is a likelihood (and the corresponding probability is zero).

Definition at line 381 of file ivector-extractor.cc.

References VectorBase< Real >::Dim(), kaldi::GetLogDetNoFailure(), IvectorExtractor::IvectorDim(), KALDI_ASSERT, M_LOG_2PI, PackedMatrix< Real >::NumRows(), IvectorExtractor::prior_offset_, SpMatrix< Real >::Trace(), and kaldi::VecVec().

Referenced by IvectorExtractor::GetAuxf().

                                        {
   KALDI_ASSERT(mean.Dim() == IvectorDim());
 
   Vector<double> offset(mean);
   offset(0) -= prior_offset_; // The mean of the prior distribution
   // may only be nonzero in the first dimension.  Now, "offset" is the
   // offset of ivector from the prior's mean.
 
 
   if (var == NULL) {
     // The log-determinant of the variance of the prior distribution is zero,
     // since it's the unit matrix.
     return -0.5 * (VecVec(offset, offset) + IvectorDim()*M_LOG_2PI);
   } else {
     // The mean-related part of the answer will be
     // -0.5 * (VecVec(offset, offset), just like above.
     // The variance-related part will be
     //  \int p(x) . -0.5 (x^T I x - x^T var^{-1} x  + logdet(I) - logdet(var))   dx
     // and using the fact that x is distributed with variance "var", this is:
     //= \int p(x) . -0.5 (x^T I x - x^T var^{-1} x  + logdet(I) - logdet(var))   dx
     // = -0.5 ( trace(var I) - trace(var^{-1} var) + 0.0 - logdet(var))
     // = -0.5 ( trace(var) - dim(var) - logdet(var))
 
     KALDI_ASSERT(var->NumRows() == IvectorDim());
     return -0.5 * (VecVec(offset, offset) + var->Trace() -
                    IvectorDim() - GetLogDetNoFailure(*var));
   }
 }
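
Both branches above can be mirrored in a short standalone sketch. To stay self-contained it assumes a diagonal variance, passed as an optional pointer to its diagonal; that restriction and the name `PriorAuxf` are assumptions of this example, whereas the code above accepts a full symmetric matrix.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Prior auxf for a unit-variance prior with mean (prior_offset, 0, ..., 0).
// var_diag == nullptr means a point estimate of the iVector.
double PriorAuxf(const std::vector<double> &mean, double prior_offset,
                 const std::vector<double> *var_diag = nullptr) {
  const double kLog2Pi = 1.8378770664093453;  // log(2 * pi)
  double dim = static_cast<double>(mean.size());
  double offset_sq = 0.0;
  for (std::size_t d = 0; d < mean.size(); d++) {
    double off = mean[d] - (d == 0 ? prior_offset : 0.0);
    offset_sq += off * off;
  }
  if (var_diag == nullptr) return -0.5 * (offset_sq + dim * kLog2Pi);
  assert(var_diag->size() == mean.size());
  double trace = 0.0, logdet = 0.0;
  for (double v : *var_diag) {
    assert(v > 0.0);
    trace += v;
    logdet += std::log(v);
  }
  return -0.5 * (offset_sq + trace - dim - logdet);
}
```

When the distribution equals the prior itself (mean at the offset, unit variance), the second branch evaluates to zero, as expected of a negated KL divergence between identical Gaussians.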

◆ InvertWithFlooring()

void InvertWithFlooring	(	const SpMatrix< double > &	quadratic_term,
		SpMatrix< double > *	var
	)

static private

Definition at line 49 of file ivector-extractor.cc.

References SpMatrix< Real >::AddMat2Vec(), VectorBase< Real >::ApplyFloor(), SpMatrix< Real >::Eig(), VectorBase< Real >::InvertElements(), kaldi::kNoTrans, and PackedMatrix< Real >::NumRows().

Referenced by IvectorExtractor::GetIvectorDistribution(), and IvectorExtractorStats::IvectorVarianceDiagnostic().

                                                                  {
   SpMatrix<double> dbl_var(inverse_var);
   int32 dim = inverse_var.NumRows();
   Vector<double> s(dim);
   Matrix<double> P(dim, dim);
   // Solve the symmetric eigenvalue problem, inverse_var = P diag(s) P^T.
   inverse_var.Eig(&s, &P);
   s.ApplyFloor(1.0);
   s.InvertElements();
   var->AddMat2Vec(1.0, P, kNoTrans, s, 0.0);
 }
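
The idea of flooring eigenvalues before inverting can be shown on a 2 x 2 symmetric matrix, where the eigendecomposition has a closed form. This is a sketch under that restriction: `SymMat2` and `InvertWithFloor2x2` are hypothetical names, and a single Jacobi rotation stands in for the general `Eig()` call used above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Symmetric 2 x 2 matrix [[a, b], [b, c]].
struct SymMat2 { double a, b, c; };

// Returns P diag(1 / max(s, floor_val)) P^T, where q = P diag(s) P^T.
SymMat2 InvertWithFloor2x2(const SymMat2 &q, double floor_val = 1.0) {
  assert(floor_val > 0.0);
  // Rotation angle that diagonalizes q.
  double theta = 0.5 * std::atan2(2.0 * q.b, q.a - q.c);
  double cs = std::cos(theta), sn = std::sin(theta);
  // Eigenvalues for the eigenvectors (cs, sn) and (-sn, cs).
  double l1 = q.a * cs * cs + 2.0 * q.b * cs * sn + q.c * sn * sn;
  double l2 = q.a * sn * sn - 2.0 * q.b * cs * sn + q.c * cs * cs;
  // Floor, then invert, each eigenvalue.
  double i1 = 1.0 / std::max(l1, floor_val);
  double i2 = 1.0 / std::max(l2, floor_val);
  // Reassemble i1 * v1 v1^T + i2 * v2 v2^T.
  return SymMat2{i1 * cs * cs + i2 * sn * sn,
                 (i1 - i2) * cs * sn,
                 i1 * sn * sn + i2 * cs * cs};
}
```

With a floor of 1.0, the input [[4, 0], [0, 0.25]] yields [[0.25, 0], [0, 1]]: the small eigenvalue is raised to 1 before inversion, so the resulting variance never exceeds the prior's unit variance in any direction.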

◆ IvectorDependentWeights()

bool IvectorDependentWeights ( ) const

inline

Definition at line 245 of file ivector-extractor.h.

References rnnlm::i.

Referenced by OnlineIvectorEstimationStats::AccStats(), IvectorExtractorStats::CheckDims(), IvectorExtractorStats::CommitStatsForUtterance(), IvectorExtractor::GetAcousticAuxfWeight(), IvectorExtractor::GetIvectorDistribution(), IvectorExtractor::GetIvectorDistWeight(), IvectorExtractorStats::GetOrthogonalIvectorTransform(), IvectorExtractorStats::IvectorExtractorStats(), kaldi::TestIvectorExtraction(), IvectorExtractor::TransformIvectors(), and IvectorExtractorStats::Update().

bool IvectorDependentWeights() const { return w_.NumRows() != 0; }

◆ IvectorDim()

int32 IvectorDim ( ) const

Definition at line 33 of file ivector-extractor.cc.

References IvectorExtractor::M_.

                                          {
   if (M_.empty()) { return 0.0; }
   else { return M_[0].NumCols(); }
 }

◆ NumGauss()

int32 NumGauss ( ) const

Definition at line 38 of file ivector-extractor.cc.

References IvectorExtractor::M_.

                                        {
   return static_cast<int32>(M_.size());
 }

◆ PriorOffset()

double PriorOffset ( ) const

inline

The distribution over iVectors, in our formulation, is not centered at zero; its first dimension has a nonzero offset.

This function returns that offset.

Definition at line 159 of file ivector-extractor.h.

Referenced by kaldi::EstimateIvectorsOnline(), main(), OnlineIvectorFeature::OnlineIvectorFeature(), IvectorExtractTask::operator()(), kaldi::RunPerSpeaker(), kaldi::TestIvectorExtraction(), and IvectorExtractTask::~IvectorExtractTask().

{ return prior_offset_; }


◆ Read()

void Read	(	std::istream &	is,
		bool	binary
	)

Definition at line 828 of file ivector-extractor.cc.

References kaldi::ExpectToken(), rnnlm::i, KALDI_ASSERT, OnlineIvectorEstimationStats::prior_offset_, OnlineIvectorEstimationStats::Read(), and kaldi::ReadBasicType().

Referenced by kaldi::TestIvectorExtractorIO().

                                                        {
   ExpectToken(is, binary, "<IvectorExtractor>");
   ExpectToken(is, binary, "<w>");
   w_.Read(is, binary);
   ExpectToken(is, binary, "<w_vec>");
   w_vec_.Read(is, binary);
   ExpectToken(is, binary, "<M>");
   int32 size;
   ReadBasicType(is, binary, &size);
   KALDI_ASSERT(size > 0);
   M_.resize(size);
   for (int32 i = 0; i < size; i++)
     M_[i].Read(is, binary);
   ExpectToken(is, binary, "<SigmaInv>");
   Sigma_inv_.resize(size);
   for (int32 i = 0; i < size; i++)
     Sigma_inv_[i].Read(is, binary);
   ExpectToken(is, binary, "<IvectorOffset>");
   ReadBasicType(is, binary, &prior_offset_);
   ExpectToken(is, binary, "</IvectorExtractor>");
   ComputeDerivedVars();
 }

◆ TransformIvectors()

void TransformIvectors	(	const MatrixBase< double > &	T,
		double	new_prior_offset
	)

protected

Definition at line 523 of file ivector-extractor.cc.

References MatrixBase< Real >::AddMatMat(), rnnlm::i, MatrixBase< Real >::Invert(), IvectorExtractor::IvectorDependentWeights(), KALDI_LOG, kaldi::kNoTrans, IvectorExtractor::M_, IvectorExtractor::NumGauss(), IvectorExtractor::prior_offset_, and IvectorExtractor::w_.

Referenced by IvectorExtractorStats::UpdatePrior().

                                                                   {
   Matrix<double> Tinv(T);
   Tinv.Invert();
   // w <-- w Tinv.  (construct temporary copy with Matrix<double>(w))
   if (IvectorDependentWeights())
     w_.AddMatMat(1.0, Matrix<double>(w_), kNoTrans, Tinv, kNoTrans, 0.0);
   // next: M_i <-- M_i Tinv.  (construct temporary copy with Matrix<double>(M_[i]))
   for (int32 i = 0; i < NumGauss(); i++)
     M_[i].AddMatMat(1.0, Matrix<double>(M_[i]), kNoTrans, Tinv, kNoTrans, 0.0);
   KALDI_LOG << "Setting iVector prior offset to " << new_prior_offset;
   prior_offset_ = new_prior_offset;
 }
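
One way to read the update `M_i <-- M_i Tinv` is that it compensates for a change of basis in iVector space. The sketch below assumes that `T` is the transform applied to the iVectors themselves (the listing does not state this), and checks on a small example that the projected mean `M_i x` is unchanged when `x` becomes `T x` and `M_i` becomes `M_i T^{-1}`. The helper names are hypothetical and the 2x2 inverse is written out by hand; nothing here is Kaldi code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Closed-form inverse of a 2x2 matrix (hypothetical helper).
Mat Invert2x2(const Mat &T) {
  const double det = T[0][0] * T[1][1] - T[0][1] * T[1][0];
  assert(std::fabs(det) > 1e-12);
  return {{T[1][1] / det, -T[0][1] / det},
          {-T[1][0] / det, T[0][0] / det}};
}

// Plain matrix product C = A * B.
Mat MatMul(const Mat &A, const Mat &B) {
  assert(!A.empty() && !B.empty() && A[0].size() == B.size());
  Mat C(A.size(), Vec(B[0].size(), 0.0));
  for (std::size_t i = 0; i < A.size(); i++)
    for (std::size_t j = 0; j < B[0].size(); j++)
      for (std::size_t k = 0; k < B.size(); k++)
        C[i][j] += A[i][k] * B[k][j];
  return C;
}

// Matrix-vector product y = A * x.
Vec MatVec(const Mat &A, const Vec &x) {
  Vec y(A.size(), 0.0);
  for (std::size_t i = 0; i < A.size(); i++) {
    assert(A[i].size() == x.size());
    for (std::size_t k = 0; k < x.size(); k++) y[i] += A[i][k] * x[k];
  }
  return y;
}
```

For example, with a 3x2 matrix `M`, comparing `MatVec(M, x)` against `MatVec(MatMul(M, Invert2x2(T)), MatVec(T, x))` should give the same vector up to rounding.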

◆ Write()

void Write	(	std::ostream &	os,
		bool	binary
	)		const

Definition at line 807 of file ivector-extractor.cc.

References rnnlm::i, KALDI_ASSERT, OnlineIvectorEstimationStats::prior_offset_, OnlineIvectorEstimationStats::Write(), kaldi::WriteBasicType(), and kaldi::WriteToken().

Referenced by kaldi::TestIvectorExtractorIO().

                                                               {
   WriteToken(os, binary, "<IvectorExtractor>");
   WriteToken(os, binary, "<w>");
   w_.Write(os, binary);
   WriteToken(os, binary, "<w_vec>");
   w_vec_.Write(os, binary);
   WriteToken(os, binary, "<M>");
   int32 size = M_.size();
   WriteBasicType(os, binary, size);
   for (int32 i = 0; i < size; i++)
     M_[i].Write(os, binary);
   WriteToken(os, binary, "<SigmaInv>");
   KALDI_ASSERT(size == static_cast<int32>(Sigma_inv_.size()));
   for (int32 i = 0; i < size; i++)
     Sigma_inv_[i].Write(os, binary);
   WriteToken(os, binary, "<IvectorOffset>");
   WriteBasicType(os, binary, prior_offset_);
   WriteToken(os, binary, "</IvectorExtractor>");
 }
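
Read() and Write() mirror each other: every token written is expected in the same order on input, and the element count is written before the per-Gaussian data. The following is a minimal text-mode sketch of that pattern using only the standard library. `ToyExtractor`, `PutToken`, and `NeedToken` are hypothetical stand-ins, not Kaldi's `WriteToken`/`ExpectToken`, and the sketch does not attempt binary mode.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token helpers (text mode only).
void PutToken(std::ostream &os, const std::string &token) {
  os << token << ' ';
}

void NeedToken(std::istream &is, const std::string &expected) {
  std::string got;
  if (!(is >> got) || got != expected)
    throw std::runtime_error("expected " + expected + ", got " + got);
}

// Toy object with a sized list and one scalar, written between tokens.
struct ToyExtractor {
  std::vector<double> weights;  // stands in for the per-Gaussian data
  double prior_offset = 0.0;

  void Write(std::ostream &os) const {
    PutToken(os, "<ToyExtractor>");
    PutToken(os, "<Weights>");
    os << weights.size() << ' ';  // size first, then the elements
    for (double w : weights) os << w << ' ';
    PutToken(os, "<IvectorOffset>");
    os << prior_offset << ' ';
    PutToken(os, "</ToyExtractor>");
  }

  void Read(std::istream &is) {
    NeedToken(is, "<ToyExtractor>");
    NeedToken(is, "<Weights>");
    std::size_t size = 0;
    is >> size;
    if (!is || size == 0) throw std::runtime_error("bad size");
    weights.resize(size);
    for (double &w : weights) is >> w;
    NeedToken(is, "<IvectorOffset>");
    is >> prior_offset;
    NeedToken(is, "</ToyExtractor>");
  }
};
```

Writing an object to a `std::stringstream` and reading it back into a fresh object should reproduce the fields, while a stream that starts with an unexpected token makes `Read` throw.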

Friends And Related Function Documentation

◆ IvectorExtractorComputeDerivedVarsClass

friend class IvectorExtractorComputeDerivedVarsClass

friend

Definition at line 254 of file ivector-extractor.h.

◆ IvectorExtractorStats

friend class IvectorExtractorStats

friend

Definition at line 138 of file ivector-extractor.h.

◆ OnlineIvectorEstimationStats

friend class OnlineIvectorEstimationStats

friend

Definition at line 139 of file ivector-extractor.h.

Member Data Documentation

◆ gconsts_

Vector<double> gconsts_

protected

The constant term in the log-likelihood of each Gaussian (not counting any weight).

Definition at line 291 of file ivector-extractor.h.

Referenced by IvectorExtractor::ComputeDerivedVars(), and IvectorExtractor::GetAcousticAuxfGconst().

◆ M_

std::vector<Matrix<double> > M_

protected

Ivector-subspace projection matrices, dimension is [I][D][S].

The i'th matrix projects from ivector-space to the mean of the i'th Gaussian. There is no mean offset to add; we deal with it by having a prior with a nonzero mean.

Definition at line 276 of file ivector-extractor.h.

Referenced by IvectorExtractor::ComputeDerivedVars(), IvectorExtractor::FeatDim(), IvectorExtractor::GetAcousticAuxfMean(), IvectorExtractor::IvectorDim(), IvectorExtractor::IvectorExtractor(), IvectorExtractorStats::IvectorVarianceDiagnostic(), IvectorExtractor::NumGauss(), IvectorExtractor::TransformIvectors(), IvectorExtractorStats::UpdateProjection(), and IvectorExtractorStats::UpdateVariances().
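
To illustrate how a prior with a nonzero mean can replace an explicit mean offset, here is a small sketch in plain C++. It assumes the prior mean is `[prior_offset, 0, ..., 0]`, as suggested by the description of `prior_offset_`, so that at the prior mean the projected Gaussian mean is `prior_offset` times the first column of `M_i`. `ProjectToMean` and `PriorMeanIvector` are hypothetical names, not Kaldi functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Gaussian mean for one component: mean_i = M_i * ivector, with M_i of size D x S.
Vec ProjectToMean(const Mat &M_i, const Vec &ivector) {
  Vec mean(M_i.size(), 0.0);
  for (std::size_t d = 0; d < M_i.size(); d++) {
    assert(M_i[d].size() == ivector.size());
    for (std::size_t s = 0; s < ivector.size(); s++)
      mean[d] += M_i[d][s] * ivector[s];
  }
  return mean;
}

// Assumed prior mean: only the first dimension is offset from zero.
Vec PriorMeanIvector(std::size_t ivector_dim, double prior_offset) {
  assert(ivector_dim > 0);
  Vec x(ivector_dim, 0.0);
  x[0] = prior_offset;
  return x;
}
```

Under this reading, the first column of each `M_i` carries the speaker-independent mean (scaled by `1 / prior_offset`), and the remaining columns describe how the mean moves as the iVector departs from the prior mean.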

◆ prior_offset_

double prior_offset_

protected

The first dimension of the prior over the ivector has a nonzero offset, so the prior mean is not zero.

This is used to handle the global offset of the speaker-adapted means in a simple way.

Definition at line 284 of file ivector-extractor.h.

Referenced by OnlineIvectorEstimationStats::AccStats(), OnlineIvectorEstimationStats::DefaultObjf(), OnlineIvectorEstimationStats::GetIvector(), IvectorExtractor::GetIvectorDistPrior(), IvectorExtractor::GetPriorAuxf(), IvectorExtractor::IvectorExtractor(), OnlineIvectorEstimationStats::Read(), OnlineIvectorEstimationStats::Scale(), IvectorExtractor::TransformIvectors(), IvectorExtractorStats::UpdatePrior(), and OnlineIvectorEstimationStats::Write().

◆ Sigma_inv_

std::vector<SpMatrix<double> > Sigma_inv_

protected

Inverse variances of speaker-adapted model, dimension [I][D][D].

Definition at line 279 of file ivector-extractor.h.

Referenced by IvectorExtractor::ComputeDerivedVars(), IvectorExtractor::GetAcousticAuxfMean(), IvectorExtractor::GetAcousticAuxfVariance(), IvectorExtractor::IvectorExtractor(), IvectorExtractorStats::IvectorVarianceDiagnostic(), IvectorExtractorStats::UpdateProjection(), and IvectorExtractorStats::UpdateVariances().

◆ Sigma_inv_M_

std::vector<Matrix<double> > Sigma_inv_M_

protected

The product of Sigma_inv_[i] with M_[i].

Definition at line 301 of file ivector-extractor.h.

Referenced by OnlineIvectorEstimationStats::AccStats(), IvectorExtractor::ComputeDerivedVars(), and IvectorExtractor::GetIvectorDistMean().

◆ U_

Matrix<double> U_

protected

U_i = M_i^T \Sigma_i^{-1} M_i is a quantity that comes up in ivector estimation.

This is conceptually a std::vector<SpMatrix<double> >, but we store the packed-data in the rows of a matrix, which gives us an efficiency improvement (we can use matrix-multiplies).

Definition at line 298 of file ivector-extractor.h.

Referenced by OnlineIvectorEstimationStats::AccStats(), IvectorExtractor::ComputeDerivedVars(), IvectorExtractor::GetAcousticAuxfMean(), IvectorExtractor::GetIvectorDistMean(), and IvectorExtractorStats::GetOrthogonalIvectorTransform().
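
As a rough illustration of the storage scheme described above, the sketch below computes U_i = M_i^T \Sigma_i^{-1} M_i for one Gaussian and flattens its lower triangle into a single row. The row-by-row lower-triangle ordering is an assumption about the packed layout, and `ComputeU` and `PackLowerTriangle` are hypothetical helpers written with nested `std::vector` rather than Kaldi's matrix classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// U = M^T * Sigma_inv * M, where M is D x S and Sigma_inv is D x D symmetric.
Mat ComputeU(const Mat &M, const Mat &Sigma_inv) {
  const std::size_t D = M.size();
  assert(D > 0 && Sigma_inv.size() == D);
  const std::size_t S = M[0].size();
  Mat U(S, Vec(S, 0.0));
  for (std::size_t a = 0; a < S; a++)
    for (std::size_t b = 0; b < S; b++)
      for (std::size_t r = 0; r < D; r++)
        for (std::size_t c = 0; c < D; c++)
          U[a][b] += M[r][a] * Sigma_inv[r][c] * M[c][b];
  return U;
}

// Flattens the lower triangle row by row: (0,0), (1,0), (1,1), (2,0), ...
// (assumed layout). The result has S * (S + 1) / 2 entries.
Vec PackLowerTriangle(const Mat &U) {
  Vec packed;
  for (std::size_t r = 0; r < U.size(); r++)
    for (std::size_t c = 0; c <= r; c++) packed.push_back(U[r][c]);
  return packed;
}
```

With one packed row per Gaussian, a weighted sum of the U_i over Gaussians becomes a single vector-matrix product, which is the kind of efficiency gain the description refers to.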

◆ w_

Matrix<double> w_

protected

Weight projection vectors, if used. Dimension is [I][S].

Definition at line 264 of file ivector-extractor.h.

Referenced by IvectorExtractorStats::CommitStatsForWPoint(), IvectorExtractor::GetAcousticAuxfWeight(), IvectorExtractor::GetIvectorDistWeight(), IvectorExtractor::IvectorExtractor(), IvectorExtractor::TransformIvectors(), and IvectorExtractorStats::UpdateWeight().

◆ w_vec_

Vector<double> w_vec_

protected

If we are not using weight-projection vectors, stores the Gaussian mixture weights from the UBM.

This does not affect the iVector; it is only useful as a way of making sure the log-probs are comparable between systems with and without weight projection matrices.

Definition at line 270 of file ivector-extractor.h.

Referenced by IvectorExtractor::GetAcousticAuxfWeight(), IvectorExtractorStats::GetOrthogonalIvectorTransform(), and IvectorExtractor::IvectorExtractor().

The documentation for this class was generated from the following files:

ivector/ivector-extractor.h
ivector/ivector-extractor.cc
