MleSgmm2SpeakerAccs Class Reference

Class for the accumulators required to update the speaker vectors v_s.

#include <estimate-am-sgmm2.h>

Public Member Functions

 MleSgmm2SpeakerAccs (const AmSgmm2 &model, BaseFloat rand_prune_=1.0e-05)
 Initialize the object. Error if speaker subspace not set up.
 
void Clear ()
 Clear the statistics.
 
BaseFloat Accumulate (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, int32 pdf_index, BaseFloat weight, Sgmm2PerSpkDerivedVars *spk_vars)
 Accumulate statistics. Returns per-frame log-likelihood.
 
BaseFloat AccumulateFromPosteriors (const AmSgmm2 &model, const Sgmm2PerFrameDerivedVars &frame_vars, const Matrix< BaseFloat > &posteriors, int32 pdf_index, Sgmm2PerSpkDerivedVars *spk_vars)
 Accumulate statistics, given posteriors.
 
void Update (const AmSgmm2 &model, BaseFloat min_count, Vector< BaseFloat > *v_s, BaseFloat *objf_impr_out, BaseFloat *count_out)
 Update speaker vector.
 

Private Member Functions

void UpdateNoU (Vector< BaseFloat > *v_s, BaseFloat *objf_impr_out, BaseFloat *count_out)
 
void UpdateWithU (const AmSgmm2 &model, Vector< BaseFloat > *v_s, BaseFloat *objf_impr_out, BaseFloat *count_out)
 

Private Attributes

Vector< double > y_s_
 Statistics for speaker adaptation (vectors), stored per-speaker.
 
Vector< double > gamma_s_
 gamma_{i}^{(s)}. Per-speaker counts for each Gaussian. Dimension is [I].
 
Vector< double > a_s_
 a_i^{(s)}. For SSGMM.
 
std::vector< SpMatrix< double > > H_spk_
 This variable does not change per speaker; it relates only to the speaker subspace.
 
std::vector< Matrix< double > > NtransSigmaInv_
 N_i^T \Sigma_i^{-1}. Needed for y^{(s)}.
 
BaseFloat rand_prune_
 Small constant used to randomly prune tiny posteriors.
 

Detailed Description

Class for the accumulators required to update the speaker vectors v_s.

Note: if you have multiple speakers you will want to initialize this just once and call Clear() after you're done with each speaker, rather than creating a new object for each speaker, since the initialization function does nontrivial work.

Definition at line 354 of file estimate-am-sgmm2.h.
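
The reuse pattern described in the note above can be sketched as follows. This is a self-contained toy and not Kaldi code: `ToyAccs`, `setup_calls`, and `ProcessSpeakers` are hypothetical names standing in for an accumulator whose constructor does expensive, model-dependent work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an accumulator with a costly constructor.
// It is NOT the Kaldi class; it only illustrates the construct-once,
// Clear()-per-speaker pattern.
struct ToyAccs {
  static int setup_calls;     // counts how often the costly setup ran
  std::vector<double> stats;  // per-speaker statistics

  explicit ToyAccs(std::size_t dim) : stats(dim, 0.0) {
    ++setup_calls;  // stands in for the model-dependent precomputation
  }
  void Accumulate(const std::vector<double> &x) {
    assert(x.size() == stats.size());
    for (std::size_t d = 0; d < x.size(); ++d) stats[d] += x[d];
  }
  void Clear() { stats.assign(stats.size(), 0.0); }
};
int ToyAccs::setup_calls = 0;

// speakers[s] holds the frames of speaker s; dim must be at least 1.
// Returns, for each speaker, the accumulated first statistic.
std::vector<double> ProcessSpeakers(
    const std::vector<std::vector<std::vector<double>>> &speakers,
    std::size_t dim) {
  assert(dim > 0);
  ToyAccs accs(dim);  // constructed once, outside the speaker loop
  std::vector<double> first_stat;
  for (const auto &frames : speakers) {
    for (const auto &x : frames) accs.Accumulate(x);
    first_stat.push_back(accs.stats[0]);  // an update step would read stats here
    accs.Clear();                         // reset before the next speaker
  }
  return first_stat;
}
```

With the real class, the same shape would apply: one object built from the model, then accumulate, update, and Clear() inside the per-speaker loop.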

Constructor & Destructor Documentation

◆ MleSgmm2SpeakerAccs()

MleSgmm2SpeakerAccs ( const AmSgmm2 &  model,
BaseFloat  rand_prune_ = 1.0e-05 
)

Initialize the object. Error if speaker subspace not set up.

Definition at line 1713 of file estimate-am-sgmm2.cc.

References MleSgmm2SpeakerAccs::a_s_, MleSgmm2SpeakerAccs::gamma_s_, AmSgmm2::GetNtransSigmaInv(), MleSgmm2SpeakerAccs::H_spk_, AmSgmm2::HasSpeakerDependentWeights(), rnnlm::i, KALDI_ASSERT, kaldi::kTrans, AmSgmm2::N_, MleSgmm2SpeakerAccs::NtransSigmaInv_, AmSgmm2::NumGauss(), Vector< Real >::Resize(), AmSgmm2::SigmaInv_, AmSgmm2::SpkSpaceDim(), and MleSgmm2SpeakerAccs::y_s_.

    : rand_prune_(prune) {
  KALDI_ASSERT(model.SpkSpaceDim() != 0);
  H_spk_.resize(model.NumGauss());
  for (int32 i = 0; i < model.NumGauss(); i++) {
    // Eq. (82): H_{i}^{spk} = N_{i}^T \Sigma_{i}^{-1} N_{i}
    H_spk_[i].Resize(model.SpkSpaceDim());
    H_spk_[i].AddMat2Sp(1.0, Matrix<double>(model.N_[i]),
                        kTrans, SpMatrix<double>(model.SigmaInv_[i]), 0.0);
  }

  model.GetNtransSigmaInv(&NtransSigmaInv_);

  gamma_s_.Resize(model.NumGauss());
  y_s_.Resize(model.SpkSpaceDim());
  if (model.HasSpeakerDependentWeights())
    a_s_.Resize(model.NumGauss());
}
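
The per-Gaussian precomputation in the constructor, Eq. (82), is a plain triple product. The following is a minimal standalone sketch of that product using `std::vector`; the alias `Mat` and the function `NtSigmaInvN` are hypothetical names introduced here for illustration, and the naive loops are not how Kaldi's `AddMat2Sp` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;  // row-major dense matrix

// Computes H = N^T * SigmaInv * N, where N is D x T and SigmaInv is D x D.
// The result is T x T, and it is symmetric whenever SigmaInv is symmetric.
Mat NtSigmaInvN(const Mat &N, const Mat &SigmaInv) {
  const std::size_t D = N.size();
  assert(D > 0 && SigmaInv.size() == D);
  const std::size_t T = N[0].size();
  Mat H(T, std::vector<double>(T, 0.0));
  for (std::size_t a = 0; a < T; ++a)
    for (std::size_t b = 0; b < T; ++b)
      for (std::size_t r = 0; r < D; ++r)
        for (std::size_t c = 0; c < D; ++c)
          H[a][b] += N[r][a] * SigmaInv[r][c] * N[c][b];
  return H;
}
```

Because the result depends only on the model (N_i and the inverse covariances), it can be computed once per Gaussian and reused for every speaker, which is why the constructor does this work up front.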

Member Function Documentation

◆ Accumulate()

BaseFloat Accumulate ( const AmSgmm2 &  model,
const Sgmm2PerFrameDerivedVars &  frame_vars,
int32  pdf_index,
BaseFloat  weight,
Sgmm2PerSpkDerivedVars *  spk_vars 
)

Accumulate statistics. Returns per-frame log-likelihood.

Definition at line 1740 of file estimate-am-sgmm2.cc.

References MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), AmSgmm2::ComponentPosteriors(), and MatrixBase< Real >::Scale().

Referenced by kaldi::AccumulateForUtterance().

{
  // Calculate Gaussian posteriors and collect statistics
  Matrix<BaseFloat> posteriors;
  BaseFloat log_like = model.ComponentPosteriors(frame_vars, j2, spk_vars,
                                                 &posteriors);
  posteriors.Scale(weight);
  AccumulateFromPosteriors(model, frame_vars, posteriors, j2, spk_vars);
  return log_like;
}

◆ AccumulateFromPosteriors()

BaseFloat AccumulateFromPosteriors ( const AmSgmm2 &  model,
const Sgmm2PerFrameDerivedVars &  frame_vars,
const Matrix< BaseFloat > &  posteriors,
int32  pdf_index,
Sgmm2PerSpkDerivedVars *  spk_vars 
)

Accumulate statistics, given posteriors.

Returns total count accumulated, which may differ from posteriors.Sum() due to randomized pruning.
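
To see why the returned count can differ from posteriors.Sum(), here is a sketch of one expectation-preserving randomized pruning rule. It is written under the assumption that it mirrors the intent of kaldi::RandPrune; the name `RandPruneSketch` and the use of `std::mt19937` are choices made for this illustration, so consult kaldi-math.h for the actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Expectation-preserving randomized pruning (illustrative sketch).
// Values with |post| >= thresh pass through unchanged. Smaller nonzero
// values become either 0 or +/-thresh, with P(keep) = |post| / thresh,
// so the expected value of the result equals post.
double RandPruneSketch(double post, double thresh, std::mt19937 &rng) {
  assert(thresh >= 0.0);
  if (post == 0.0 || std::fabs(post) >= thresh) return post;
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  const double sign = (post >= 0.0) ? 1.0 : -1.0;
  return sign * (unif(rng) <= std::fabs(post) / thresh ? thresh : 0.0);
}
```

Under this rule, any single frame's pruned posteriors may sum to slightly more or less than the originals, but the totals agree in expectation, and most tiny posteriors are skipped outright.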

Definition at line 1755 of file estimate-am-sgmm2.cc.

References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::AddMatVec(), VectorBase< Real >::AddVec(), VectorBase< Real >::Dim(), AmSgmm2::FeatureDim(), MleSgmm2SpeakerAccs::gamma_s_, AmSgmm2::GetDjms(), AmSgmm2::GetSubstateMean(), Sgmm2PerFrameDerivedVars::gselect, rnnlm::i, KALDI_ASSERT, kaldi::kNoTrans, MleSgmm2SpeakerAccs::NtransSigmaInv_, AmSgmm2::NumSubstatesForPdf(), AmSgmm2::Pdf2Group(), MleSgmm2SpeakerAccs::rand_prune_, kaldi::RandPrune(), AmSgmm2::SpkSpaceDim(), AmSgmm2::w_jmi_, Sgmm2PerFrameDerivedVars::xt, and MleSgmm2SpeakerAccs::y_s_.

Referenced by MleSgmm2SpeakerAccs::Accumulate(), and kaldi::AccumulateForUtterance().

{
  double tot_count = 0.0;
  int32 feature_dim = model.FeatureDim(),
      spk_space_dim = model.SpkSpaceDim();
  KALDI_ASSERT(spk_space_dim != 0);
  const vector<int32> &gselect = frame_vars.gselect;

  // Intermediate variables
  Vector<double> xt_jmi(feature_dim), mu_jmi(feature_dim),
      zt_jmi(spk_space_dim);
  int32 num_substates = model.NumSubstatesForPdf(j2),
      j1 = model.Pdf2Group(j2);
  bool have_spk_dep_weights = (a_s_.Dim() != 0);

  for (int32 m = 0; m < num_substates; m++) {
    BaseFloat gammat_jm = 0.0;
    for (int32 ki = 0; ki < static_cast<int32>(gselect.size()); ki++) {
      int32 i = gselect[ki];
      // Eq. (39): gamma_{jmi}(t) = p (j, m, i|t)
      BaseFloat gammat_jmi = RandPrune(posteriors(ki, m), rand_prune_);
      if (gammat_jmi != 0.0) {
        gammat_jm += gammat_jmi;
        tot_count += gammat_jmi;
        model.GetSubstateMean(j1, m, i, &mu_jmi);
        xt_jmi.CopyFromVec(frame_vars.xt);
        xt_jmi.AddVec(-1.0, mu_jmi);
        // Eq. (48): z_{jmi}(t) = N_{i}^{T} \Sigma_{i}^{-1} x_{jmi}(t)
        zt_jmi.AddMatVec(1.0, NtransSigmaInv_[i], kNoTrans, xt_jmi, 0.0);
        // Eq. (49): \gamma_{i}^{(s)} = \sum_{t\in\Tau(s), j, m} gamma_{jmi}
        gamma_s_(i) += gammat_jmi;
        // Eq. (50): y^{(s)} = \sum_{t, j, m, i} gamma_{jmi}(t) z_{jmi}(t)
        y_s_.AddVec(gammat_jmi, zt_jmi);
      }
    }
    if (have_spk_dep_weights) {
      KALDI_ASSERT(!model.w_jmi_.empty());
      BaseFloat d_jms = model.GetDjms(j1, m, spk_vars);
      if (d_jms == -1.0) d_jms = 1.0;  // Explanation: d_jms is set to -1 when we didn't have
      // speaker vectors in training. We treat this the same as the speaker vector being
      // zero, and d_jms becomes 1 in this case.
      a_s_.AddVec(gammat_jm/d_jms, model.w_jmi_[j1].Row(m));
    }
  }
  return tot_count;
}
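
For reference, the statistics accumulated by this listing can be written out as below. The first three lines restate Eqs. (48) to (50) from the code comments; the last line is read off the `a_s_.AddVec(...)` call and is not a numbered equation in the listing. Here x(t) denotes `frame_vars.xt`, \gamma_{jm}(t) is the sum of \gamma_{jmi}(t) over the selected Gaussians i, and d_{jm}^{(s)} is the value returned by `GetDjms` (replaced by 1 when it is -1).

```latex
\begin{align}
z_{jmi}(t) &= N_i^{T}\,\Sigma_i^{-1}\,\bigl(x(t) - \mu_{jmi}\bigr) \\
\gamma_i^{(s)} &= \sum_{t \in \mathcal{T}(s)} \sum_{j,m} \gamma_{jmi}(t) \\
y^{(s)} &= \sum_{t \in \mathcal{T}(s)} \sum_{j,m,i} \gamma_{jmi}(t)\, z_{jmi}(t) \\
a_i^{(s)} &= \sum_{t \in \mathcal{T}(s)} \sum_{j,m} \frac{\gamma_{jm}(t)}{d_{jm}^{(s)}}\, w_{jmi}
\end{align}
```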

◆ Clear()

void Clear ( )

Clear the statistics.

Definition at line 1733 of file estimate-am-sgmm2.cc.

References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, VectorBase< Real >::SetZero(), and MleSgmm2SpeakerAccs::y_s_.

Referenced by main().

{
  y_s_.SetZero();
  gamma_s_.SetZero();
  if (a_s_.Dim() != 0) a_s_.SetZero();
}

◆ Update()

void Update ( const AmSgmm2 &  model,
BaseFloat  min_count,
Vector< BaseFloat > *  v_s,
BaseFloat *  objf_impr_out,
BaseFloat *  count_out 
)

Update speaker vector.

If v_s was empty, will assume it started as zero and will resize it to the speaker-subspace size.

Definition at line 1805 of file estimate-am-sgmm2.cc.

References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, KALDI_WARN, VectorBase< Real >::Sum(), MleSgmm2SpeakerAccs::UpdateNoU(), and MleSgmm2SpeakerAccs::UpdateWithU().

Referenced by main().

{
  double tot_gamma = gamma_s_.Sum();
  if (tot_gamma < min_count) {
    KALDI_WARN << "Updating speaker vectors, count is " << tot_gamma
               << " < " << min_count << ", not updating.";
    if (objf_impr_out) *objf_impr_out = 0.0;
    if (count_out) *count_out = 0.0;
    return;
  }
  if (a_s_.Dim() == 0)  // No speaker-dependent weights...
    UpdateNoU(v_s, objf_impr_out, count_out);
  else
    UpdateWithU(model, v_s, objf_impr_out, count_out);
}

◆ UpdateNoU()

void UpdateNoU ( Vector< BaseFloat > *  v_s,
BaseFloat *  objf_impr_out,
BaseFloat *  count_out 
)
private

Definition at line 1826 of file estimate-am-sgmm2.cc.

References SpMatrix< Real >::AddSp(), VectorBase< Real >::CopyFromVec(), VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, MleSgmm2SpeakerAccs::H_spk_, rnnlm::i, KALDI_ASSERT, KALDI_LOG, Vector< Real >::Resize(), kaldi::SolveQuadraticProblem(), VectorBase< Real >::Sum(), and MleSgmm2SpeakerAccs::y_s_.

Referenced by MleSgmm2SpeakerAccs::Update().

{
  double tot_gamma = gamma_s_.Sum();
  KALDI_ASSERT(y_s_.Dim() != 0);
  int32 T = y_s_.Dim();  // speaker-subspace dim.
  int32 num_gauss = gamma_s_.Dim();
  if (v_s->Dim() != T) v_s->Resize(T);  // will set it to zero.

  // Eq. (84): H^{(s)} = \sum_{i} \gamma_{i}^{(s)} H_{i}^{spk}
  SpMatrix<double> H_s(T);

  for (int32 i = 0; i < num_gauss; i++)
    H_s.AddSp(gamma_s_(i), H_spk_[i]);

  // Don't make these options to SolveQuadraticProblem configurable...
  // they really don't make a difference at all unless the matrix in
  // question is singular, which wouldn't happen in this case.
  Vector<double> v_s_dbl(*v_s);
  double tot_objf_impr =
      SolveQuadraticProblem(H_s, y_s_, SolverOptions("v_s"), &v_s_dbl);

  v_s->CopyFromVec(v_s_dbl);

  KALDI_LOG << "*Objf impr for speaker vector is " << (tot_objf_impr / tot_gamma)
            << " over " << tot_gamma << " frames.";

  if (objf_impr_out) *objf_impr_out = tot_objf_impr;
  if (count_out) *count_out = tot_gamma;
}
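
The update without speaker-dependent weights amounts to maximizing a quadratic auxiliary function. The derivation below is a sketch under two assumptions: that SolveQuadraticProblem maximizes an objective of the form x^T g - (1/2) x^T H x (the same form that appears in the auxf expression of the UpdateWithU listing), and that H^{(s)} is nonsingular, as the code comment expects. The symbol v_0 for the starting value of the speaker vector is notation introduced here.

```latex
\begin{align}
H^{(s)} &= \sum_i \gamma_i^{(s)} H_i^{spk} \\
Q(v) &= v^{T} y^{(s)} - \tfrac{1}{2}\, v^{T} H^{(s)} v \\
\nabla_v Q = y^{(s)} - H^{(s)} v = 0
  \;&\Longrightarrow\; \hat{v}^{(s)} = \bigl(H^{(s)}\bigr)^{-1} y^{(s)} \\
\text{objective improvement} &= Q(\hat{v}^{(s)}) - Q(v_0)
\end{align}
```

When v_0 = 0, the improvement reduces to (1/2) y^{(s)T} (H^{(s)})^{-1} y^{(s)}.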

◆ UpdateWithU()

void UpdateWithU ( const AmSgmm2 &  model,
Vector< BaseFloat > *  v_s,
BaseFloat *  objf_impr_out,
BaseFloat *  count_out 
)
private

Definition at line 1858 of file estimate-am-sgmm2.cc.

References MleSgmm2SpeakerAccs::a_s_, VectorBase< Real >::Add(), VectorBase< Real >::AddMatVec(), SpMatrix< Real >::AddSp(), VectorBase< Real >::AddSpVec(), VectorBase< Real >::AddVec(), SpMatrix< Real >::AddVec2(), VectorBase< Real >::ApplyExp(), VectorBase< Real >::ApplyLog(), kaldi::ApproxEqual(), VectorBase< Real >::CopyFromVec(), VectorBase< Real >::Dim(), MleSgmm2SpeakerAccs::gamma_s_, MleSgmm2SpeakerAccs::H_spk_, rnnlm::i, KALDI_ASSERT, KALDI_LOG, KALDI_WARN, kaldi::kNoTrans, VectorBase< Real >::LogSumExp(), Vector< Real >::Resize(), MatrixBase< Real >::Row(), VectorBase< Real >::Scale(), kaldi::SolveQuadraticProblem(), VectorBase< Real >::Sum(), AmSgmm2::u_, kaldi::VecSpVec(), kaldi::VecVec(), and MleSgmm2SpeakerAccs::y_s_.

Referenced by MleSgmm2SpeakerAccs::Update().

{
  double tot_gamma = gamma_s_.Sum();
  KALDI_ASSERT(y_s_.Dim() != 0);
  int32 T = y_s_.Dim();  // speaker-subspace dim.
  int32 num_gauss = gamma_s_.Dim();
  if (v_s_ptr->Dim() != T) v_s_ptr->Resize(T);  // will set it to zero.

  // Eq. (84): H^{(s)} = \sum_{i} \gamma_{i}^{(s)} H_{i}^{spk}
  SpMatrix<double> H_s(T);

  for (int32 i = 0; i < num_gauss; i++)
    H_s.AddSp(gamma_s_(i), H_spk_[i]);

  Vector<double> v_s(*v_s_ptr);
  int32 num_iters = 5,  // don't set this to 1, as we discard last iter.
      num_backtracks = 0,
      max_backtracks = 10;
  Vector<double> auxf(num_iters);
  Matrix<double> v_s_per_iter(num_iters, T);
  // The update for v^{(s)} is the one described in the technical report
  // section 5.1 (eq. 33 and below).

  for (int32 iter = 0; iter < num_iters; iter++) {  // converges very fast,
    // and each iteration is fast, so don't need to make this configurable.
    v_s_per_iter.Row(iter).CopyFromVec(v_s);

    SpMatrix<double> F(H_s);  // the 2nd-order quadratic term on this iteration...
    // F^{(p)} in the techreport.
    Vector<double> g(y_s_);  // g^{(p)} in the techreport.
    g.AddSpVec(-1.0, H_s, v_s, 1.0);
    Vector<double> log_b_is(num_gauss);  // b_i^{(s)}, indexed by i.
    log_b_is.AddMatVec(1.0, Matrix<double>(model.u_), kNoTrans, v_s, 0.0);
    Vector<double> tilde_w_is(log_b_is);
    Vector<double> log_a_s_(a_s_);
    log_a_s_.ApplyLog();
    tilde_w_is.AddVec(1.0, log_a_s_);
    tilde_w_is.Add(-1.0 * tilde_w_is.LogSumExp());  // normalize.
    // currently tilde_w_is is in log form.
    auxf(iter) = VecVec(v_s, y_s_) - 0.5 * VecSpVec(v_s, H_s, v_s)
        + VecVec(gamma_s_, tilde_w_is);  // "new" term (weights)

    if (iter > 0 && auxf(iter) < auxf(iter-1) &&
        !ApproxEqual(auxf(iter), auxf(iter-1))) {  // auxf did not improve.
      // backtrack halfway, and do this iteration again.
      KALDI_WARN << "Backtracking in speaker vector update, on iter "
                 << iter << ", auxfs are " << auxf(iter-1) << " -> "
                 << auxf(iter);
      v_s.Scale(0.5);
      v_s.AddVec(0.5, v_s_per_iter.Row(iter-1));
      if (++num_backtracks >= max_backtracks) {
        KALDI_WARN << "Backtracked " << max_backtracks
                   << " times in speaker-vector update.";
        // backtrack all the way, and terminate:
        v_s_per_iter.Row(num_iters-1).CopyFromVec(v_s_per_iter.Row(iter-1));
        // the following statement ensures we will get
        // the appropriate auxiliary-function.
        auxf(num_iters-1) = auxf(iter-1);
        break;
      }
      iter--;
    }
    tilde_w_is.ApplyExp();
    for (int32 i = 0; i < num_gauss; i++) {
      g.AddVec(gamma_s_(i) - tot_gamma * tilde_w_is(i), model.u_.Row(i));
      F.AddVec2(tot_gamma * tilde_w_is(i), model.u_.Row(i));
    }
    Vector<double> delta(v_s.Dim());
    SolveQuadraticProblem(F, g, SolverOptions("v_s"), &delta);
    v_s.AddVec(1.0, delta);
  }
  // so that we only accept things where the auxf has been checked, we
  // actually take the penultimate speaker-vector. --> don't set
  // num-iters = 1.
  v_s_ptr->CopyFromVec(v_s_per_iter.Row(num_iters-1));

  double auxf_change = auxf(num_iters-1) - auxf(0);
  KALDI_LOG << "*Objf impr for speaker vector is " << (auxf_change / tot_gamma)
            << " per frame, over " << tot_gamma << " frames.";

  if (objf_impr_out) *objf_impr_out = auxf_change;
  if (count_out) *count_out = tot_gamma;
}
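
The quantities computed in each iteration of this listing can be summarized as below. The formulas are read directly off the code rather than quoted from the technical report, so the notation is an assumption of this summary: u_i is row i of `model.u_`, \gamma^{(s)} = \sum_i \gamma_i^{(s)} is `tot_gamma`, and v_prev is the vector saved on the previous iteration.

```latex
\begin{align}
\tilde{w}_i &= \frac{a_i^{(s)} \exp(u_i^{T} v)}{\sum_k a_k^{(s)} \exp(u_k^{T} v)} \\
Q(v) &= v^{T} y^{(s)} - \tfrac{1}{2}\, v^{T} H^{(s)} v
        + \sum_i \gamma_i^{(s)} \log \tilde{w}_i \\
g &= y^{(s)} - H^{(s)} v
     + \sum_i \bigl(\gamma_i^{(s)} - \gamma^{(s)} \tilde{w}_i\bigr)\, u_i \\
F &= H^{(s)} + \gamma^{(s)} \sum_i \tilde{w}_i\, u_i u_i^{T} \\
\Delta &= \arg\max_{\Delta}\; \Delta^{T} g - \tfrac{1}{2}\, \Delta^{T} F \Delta,
  \qquad v \leftarrow v + \Delta \\
\text{backtrack if } Q \text{ decreased:}\quad
  v &\leftarrow \tfrac{1}{2}\, v + \tfrac{1}{2}\, v_{\text{prev}}
\end{align}
```

Since Q is evaluated at the start of each iteration, the last step taken is never checked, which is why the function returns the vector stored at the final checked iteration rather than the final value of v.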

Member Data Documentation

◆ a_s_

◆ gamma_s_

◆ H_spk_

std::vector< SpMatrix<double> > H_spk_
private

This variable does not change per speaker; it relates only to the speaker subspace.

Eq. (82): H_{i}^{spk} = N_{i}^T \Sigma_{i}^{-1} N_{i}

Definition at line 411 of file estimate-am-sgmm2.h.

Referenced by MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleSgmm2SpeakerAccs::UpdateNoU(), and MleSgmm2SpeakerAccs::UpdateWithU().

◆ NtransSigmaInv_

std::vector< Matrix<double> > NtransSigmaInv_
private

N_i^T \Sigma_i^{-1}. Needed for y^{(s)}.

Definition at line 414 of file estimate-am-sgmm2.h.

Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), and MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs().

◆ rand_prune_

BaseFloat rand_prune_
private

small constant to randomly prune tiny posteriors

Definition at line 417 of file estimate-am-sgmm2.h.

Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors().

◆ y_s_

Vector<double> y_s_
private

Statistics for speaker adaptation (vectors), stored per-speaker.

Per-speaker stats for vectors, y^{(s)}. Dimension [T].

Definition at line 402 of file estimate-am-sgmm2.h.

Referenced by MleSgmm2SpeakerAccs::AccumulateFromPosteriors(), MleSgmm2SpeakerAccs::Clear(), MleSgmm2SpeakerAccs::MleSgmm2SpeakerAccs(), MleSgmm2SpeakerAccs::UpdateNoU(), and MleSgmm2SpeakerAccs::UpdateWithU().


The documentation for this class was generated from the following files: