RegtreeFmllrDiagGmmAccs Class Reference

Class for computing the accumulators needed for the maximum-likelihood estimate of FMLLR transforms for an acoustic model that uses diagonal Gaussian mixture models as emission densities.

#include <regtree-fmllr-diag-gmm.h>

Public Member Functions

 RegtreeFmllrDiagGmmAccs ()
 
 ~RegtreeFmllrDiagGmmAccs ()
 
void Init (size_t num_bclass, size_t dim)
 
void SetZero ()
 
BaseFloat AccumulateForGmm (const RegressionTree &regtree, const AmDiagGmm &am, const VectorBase< BaseFloat > &data, size_t pdf_index, BaseFloat weight)
 Accumulate stats for a single GMM in the model; returns log likelihood.
 
void AccumulateForGaussian (const RegressionTree &regtree, const AmDiagGmm &am, const VectorBase< BaseFloat > &data, size_t pdf_index, size_t gauss_index, BaseFloat weight)
 Accumulate stats for a single Gaussian component in the model.
 
void Update (const RegressionTree &regtree, const RegtreeFmllrOptions &opts, RegtreeFmllrDiagGmm *out_fmllr, BaseFloat *auxf_impr, BaseFloat *tot_t) const
 
void Write (std::ostream &out_stream, bool binary) const
 
void Read (std::istream &in_stream, bool binary, bool add)
 
int32 Dim () const
 Accessors.
 
int32 NumBaseClasses () const
 
const std::vector< AffineXformStats * > & baseclass_stats () const
 

Private Member Functions

 KALDI_DISALLOW_COPY_AND_ASSIGN (RegtreeFmllrDiagGmmAccs)
 

Private Attributes

std::vector< AffineXformStats * > baseclass_stats_
 Per-baseclass stats; used for accumulation.
 
int32 num_baseclasses_
 Number of baseclasses.
 
int32 dim_
 Dimension of feature vectors.
 

Detailed Description

Class for computing the accumulators needed for the maximum-likelihood estimate of FMLLR transforms for an acoustic model that uses diagonal Gaussian mixture models as emission densities.

Definition at line 148 of file regtree-fmllr-diag-gmm.h.

Constructor & Destructor Documentation

◆ RegtreeFmllrDiagGmmAccs()

Definition at line 150 of file regtree-fmllr-diag-gmm.h.

RegtreeFmllrDiagGmmAccs() : num_baseclasses_(-1), dim_(-1) {}

◆ ~RegtreeFmllrDiagGmmAccs()

Definition at line 151 of file regtree-fmllr-diag-gmm.h.

References kaldi::DeletePointers().

~RegtreeFmllrDiagGmmAccs() { DeletePointers(&baseclass_stats_); }

void DeletePointers(std::vector< A *> *v)
Deletes any non-NULL pointers in the vector v, and sets the corresponding entries of v to NULL...
Definition: stl-utils.h:184

Member Function Documentation

◆ AccumulateForGaussian()

void AccumulateForGaussian ( const RegressionTree &  regtree,
const AmDiagGmm &  am,
const VectorBase< BaseFloat > &  data,
size_t  pdf_index,
size_t  gauss_index,
BaseFloat  weight 
)

Accumulate stats for a single Gaussian component in the model.

Definition at line 261 of file regtree-fmllr-diag-gmm.cc.

References SpMatrix< Real >::AddVec2(), VectorBase< Real >::CopyRowFromMat(), rnnlm::d, RegtreeFmllrDiagGmm::dim_, RegressionTree::Gauss2BaseclassId(), AmDiagGmm::GetPdf(), DiagGmm::inv_vars(), DiagGmm::means_invvars(), and VectorBase< Real >::Range().

void RegtreeFmllrDiagGmmAccs::AccumulateForGaussian(
    const RegressionTree &regtree, const AmDiagGmm &am,
    const VectorBase<BaseFloat> &data, size_t pdf_index, size_t gauss_index,
    BaseFloat weight) {
  const DiagGmm &pdf = am.GetPdf(pdf_index);
  size_t dim = static_cast<size_t>(dim_);
  // Extended feature vector [data; 1].
  Vector<double> extended_data(dim+1);
  extended_data.Range(0, dim).CopyFromVec(data);
  extended_data(dim) = 1.0;
  // Outer product of the extended vector with itself.
  SpMatrix<double> scatter(dim+1);
  scatter.AddVec2(1.0, extended_data);
  double weight_d = static_cast<double>(weight);

  unsigned bclass = regtree.Gauss2BaseclassId(pdf_index, gauss_index);
  Vector<double> inv_var_mean(dim_);
  inv_var_mean.CopyRowFromMat(pdf.means_invvars(), gauss_index);

  baseclass_stats_[bclass]->beta_ += weight_d;
  baseclass_stats_[bclass]->K_.AddVecVec(weight_d, inv_var_mean, extended_data);
  vector< SpMatrix<double> > &G = baseclass_stats_[bclass]->G_;
  for (size_t d = 0; d < dim; d++)
    G[d].AddSp((weight_d * pdf.inv_vars()(gauss_index, d)), scatter);
}
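
As a rough, library-free illustration of what the listing above accumulates, the sketch below applies the same three updates (occupancy count, K matrix, one G matrix per feature dimension) using plain std::vector storage. FmllrStatsSketch and AccumulateGaussianSketch are hypothetical names invented for this example and are not part of Kaldi; the mean and inverse variance are passed separately here, whereas the listing reads their product from means_invvars().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-base-class statistics (beta_, K_, G_).
struct FmllrStatsSketch {
  std::size_t dim;
  double beta = 0.0;                                // total occupancy
  std::vector<std::vector<double>> K;               // dim x (dim+1)
  std::vector<std::vector<std::vector<double>>> G;  // dim matrices, each (dim+1) x (dim+1)

  explicit FmllrStatsSketch(std::size_t d)
      : dim(d),
        K(d, std::vector<double>(d + 1, 0.0)),
        G(d, std::vector<std::vector<double>>(
                 d + 1, std::vector<double>(d + 1, 0.0))) {}
};

// Adds one frame's contribution for a single diagonal Gaussian with the given
// mean and inverse variance, weighted by `weight`.
void AccumulateGaussianSketch(const std::vector<double> &data,
                              const std::vector<double> &mean,
                              const std::vector<double> &inv_var,
                              double weight, FmllrStatsSketch *stats) {
  const std::size_t dim = stats->dim;
  assert(data.size() == dim && mean.size() == dim && inv_var.size() == dim);

  std::vector<double> ext(data);  // extended feature vector [data; 1]
  ext.push_back(1.0);

  stats->beta += weight;
  for (std::size_t d = 0; d < dim; ++d) {
    const double mean_invvar = mean[d] * inv_var[d];
    for (std::size_t j = 0; j <= dim; ++j) {
      // K += weight * (mean .* inv_var) * ext^T
      stats->K[d][j] += weight * mean_invvar * ext[j];
      // G[d] += weight * inv_var[d] * ext * ext^T
      for (std::size_t i = 0; i <= dim; ++i)
        stats->G[d][i][j] += weight * inv_var[d] * ext[i] * ext[j];
    }
  }
}
```

The sketch stores each G matrix densely for readability; the listing uses the packed symmetric SpMatrix type instead.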

◆ AccumulateForGmm()

BaseFloat AccumulateForGmm ( const RegressionTree &  regtree,
const AmDiagGmm &  am,
const VectorBase< BaseFloat > &  data,
size_t  pdf_index,
BaseFloat  weight 
)

Accumulate stats for a single GMM in the model; returns log likelihood.

This does not work if the features have already been transformed with multiple feature transforms (so you can't use this to do a second pass of regression-tree fMLLR estimation which, as I write (Dan, 2016), I'm not sure that this framework even supports).

Definition at line 224 of file regtree-fmllr-diag-gmm.cc.

References MatrixBase< Real >::AddSp(), SpMatrix< Real >::AddVec2(), DiagGmm::ComponentPosteriors(), VectorBase< Real >::CopyRowFromMat(), rnnlm::d, RegtreeFmllrDiagGmm::dim_, RegressionTree::Gauss2BaseclassId(), AmDiagGmm::GetPdf(), DiagGmm::inv_vars(), DiagGmm::means_invvars(), DiagGmm::NumGauss(), VectorBase< Real >::Range(), and VectorBase< Real >::Scale().

Referenced by main(), and kaldi::UnitTestRegtreeFmllrDiagGmm().

BaseFloat RegtreeFmllrDiagGmmAccs::AccumulateForGmm(
    const RegressionTree &regtree, const AmDiagGmm &am,
    const VectorBase<BaseFloat> &data, size_t pdf_index, BaseFloat weight) {
  const DiagGmm &pdf = am.GetPdf(pdf_index);
  int32 num_comp = pdf.NumGauss();
  Vector<BaseFloat> posterior(num_comp);
  BaseFloat loglike = pdf.ComponentPosteriors(data, &posterior);
  posterior.Scale(weight);
  Vector<double> posterior_d(posterior);

  Vector<double> extended_data(dim_+1);
  extended_data.Range(0, dim_).CopyFromVec(data);
  extended_data(dim_) = 1.0;
  SpMatrix<double> scatter(dim_+1);
  scatter.AddVec2(1.0, extended_data);

  Vector<double> inv_var_mean(dim_);
  Matrix<double> g_scale(baseclass_stats_.size(), dim_);  // scale on "scatter" for each dim.
  for (int32 m = 0; m < num_comp; m++) {
    inv_var_mean.CopyRowFromMat(pdf.means_invvars(), m);
    int32 bclass = regtree.Gauss2BaseclassId(pdf_index, m);

    baseclass_stats_[bclass]->beta_ += posterior_d(m);
    baseclass_stats_[bclass]->K_.AddVecVec(posterior_d(m), inv_var_mean,
                                           extended_data);
    for (int32 d = 0; d < dim_; d++)
      g_scale(bclass, d) += posterior(m) * pdf.inv_vars()(m, d);
  }
  // Apply the pooled scales: one AddSp per (baseclass, dimension) pair.
  for (size_t bclass = 0; bclass < baseclass_stats_.size(); bclass++) {
    vector< SpMatrix<double> > &G = baseclass_stats_[bclass]->G_;
    for (int32 d = 0; d < dim_; d++)
      if (g_scale(bclass, d) != 0.0)
        G[d].AddSp(g_scale(bclass, d), scatter);
  }
  return loglike;
}
kaldi::int32 int32
float BaseFloat
Definition: kaldi-types.h:29
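
The g_scale matrix in the listing above works because the scatter matrix depends only on the frame, not on the Gaussian component, so the scalar factors can be summed per base class before any matrix update is applied. The following sketch shows only that pooling step on plain vectors; PoolScales and the Matrix alias are hypothetical names for this illustration, not Kaldi API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns g_scale[bclass][d], the sum over all components m mapped to bclass
// of posterior[m] * inv_vars[m][d].
Matrix PoolScales(const std::vector<double> &posterior,
                  const Matrix &inv_vars,
                  const std::vector<int> &comp2bclass,
                  std::size_t num_bclass) {
  assert(posterior.size() == inv_vars.size() &&
         posterior.size() == comp2bclass.size());
  const std::size_t dim = inv_vars.empty() ? 0 : inv_vars[0].size();
  Matrix g_scale(num_bclass, std::vector<double>(dim, 0.0));
  for (std::size_t m = 0; m < posterior.size(); ++m) {
    const std::size_t b = static_cast<std::size_t>(comp2bclass[m]);
    assert(b < num_bclass);
    for (std::size_t d = 0; d < dim; ++d)
      g_scale[b][d] += posterior[m] * inv_vars[m][d];
  }
  return g_scale;
}
```

With the scales pooled this way, the number of symmetric-matrix additions per frame is bounded by the number of base classes times the feature dimension, regardless of how many components the GMM has.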

◆ baseclass_stats()

const std::vector<AffineXformStats*>& baseclass_stats ( ) const
inline

Definition at line 183 of file regtree-fmllr-diag-gmm.h.

const std::vector<AffineXformStats*> &baseclass_stats() const {
  return baseclass_stats_;
}

◆ Dim()

int32 Dim ( ) const
inline

Accessors.

Definition at line 181 of file regtree-fmllr-diag-gmm.h.

int32 Dim() const { return dim_; }

◆ Init()

void Init ( size_t  num_bclass,
size_t  dim 
)

Definition at line 197 of file regtree-fmllr-diag-gmm.cc.

References kaldi::DeletePointers(), RegtreeFmllrDiagGmm::dim_, and KALDI_ASSERT.

Referenced by main(), and kaldi::UnitTestRegtreeFmllrDiagGmm().

void RegtreeFmllrDiagGmmAccs::Init(size_t num_bclass, size_t dim) {
  if (num_bclass == 0) {  // empty stats
    DeletePointers(&baseclass_stats_);
    baseclass_stats_.clear();
    num_baseclasses_ = 0;
    dim_ = 0;  // non-zero dimension is meaningless in empty stats
  } else {
    KALDI_ASSERT(dim != 0);  // if not empty, dim = 0 is meaningless
    num_baseclasses_ = num_bclass;
    dim_ = dim;
    DeletePointers(&baseclass_stats_);
    baseclass_stats_.resize(num_bclass);
    for (vector<AffineXformStats*>::iterator it = baseclass_stats_.begin(),
             end = baseclass_stats_.end(); it != end; ++it) {
      *it = new AffineXformStats();
      (*it)->Init(dim, dim);
    }
  }
}
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185
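
The class keeps its statistics in a vector of owning raw pointers, and both Init() and the destructor are listed as referencing DeletePointers(). A minimal sketch of that ownership pattern follows, assuming the behaviour described for DeletePointers (delete every non-NULL entry, then set it to NULL). DeletePointersSketch, StatsSketch and AccsSketch are hypothetical names for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical re-implementation of the documented DeletePointers behaviour.
template <class A>
void DeletePointersSketch(std::vector<A*> *v) {
  assert(v != nullptr);
  for (A *&p : *v) {
    delete p;     // deleting a null pointer is a no-op
    p = nullptr;
  }
}

struct StatsSketch {
  std::size_t dim;
};

class AccsSketch {
 public:
  AccsSketch() = default;
  // Copying is disabled because the vector owns its pointees.
  AccsSketch(const AccsSketch &) = delete;
  AccsSketch &operator=(const AccsSketch &) = delete;
  ~AccsSketch() { DeletePointersSketch(&stats_); }

  void Init(std::size_t num_bclass, std::size_t dim) {
    DeletePointersSketch(&stats_);  // free anything left from an earlier Init
    stats_.clear();
    stats_.resize(num_bclass, nullptr);
    for (StatsSketch *&p : stats_) p = new StatsSketch{dim};
  }

  const std::vector<StatsSketch*> &stats() const { return stats_; }

 private:
  std::vector<StatsSketch*> stats_;
};
```

In new code a std::vector<std::unique_ptr<StatsSketch>> would give the same ownership semantics without manual deletion; the raw-pointer form is shown only because it mirrors the documented members.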

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( RegtreeFmllrDiagGmmAccs  )
private

◆ NumBaseClasses()

int32 NumBaseClasses ( ) const
inline

Definition at line 182 of file regtree-fmllr-diag-gmm.h.

int32 NumBaseClasses() const { return num_baseclasses_; }

◆ Read()

void Read ( std::istream &  in_stream,
bool  binary,
bool  add 
)

Definition at line 299 of file regtree-fmllr-diag-gmm.cc.

References RegtreeFmllrDiagGmm::dim_, kaldi::ExpectToken(), KALDI_ASSERT, and kaldi::ReadBasicType().

void RegtreeFmllrDiagGmmAccs::Read(std::istream &in, bool binary, bool add) {
  ExpectToken(in, binary, "<FMLLRACCS>");
  ExpectToken(in, binary, "<NUMBASECLASSES>");
  ReadBasicType(in, binary, &num_baseclasses_);
  ExpectToken(in, binary, "<DIMENSION>");
  ReadBasicType(in, binary, &dim_);
  KALDI_ASSERT(num_baseclasses_ > 0 && dim_ > 0);
  baseclass_stats_.resize(num_baseclasses_);
  ExpectToken(in, binary, "<STATS>");
  vector<AffineXformStats*>::iterator itr = baseclass_stats_.begin(),
      end = baseclass_stats_.end();
  for ( ; itr != end; ++itr) {
    *itr = new AffineXformStats();
    (*itr)->Init(dim_, dim_);
    (*itr)->Read(in, binary, add);
  }
  ExpectToken(in, binary, "</FMLLRACCS>");
}
void ReadBasicType(std::istream &is, bool binary, T *t)
ReadBasicType is the name of the read function for bool, integer types, and floating-point types...
Definition: io-funcs-inl.h:55
void ExpectToken(std::istream &is, bool binary, const char *token)
ExpectToken tries to read in the given token, and throws an exception on failure. ...
Definition: io-funcs.cc:191

◆ SetZero()

void SetZero ( )

Definition at line 217 of file regtree-fmllr-diag-gmm.cc.

Referenced by main(), and kaldi::UnitTestRegtreeFmllrDiagGmm().

void RegtreeFmllrDiagGmmAccs::SetZero() {
  for (vector<AffineXformStats*>::iterator it = baseclass_stats_.begin(),
           end = baseclass_stats_.end(); it != end; ++it) {
    (*it)->SetZero();
  }
}

◆ Update()

void Update ( const RegressionTree &  regtree,
const RegtreeFmllrOptions &  opts,
RegtreeFmllrDiagGmm *  out_fmllr,
BaseFloat *  auxf_impr,
BaseFloat *  tot_t 
) const

Definition at line 319 of file regtree-fmllr-diag-gmm.cc.

References kaldi::ComputeFmllrMatrixDiagGmmDiagonal(), kaldi::ComputeFmllrMatrixDiagGmmFull(), kaldi::ComputeFmllrMatrixDiagGmmOffset(), kaldi::DeletePointers(), RegtreeFmllrDiagGmm::dim_, RegressionTree::GatherStats(), RegtreeFmllrDiagGmm::Init(), KALDI_ASSERT, KALDI_ERR, KALDI_LOG, KALDI_WARN, RegtreeFmllrOptions::min_count, RegtreeFmllrOptions::num_iters, RegtreeFmllrDiagGmm::set_bclass2xforms(), RegtreeFmllrDiagGmm::SetParameters(), MatrixBase< Real >::SetUnit(), RegtreeFmllrOptions::update_type, and RegtreeFmllrOptions::use_regtree.

Referenced by main(), and kaldi::UnitTestRegtreeFmllrDiagGmm().

void RegtreeFmllrDiagGmmAccs::Update(const RegressionTree &regtree,
                                     const RegtreeFmllrOptions &opts,
                                     RegtreeFmllrDiagGmm *out_fmllr,
                                     BaseFloat *auxf_impr_out,
                                     BaseFloat *tot_t_out) const {
  BaseFloat tot_auxf_impr = 0.0, tot_t = 0.0;
  Matrix<BaseFloat> xform_mat(dim_, dim_+1);
  if (opts.use_regtree) {  // estimate transforms using a regression tree
    vector<AffineXformStats*> regclass_stats;
    vector<int32> base2regclass;
    bool update_xforms = regtree.GatherStats(baseclass_stats_, opts.min_count,
                                             &base2regclass, &regclass_stats);
    out_fmllr->set_bclass2xforms(base2regclass);
    // If update_xforms == true, none should be negative, else all should be -1
    if (update_xforms) {
      out_fmllr->Init(regclass_stats.size(), dim_);
      size_t num_rclass = regclass_stats.size();
      for (size_t rclass_index = 0;
           rclass_index < num_rclass; ++rclass_index) {
        KALDI_ASSERT(regclass_stats[rclass_index]->beta_ >= opts.min_count);
        xform_mat.SetUnit();
        tot_t += regclass_stats[rclass_index]->beta_;

        tot_auxf_impr +=
            ComputeFmllrMatrixDiagGmmFull(xform_mat, *(regclass_stats[rclass_index]),
                                          opts.num_iters, &xform_mat);

        out_fmllr->SetParameters(xform_mat, rclass_index);
      }
      KALDI_LOG << "Estimated " << num_rclass << " regression classes.";
    } else {
      out_fmllr->Init(1, dim_);  // Use a unit transform at the root.
    }
    DeletePointers(&regclass_stats);
    // end of estimation using regression tree
  } else {  // No regtree: estimate 1 transform per baseclass (if enough count)
    for (int32 bclass_index = 0; bclass_index < num_baseclasses_;
         ++bclass_index) {
      tot_t += baseclass_stats_[bclass_index]->beta_;
    }

    out_fmllr->Init(num_baseclasses_, dim_);
    vector<int32> base2regclass(num_baseclasses_);
    for (int32 bclass_index = 0; bclass_index < num_baseclasses_;
         ++bclass_index) {
      if (baseclass_stats_[bclass_index]->beta_ >= opts.min_count) {
        xform_mat.SetUnit();

        if (opts.update_type == "full") {
          tot_auxf_impr +=
              ComputeFmllrMatrixDiagGmmFull(xform_mat,
                                            *(baseclass_stats_[bclass_index]),
                                            opts.num_iters, &xform_mat);
        } else if (opts.update_type == "diag")
          tot_auxf_impr +=
              ComputeFmllrMatrixDiagGmmDiagonal(xform_mat,
                                                *(baseclass_stats_[bclass_index]),
                                                &xform_mat);
        else if (opts.update_type == "offset")
          tot_auxf_impr +=
              ComputeFmllrMatrixDiagGmmOffset(xform_mat,
                                              *(baseclass_stats_[bclass_index]),
                                              &xform_mat);
        else if (opts.update_type == "none")
          tot_auxf_impr = 0.0;
        else
          KALDI_ERR << "Unknown fMLLR update type " << opts.update_type
                    << ", fmllr-update-type must be one of \"full\"|\"diag\"|\"offset\"|\"none\"";

        out_fmllr->SetParameters(xform_mat, bclass_index);
        base2regclass[bclass_index] = bclass_index;
      } else {
        KALDI_WARN << "For baseclass " << (bclass_index) << " count = "
                   << (baseclass_stats_[bclass_index]->beta_) << " < "
                   << opts.min_count << ": not updating FMLLR";
        base2regclass[bclass_index] = -1;
      }
      out_fmllr->set_bclass2xforms(base2regclass);
    }  // end looping over all baseclasses
  }  // end of estimating one transform per baseclass without regtree
  if (auxf_impr_out) *auxf_impr_out = tot_auxf_impr;
  if (tot_t_out) *tot_t_out = tot_t;
}
BaseFloat ComputeFmllrMatrixDiagGmmFull(const MatrixBase< BaseFloat > &in_xform, const AffineXformStats &stats, int32 num_iters, MatrixBase< BaseFloat > *out_xform)
Updates the FMLLR matrix using Mark Gales' row-by-row update.
BaseFloat ComputeFmllrMatrixDiagGmmOffset(const MatrixBase< BaseFloat > &in_xform, const AffineXformStats &stats, MatrixBase< BaseFloat > *out_xform)
This does offset-only fMLLR, i.e. it only estimates an offset.
#define KALDI_ERR
Definition: kaldi-error.h:147
#define KALDI_WARN
Definition: kaldi-error.h:150
BaseFloat ComputeFmllrMatrixDiagGmmDiagonal(const MatrixBase< BaseFloat > &in_xform, const AffineXformStats &stats, MatrixBase< BaseFloat > *out_xform)
This does diagonal fMLLR.
#define KALDI_LOG
Definition: kaldi-error.h:153
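
In the branch without a regression tree, a base class receives its own transform only when its accumulated count reaches opts.min_count; otherwise its entry in the baseclass-to-transform map is set to -1. The sketch below isolates that gating rule on plain vectors. MapBaseclassesSketch is a hypothetical helper written for this example; it takes the per-baseclass counts (the beta_ values) directly instead of reading them from AffineXformStats.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For each base class, returns its own index if its count reaches min_count,
// or -1 if there is too little data to estimate a transform for it.
std::vector<int> MapBaseclassesSketch(const std::vector<double> &counts,
                                      double min_count) {
  assert(min_count >= 0.0);
  std::vector<int> base2xform(counts.size(), -1);
  for (std::size_t b = 0; b < counts.size(); ++b) {
    if (counts[b] >= min_count)
      base2xform[b] = static_cast<int>(b);
  }
  return base2xform;
}
```

For counts {100, 5, 50} and a threshold of 50, the first and third base classes keep their own indices and the second maps to -1, which corresponds to the case where the listing logs "not updating FMLLR".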

◆ Write()

void Write ( std::ostream &  out_stream,
bool  binary 
) const

Definition at line 285 of file regtree-fmllr-diag-gmm.cc.

References RegtreeFmllrDiagGmm::dim_, kaldi::WriteBasicType(), and kaldi::WriteToken().

void RegtreeFmllrDiagGmmAccs::Write(std::ostream &out, bool binary) const {
  WriteToken(out, binary, "<FMLLRACCS>");
  WriteToken(out, binary, "<NUMBASECLASSES>");
  WriteBasicType(out, binary, num_baseclasses_);
  WriteToken(out, binary, "<DIMENSION>");
  WriteBasicType(out, binary, dim_);
  WriteToken(out, binary, "<STATS>");
  vector<AffineXformStats*>::const_iterator itr = baseclass_stats_.begin(),
      end = baseclass_stats_.end();
  for ( ; itr != end; ++itr)
    (*itr)->Write(out, binary);
  WriteToken(out, binary, "</FMLLRACCS>");
}
void WriteToken(std::ostream &os, bool binary, const char *token)
The WriteToken functions are for writing nonempty sequences of non-space characters.
Definition: io-funcs.cc:134
void WriteBasicType(std::ostream &os, bool binary, T t)
WriteBasicType is the name of the write function for bool, integer types, and floating-point types...
Definition: io-funcs-inl.h:34
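
Write() and Read() frame the accumulators with the same sequence of tokens, so a reader can check each token as it goes and fail early on a malformed file. The sketch below reproduces that token order for the header fields only, in text mode, using standard streams. HeaderSketch, WriteHeaderSketch, ExpectTokenSketch and ReadHeaderSketch are hypothetical names; Kaldi's own WriteToken/ExpectToken also handle a binary mode that is not modelled here.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

struct HeaderSketch {
  int num_baseclasses = -1;
  int dim = -1;
};

// Writes the header fields between the same tokens the listing uses.
void WriteHeaderSketch(std::ostream &os, const HeaderSketch &h) {
  assert(h.num_baseclasses > 0 && h.dim > 0);
  os << "<FMLLRACCS> <NUMBASECLASSES> " << h.num_baseclasses
     << " <DIMENSION> " << h.dim << " <STATS> </FMLLRACCS> ";
}

// Reads one whitespace-delimited token and throws if it is not the expected one.
void ExpectTokenSketch(std::istream &is, const std::string &token) {
  std::string got;
  if (!(is >> got) || got != token)
    throw std::runtime_error("expected token " + token + ", got '" + got + "'");
}

HeaderSketch ReadHeaderSketch(std::istream &is) {
  HeaderSketch h;
  ExpectTokenSketch(is, "<FMLLRACCS>");
  ExpectTokenSketch(is, "<NUMBASECLASSES>");
  is >> h.num_baseclasses;
  ExpectTokenSketch(is, "<DIMENSION>");
  is >> h.dim;
  if (!is || h.num_baseclasses <= 0 || h.dim <= 0)
    throw std::runtime_error("invalid header values");
  ExpectTokenSketch(is, "<STATS>");
  // The per-baseclass statistics would be read here, one block per base class.
  ExpectTokenSketch(is, "</FMLLRACCS>");
  return h;
}
```

A round trip through a std::stringstream recovers the values that were written, and a stream that starts with an unexpected token raises std::runtime_error before any field is read.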

Member Data Documentation

◆ baseclass_stats_

std::vector<AffineXformStats*> baseclass_stats_
private

Per-baseclass stats; used for accumulation.

Definition at line 189 of file regtree-fmllr-diag-gmm.h.

◆ dim_

int32 dim_
private

Dimension of feature vectors.

Definition at line 193 of file regtree-fmllr-diag-gmm.h.

◆ num_baseclasses_

int32 num_baseclasses_
private

Number of baseclasses.

Definition at line 191 of file regtree-fmllr-diag-gmm.h.


The documentation for this class was generated from the following files:

regtree-fmllr-diag-gmm.h
regtree-fmllr-diag-gmm.cc