OnlineIvectorEstimationStats Class Reference

This class helps us to efficiently estimate iVectors in situations where the data is coming in frame by frame.

#include <ivector-extractor.h>

Public Member Functions

 OnlineIvectorEstimationStats (int32 ivector_dim, BaseFloat prior_offset, BaseFloat max_count)
 
 OnlineIvectorEstimationStats (const OnlineIvectorEstimationStats &other)
 
void AccStats (const IvectorExtractor &extractor, const VectorBase< BaseFloat > &feature, const std::vector< std::pair< int32, BaseFloat > > &gauss_post)
 
void AccStats (const IvectorExtractor &extractor, const MatrixBase< BaseFloat > &features, const std::vector< std::vector< std::pair< int32, BaseFloat > > > &gauss_post)
 
int32 IvectorDim () const
 
void GetIvector (int32 num_cg_iters, VectorBase< double > *ivector) const
 This function gets the current estimate of the iVector.
 
double NumFrames () const
 
double PriorOffset () const
 
double ObjfChange (const VectorBase< double > &ivector) const
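 ObjfChange returns the change in objective function *per frame* from using the default value [ prior_offset_, 0, 0, ... ] to using the provided value; should be >= 0, if "ivector" is a value we estimated. This is for diagnostics.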
 
double Count () const
 
void Scale (double scale)
 Scales the number of frames of stats by 0 <= scale <= 1, to make it as if we had fewer frames of adaptation data.
 
void Write (std::ostream &os, bool binary) const
 
void Read (std::istream &is, bool binary)
 
OnlineIvectorEstimationStats & operator= (const OnlineIvectorEstimationStats &other)
 

Protected Member Functions

double Objf (const VectorBase< double > &ivector) const
 Returns objective function per frame, at this iVector value.
 
double DefaultObjf () const
 Returns objective function evaluated at the point [ prior_offset_, 0, 0, 0, ... ]; this is used in diagnostics.
 

Protected Attributes

double prior_offset_
 
double max_count_
 
double num_frames_
 
SpMatrix< double > quadratic_term_
 
Vector< double > linear_term_
 

Friends

class IvectorExtractor
 

Detailed Description

This class helps us to efficiently estimate iVectors in situations where the data is coming in frame by frame.

Definition at line 314 of file ivector-extractor.h.

Constructor & Destructor Documentation

◆ OnlineIvectorEstimationStats() [1/2]

OnlineIvectorEstimationStats ( int32  ivector_dim,
BaseFloat  prior_offset,
BaseFloat  max_count 
)

Definition at line 786 of file ivector-extractor.cc.

References PackedMatrix< Real >::AddToDiag(), OnlineIvectorEstimationStats::linear_term_, and OnlineIvectorEstimationStats::quadratic_term_.

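OnlineIvectorEstimationStats::OnlineIvectorEstimationStats(int32 ivector_dim,
                                                           BaseFloat prior_offset,
                                                           BaseFloat max_count):
    prior_offset_(prior_offset), max_count_(max_count), num_frames_(0.0),
    quadratic_term_(ivector_dim), linear_term_(ivector_dim) {
  if (ivector_dim != 0) {
    linear_term_(0) += prior_offset;
    quadratic_term_.AddToDiag(1.0);
  }
}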

◆ OnlineIvectorEstimationStats() [2/2]

Definition at line 797 of file ivector-extractor.cc.

OnlineIvectorEstimationStats::OnlineIvectorEstimationStats(
    const OnlineIvectorEstimationStats &other):
    prior_offset_(other.prior_offset_),
    max_count_(other.max_count_),
    num_frames_(other.num_frames_),
    quadratic_term_(other.quadratic_term_),
    linear_term_(other.linear_term_) { }

Member Function Documentation

◆ AccStats() [1/2]

void AccStats ( const IvectorExtractor &  extractor,
const VectorBase< BaseFloat > &  feature,
const std::vector< std::pair< int32, BaseFloat > > &  gauss_post 
)

Definition at line 537 of file ivector-extractor.cc.

References VectorBase< Real >::AddMatVec(), IvectorExtractor::IvectorDependentWeights(), IvectorExtractor::IvectorDim(), KALDI_ASSERT, kaldi::kTrans, IvectorExtractor::prior_offset_, IvectorExtractor::Sigma_inv_M_, and IvectorExtractor::U_.

Referenced by kaldi::TestIvectorExtraction().

void OnlineIvectorEstimationStats::AccStats(
    const IvectorExtractor &extractor,
    const VectorBase<BaseFloat> &feature,
    const std::vector<std::pair<int32, BaseFloat> > &gauss_post) {
  KALDI_ASSERT(extractor.IvectorDim() == this->IvectorDim());
  KALDI_ASSERT(!extractor.IvectorDependentWeights());

  Vector<double> feature_dbl(feature);
  double tot_weight = 0.0;
  int32 ivector_dim = this->IvectorDim(),
      quadratic_term_dim = (ivector_dim * (ivector_dim + 1)) / 2;
  SubVector<double> quadratic_term_vec(quadratic_term_.Data(),
                                       quadratic_term_dim);

  for (size_t idx = 0; idx < gauss_post.size(); idx++) {
    int32 g = gauss_post[idx].first;
    double weight = gauss_post[idx].second;
    // allow negative weights; it's needed in the online iVector extraction
    // with speech-silence detection based on decoder traceback (we subtract
    // stuff we previously added if the traceback changes).
    if (weight == 0.0)
      continue;
    linear_term_.AddMatVec(weight, extractor.Sigma_inv_M_[g], kTrans,
                           feature_dbl, 1.0);
    SubVector<double> U_g(extractor.U_, g);
    quadratic_term_vec.AddVec(weight, U_g);
    tot_weight += weight;
  }
  if (max_count_ > 0.0) {
    // see comments in header RE max_count for explanation. It relates to
    // prior scaling when the count exceeds max_count_
    double old_num_frames = num_frames_,
        new_num_frames = num_frames_ + tot_weight;
    double old_prior_scale = std::max(old_num_frames, max_count_) / max_count_,
        new_prior_scale = std::max(new_num_frames, max_count_) / max_count_;
    // The prior_scales are the inverses of the scales we would put on the stats
    // if we were implementing this by scaling the stats. Instead we
    // scale the prior term.
    double prior_scale_change = new_prior_scale - old_prior_scale;
    if (prior_scale_change != 0.0) {
      linear_term_(0) += prior_offset_ * prior_scale_change;
      quadratic_term_.AddToDiag(prior_scale_change);
    }
  }
  num_frames_ += tot_weight;
}

◆ AccStats() [2/2]

void AccStats ( const IvectorExtractor &  extractor,
const MatrixBase< BaseFloat > &  features,
const std::vector< std::vector< std::pair< int32, BaseFloat > > > &  gauss_post 
)

Definition at line 611 of file ivector-extractor.cc.

References VectorBase< Real >::AddVec(), kaldi::ConvertPostToGaussInfo(), GaussInfo::frame_weights, IvectorExtractor::IvectorDependentWeights(), IvectorExtractor::IvectorDim(), KALDI_ASSERT, kaldi::kTrans, kaldi::kUndefined, MatrixBase< Real >::NumCols(), IvectorExtractor::prior_offset_, MatrixBase< Real >::Row(), VectorBase< Real >::SetZero(), IvectorExtractor::Sigma_inv_M_, GaussInfo::tot_weight, and IvectorExtractor::U_.

void OnlineIvectorEstimationStats::AccStats(
    const IvectorExtractor &extractor,
    const MatrixBase<BaseFloat> &features,
    const std::vector<std::vector<std::pair<int32, BaseFloat> > > &gauss_post) {
  KALDI_ASSERT(extractor.IvectorDim() == this->IvectorDim());
  KALDI_ASSERT(!extractor.IvectorDependentWeights());

  int32 feat_dim = features.NumCols();
  std::unordered_map<int32, GaussInfo> gauss_info;
  ConvertPostToGaussInfo(gauss_post, &gauss_info);

  Vector<double> weighted_feats(feat_dim, kUndefined);
  double tot_weight = 0.0;
  int32 ivector_dim = this->IvectorDim(),
      quadratic_term_dim = (ivector_dim * (ivector_dim + 1)) / 2;
  SubVector<double> quadratic_term_vec(quadratic_term_.Data(),
                                       quadratic_term_dim);

  std::unordered_map<int32, GaussInfo>::const_iterator
      iter = gauss_info.begin(), end = gauss_info.end();
  for (; iter != end; ++iter) {
    int32 gauss_idx = iter->first;
    const GaussInfo &info = iter->second;

    weighted_feats.SetZero();
    std::vector<std::pair<int32, BaseFloat> >::const_iterator
        f_iter = info.frame_weights.begin(), f_end = info.frame_weights.end();
    for (; f_iter != f_end; ++f_iter) {
      int32 t = f_iter->first;
      BaseFloat weight = f_iter->second;
      weighted_feats.AddVec(weight, features.Row(t));
    }
    BaseFloat this_tot_weight = info.tot_weight;

    linear_term_.AddMatVec(1.0, extractor.Sigma_inv_M_[gauss_idx], kTrans,
                           weighted_feats, 1.0);
    SubVector<double> U_g(extractor.U_, gauss_idx);
    quadratic_term_vec.AddVec(this_tot_weight, U_g);
    tot_weight += this_tot_weight;
  }
  if (max_count_ > 0.0) {
    // see comments in header RE max_count for explanation. It relates to
    // prior scaling when the count exceeds max_count_
    double old_num_frames = num_frames_,
        new_num_frames = num_frames_ + tot_weight;
    double old_prior_scale = std::max(old_num_frames, max_count_) / max_count_,
        new_prior_scale = std::max(new_num_frames, max_count_) / max_count_;
    // The prior_scales are the inverses of the scales we would put on the stats
    // if we were implementing this by scaling the stats. Instead we
    // scale the prior term.
    double prior_scale_change = new_prior_scale - old_prior_scale;
    if (prior_scale_change != 0.0) {
      linear_term_(0) += prior_offset_ * prior_scale_change;
      quadratic_term_.AddToDiag(prior_scale_change);
    }
  }
  num_frames_ += tot_weight;
}
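The batch overload first regroups the per-frame posteriors by Gaussian index, so that each Gaussian's matrix is applied once to a weighted sum of frames instead of once per frame. The body of ConvertPostToGaussInfo() is not shown on this page, so the following is only a guess at the regrouping it performs, written with standard containers; GaussInfoSketch and GroupPostByGaussian are hypothetical names, not Kaldi identifiers.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Illustrative stand-in for the GaussInfo struct referenced above; the two
// field names follow the References list (tot_weight, frame_weights).
struct GaussInfoSketch {
  double tot_weight = 0.0;
  std::vector<std::pair<int, float> > frame_weights;  // (frame index, weight)
};

// Regroups posteriors indexed [frame][(gaussian, weight)] into a map indexed
// by Gaussian, holding that Gaussian's total weight and per-frame weights.
void GroupPostByGaussian(
    const std::vector<std::vector<std::pair<int, float> > > &gauss_post,
    std::unordered_map<int, GaussInfoSketch> *gauss_info) {
  gauss_info->clear();
  for (std::size_t t = 0; t < gauss_post.size(); ++t) {
    for (std::size_t i = 0; i < gauss_post[t].size(); ++i) {
      int gauss_idx = gauss_post[t][i].first;
      float weight = gauss_post[t][i].second;
      GaussInfoSketch &info = (*gauss_info)[gauss_idx];
      info.tot_weight += weight;
      info.frame_weights.push_back(
          std::make_pair(static_cast<int>(t), weight));
    }
  }
}
```

With this layout, the inner loop of AccStats() only needs to walk frame_weights to build weighted_feats, and tot_weight is already available for the quadratic term.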

◆ Count()

double Count ( ) const
inline

◆ DefaultObjf()

double DefaultObjf ( ) const
protected

Returns objective function evaluated at the point [ prior_offset_, 0, 0, 0, ... ]; this is used in diagnostics.

Definition at line 776 of file ivector-extractor.cc.

References IvectorExtractor::prior_offset_.

double OnlineIvectorEstimationStats::DefaultObjf() const {
  if (num_frames_ == 0.0) {
    return 0.0;
  } else {
    double x = prior_offset_;
    return (1.0 / num_frames_) * (-0.5 * quadratic_term_(0, 0) * x * x
                                  + x * linear_term_(0));
  }
}

◆ GetIvector()

void GetIvector ( int32  num_cg_iters,
VectorBase< double > *  ivector 
) const

This function gets the current estimate of the iVector.

Internally it does some work to compute it: as the listing below shows, it solves for the iVector with conjugate gradient (LinearCgd()) rather than explicit matrix inversion, which is faster. At entry, "ivector" must be a pointer to a vector of dimension IvectorDim(), and free of NaNs. For faster estimation, you can set "num_cg_iters" to some value > 0, which will limit how many iterations of conjugate gradient we use to re-estimate the iVector; in this case, you should make sure *ivector is set at entry to a recently estimated iVector from the same utterance, which will give the CG a better starting point. If num_cg_iters is set to -1, it will compute the iVector exactly. Note: the iVectors output still have a nonzero mean (first dim offset by PriorOffset()).

Definition at line 732 of file ivector-extractor.cc.

References VectorBase< Real >::Dim(), IvectorExtractor::IvectorDim(), KALDI_ASSERT, KALDI_VLOG, kaldi::LinearCgd(), LinearCgdOptions::max_iters, IvectorExtractor::prior_offset_, and VectorBase< Real >::SetZero().

Referenced by kaldi::TestIvectorExtraction().

void OnlineIvectorEstimationStats::GetIvector(
    int32 num_cg_iters,
    VectorBase<double> *ivector) const {
  KALDI_ASSERT(ivector != NULL && ivector->Dim() ==
               this->IvectorDim());

  if (num_frames_ > 0.0) {
    // could be done exactly as follows:
    // SpMatrix<double> quadratic_inv(quadratic_term_);
    // quadratic_inv.Invert();
    // ivector->AddSpVec(1.0, quadratic_inv, linear_term_, 0.0);
    if ((*ivector)(0) == 0.0)
      (*ivector)(0) = prior_offset_;  // better initial guess.
    LinearCgdOptions opts;
    opts.max_iters = num_cg_iters;
    LinearCgd(opts, quadratic_term_, linear_term_, ivector);
  } else {
    // Use 'default' value.
    ivector->SetZero();
    (*ivector)(0) = prior_offset_;
  }
  KALDI_VLOG(4) << "Objective function improvement from estimating the "
                << "iVector (vs. default value) is "
                << ObjfChange(*ivector);
}

◆ IvectorDim()

int32 IvectorDim ( ) const
inline

Definition at line 337 of file ivector-extractor.h.

Referenced by OnlineIvectorFeature::SetAdaptationState().

int32 IvectorDim() const { return linear_term_.Dim(); }

◆ NumFrames()

double NumFrames ( ) const
inline

Definition at line 355 of file ivector-extractor.h.

◆ Objf()

double Objf ( const VectorBase< double > &  ivector) const
protected

Returns objective function per frame, at this iVector value.

Definition at line 765 of file ivector-extractor.cc.

References kaldi::VecSpVec(), and kaldi::VecVec().

double OnlineIvectorEstimationStats::Objf(
    const VectorBase<double> &ivector) const {
  if (num_frames_ == 0.0) {
    return 0.0;
  } else {
    return (1.0 / num_frames_) * (-0.5 * VecSpVec(ivector, quadratic_term_,
                                                  ivector)
                                  + VecVec(ivector, linear_term_));
  }
}

◆ ObjfChange()

double ObjfChange ( const VectorBase< double > &  ivector) const

ObjfChange returns the change in objective function *per frame* from using the default value [ prior_offset_, 0, 0, ... ] to using the provided value; should be >= 0, if "ivector" is a value we estimated. This is for diagnostics.

Definition at line 758 of file ivector-extractor.cc.

References KALDI_ASSERT, and KALDI_ISNAN.

Referenced by OnlineIvectorFeature::ObjfImprPerFrame(), and kaldi::TestIvectorExtraction().

double OnlineIvectorEstimationStats::ObjfChange(
    const VectorBase<double> &ivector) const {
  double ans = Objf(ivector) - DefaultObjf();
  KALDI_ASSERT(!KALDI_ISNAN(ans));
  return ans;
}
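The quantities computed by Objf(), DefaultObjf() and ObjfChange() can be written out as follows. This is a reading of the listings on this page, using shorthand introduced here: Q for quadratic_term_, l for linear_term_, N for num_frames_ and p for prior_offset_. The last line additionally assumes that Q is positive definite, which this page does not state.

```latex
\begin{align*}
f(x)      &= \frac{1}{N}\left(-\tfrac{1}{2}\, x^{\top} Q\, x + x^{\top} l\right)
            && \text{Objf()} \\
x_0       &= [\,p,\ 0,\ \ldots,\ 0\,]^{\top}, \qquad
f(x_0)     = \frac{1}{N}\left(-\tfrac{1}{2}\, Q_{00}\, p^{2} + p\, l_{0}\right)
            && \text{DefaultObjf()} \\
\Delta(x) &= f(x) - f(x_0)
            && \text{ObjfChange()} \\
Q\, x^{*} &= l \;\Longrightarrow\; f(x^{*}) \ge f(x) \ \text{for all } x,
            \ \text{so } \Delta(x^{*}) \ge 0
            && \text{(assuming } Q \succ 0\text{)}
\end{align*}
```

Under that assumption, an iVector obtained by solving Q x = l maximizes f, which is why the change relative to the default point is expected to be non-negative for an estimated iVector. All three functions return 0 for Objf() and DefaultObjf() when N = 0, so the expressions above apply only when N > 0.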

◆ operator=()

OnlineIvectorEstimationStats& operator= ( const OnlineIvectorEstimationStats &  other)
inline

Definition at line 376 of file ivector-extractor.h.

References OnlineIvectorEstimationStats::linear_term_, OnlineIvectorEstimationStats::max_count_, OnlineIvectorEstimationStats::num_frames_, OnlineIvectorEstimationStats::prior_offset_, and OnlineIvectorEstimationStats::quadratic_term_.

OnlineIvectorEstimationStats &operator=(
    const OnlineIvectorEstimationStats &other) {
  this->prior_offset_ = other.prior_offset_;
  this->max_count_ = other.max_count_;
  this->num_frames_ = other.num_frames_;
  this->quadratic_term_ = other.quadratic_term_;
  this->linear_term_ = other.linear_term_;
  return *this;
}

◆ PriorOffset()

double PriorOffset ( ) const
inline

Definition at line 357 of file ivector-extractor.h.

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)

Definition at line 710 of file ivector-extractor.cc.

References kaldi::ExpectToken(), KALDI_ASSERT, IvectorExtractor::prior_offset_, kaldi::ReadBasicType(), and kaldi::ReadToken().

Referenced by OnlineIvectorExtractorAdaptationState::Read(), and IvectorExtractor::Read().

void OnlineIvectorEstimationStats::Read(std::istream &is, bool binary) {
  ExpectToken(is, binary, "<OnlineIvectorEstimationStats>");
  ExpectToken(is, binary, "<PriorOffset>");
  ReadBasicType(is, binary, &prior_offset_);
  std::string tok;
  ReadToken(is, binary, &tok);
  if (tok == "<MaxCount>") {
    ReadBasicType(is, binary, &max_count_);
    ExpectToken(is, binary, "<NumFrames>");
    ReadBasicType(is, binary, &num_frames_);
  } else {
    KALDI_ASSERT(tok == "<NumFrames>");
    max_count_ = 0.0;
    ReadBasicType(is, binary, &num_frames_);
  }
  ExpectToken(is, binary, "<QuadraticTerm>");
  quadratic_term_.Read(is, binary);
  ExpectToken(is, binary, "<LinearTerm>");
  linear_term_.Read(is, binary);
  ExpectToken(is, binary, "</OnlineIvectorEstimationStats>");
}
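Read() accepts two layouts: the current one, which has a <MaxCount> field, and an older one that goes straight from <PriorOffset> to <NumFrames>, in which case max_count_ defaults to 0. The sketch below mimics only that optional-token logic on a whitespace-separated text stream. It does not use Kaldi's I/O functions; StatsHeaderSketch and ReadHeaderSketch are hypothetical names, and the real text-mode format may differ in detail.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical container for the three scalar header fields.
struct StatsHeaderSketch {
  double prior_offset;
  double max_count;
  double num_frames;
};

// Reads "<PriorOffset> x [<MaxCount> y] <NumFrames> z" from a text stream.
// When <MaxCount> is absent (older layout), max_count is left at 0.
StatsHeaderSketch ReadHeaderSketch(std::istream &is) {
  StatsHeaderSketch h = {0.0, 0.0, 0.0};
  std::string tok;
  is >> tok;
  if (tok != "<PriorOffset>")
    throw std::runtime_error("expected <PriorOffset>");
  is >> h.prior_offset >> tok;
  if (tok == "<MaxCount>") {
    is >> h.max_count >> tok;  // then read the token that follows it
  }
  if (tok != "<NumFrames>")
    throw std::runtime_error("expected <NumFrames>");
  is >> h.num_frames;
  if (!is) throw std::runtime_error("failed to read header");
  return h;
}
```

Reading one token ahead and branching on its value is what lets a single reader stay compatible with files written before the extra field existed, while Write() always emits the newer layout.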

◆ Scale()

void Scale ( double  scale)

Scales the number of frames of stats by 0 <= scale <= 1, to make it as if we had fewer frames of adaptation data.

Note: it does not apply the scaling to the prior term.

Definition at line 671 of file ivector-extractor.cc.

References KALDI_ASSERT, and IvectorExtractor::prior_offset_.

Referenced by OnlineIvectorExtractorAdaptationState::LimitFrames(), and IvectorExtractorUtteranceStats::Scale().

void OnlineIvectorEstimationStats::Scale(double scale) {
  KALDI_ASSERT(scale >= 0.0 && scale <= 1.0);
  double old_num_frames = num_frames_;
  num_frames_ *= scale;
  quadratic_term_.Scale(scale);
  linear_term_.Scale(scale);

  // Scale back up the prior term, by adding in whatever we scaled down.
  if (max_count_ == 0.0) {
    linear_term_(0) += prior_offset_ * (1.0 - scale);
    quadratic_term_.AddToDiag(1.0 - scale);
  } else {
    double new_num_frames = num_frames_;
    double old_prior_scale =
        scale * std::max(old_num_frames, max_count_) / max_count_,
        new_prior_scale = std::max(new_num_frames, max_count_) / max_count_;
    // old_prior_scale is the scale the prior term currently has in the stats,
    // i.e. the previous scale times "scale" as we just scaled the stats.
    // new_prior_scale is the scale we want the prior term to have.
    linear_term_(0) += prior_offset_ * (new_prior_scale - old_prior_scale);
    quadratic_term_.AddToDiag(new_prior_scale - old_prior_scale);
  }
}
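Because the prior is stored inside the same accumulators as the data, Scale() has to scale everything and then add back the part of the prior that was scaled away. The following sketch reduces the stats to a single diagonal element and the first linear element so the arithmetic can be checked by hand. ScalarStatsSketch and ScaleSketch are hypothetical names and this is a simplified model, not the Kaldi implementation.

```cpp
#include <algorithm>

// Simplified, hypothetical model of the stats: one diagonal element of
// quadratic_term_ and the first element of linear_term_.
struct ScalarStatsSketch {
  double prior_offset;
  double max_count;
  double num_frames;
  double quadratic;  // prior scale + data contribution
  double linear;     // prior_offset * prior scale + data contribution
};

// Scales the data part of the stats by `scale` (0 <= scale <= 1) while
// leaving the prior at the scale appropriate for the new frame count.
void ScaleSketch(double scale, ScalarStatsSketch *s) {
  double old_num_frames = s->num_frames;
  s->num_frames *= scale;
  s->quadratic *= scale;
  s->linear *= scale;
  // With max_count == 0 the prior scale is always 1, so after scaling it
  // sits at `scale` and needs (1 - scale) added back.
  double old_prior_scale = scale, new_prior_scale = 1.0;
  if (s->max_count > 0.0) {
    old_prior_scale =
        scale * std::max(old_num_frames, s->max_count) / s->max_count;
    new_prior_scale = std::max(s->num_frames, s->max_count) / s->max_count;
  }
  s->linear += s->prior_offset * (new_prior_scale - old_prior_scale);
  s->quadratic += new_prior_scale - old_prior_scale;
}
```

As a hand check with prior_offset = 10, max_count = 0 and 100 frames, suppose the data contributed 50 to the diagonal and 30 to the linear element, giving 51 and 40. After ScaleSketch(0.5, ...), the values are 26 and 25: the data parts are halved to 25 and 15, and the prior parts are back at 1 and 10.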

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const

Definition at line 695 of file ivector-extractor.cc.

References IvectorExtractor::prior_offset_, kaldi::WriteBasicType(), and kaldi::WriteToken().

Referenced by OnlineIvectorExtractorAdaptationState::Write(), and IvectorExtractor::Write().

void OnlineIvectorEstimationStats::Write(std::ostream &os, bool binary) const {
  WriteToken(os, binary, "<OnlineIvectorEstimationStats>");
  WriteToken(os, binary, "<PriorOffset>");
  WriteBasicType(os, binary, prior_offset_);
  WriteToken(os, binary, "<MaxCount>");
  WriteBasicType(os, binary, max_count_);
  WriteToken(os, binary, "<NumFrames>");
  WriteBasicType(os, binary, num_frames_);
  WriteToken(os, binary, "<QuadraticTerm>");
  quadratic_term_.Write(os, binary);
  WriteToken(os, binary, "<LinearTerm>");
  linear_term_.Write(os, binary);
  WriteToken(os, binary, "</OnlineIvectorEstimationStats>");
}

Friends And Related Function Documentation

◆ IvectorExtractor

friend class IvectorExtractor
friend

Definition at line 393 of file ivector-extractor.h.

Member Data Documentation

◆ linear_term_

Vector<double> linear_term_
protected

◆ max_count_

double max_count_
protected

Definition at line 395 of file ivector-extractor.h.

Referenced by OnlineIvectorEstimationStats::operator=().

◆ num_frames_

double num_frames_
protected

Definition at line 396 of file ivector-extractor.h.

Referenced by OnlineIvectorEstimationStats::operator=().

◆ prior_offset_

double prior_offset_
protected

◆ quadratic_term_

SpMatrix<double> quadratic_term_
protected

The documentation for this class was generated from the following files: