OnlineCmvn Class Reference

This class performs an online version of cepstral mean and (optionally) variance normalization; note that this is not equivalent to the offline version.

#include <online-feature.h>


Public Member Functions

virtual int32 Dim () const
 
virtual bool IsLastFrame (int32 frame) const
 Returns true if this is the last frame.
 
virtual BaseFloat FrameShiftInSeconds () const
 
virtual int32 NumFramesReady () const
 Returns the total number of frames, since the start of the utterance, that are now available.
 
virtual void GetFrame (int32 frame, VectorBase< BaseFloat > *feat)
 Gets the feature vector for this frame.
 
 OnlineCmvn (const OnlineCmvnOptions &opts, const OnlineCmvnState &cmvn_state, OnlineFeatureInterface *src)
 Initializer that sets the CMVN state.
 
 OnlineCmvn (const OnlineCmvnOptions &opts, OnlineFeatureInterface *src)
 Initializer that does not set the CMVN state: after calling this, you should call SetState().
 
void GetState (int32 cur_frame, OnlineCmvnState *cmvn_state)
 
void SetState (const OnlineCmvnState &cmvn_state)
 
void Freeze (int32 cur_frame)
 
virtual ~OnlineCmvn ()
 
- Public Member Functions inherited from OnlineFeatureInterface
virtual void GetFrames (const std::vector< int32 > &frames, MatrixBase< BaseFloat > *feats)
 This is like GetFrame() but for a collection of frames.
 
virtual ~OnlineFeatureInterface ()
 Virtual destructor.
 

Private Member Functions

void GetMostRecentCachedFrame (int32 frame, int32 *cached_frame, MatrixBase< double > *stats)
 Get the most recent cached frame of CMVN stats.
 
void CacheFrame (int32 frame, const MatrixBase< double > &stats)
 Cache this frame of stats.
 
void InitRingBufferIfNeeded ()
 Initialize ring buffer for caching stats.
 
void ComputeStatsForFrame (int32 frame, MatrixBase< double > *stats)
 Computes the raw CMVN stats for this frame, making use of (and updating if necessary) the cached statistics in raw_stats_.
 

Static Private Member Functions

static void SmoothOnlineCmvnStats (const MatrixBase< double > &speaker_stats, const MatrixBase< double > &global_stats, const OnlineCmvnOptions &opts, MatrixBase< double > *stats)
 Smooth the CMVN stats "stats" (which are stored in the normal format as a 2 x (dim+1) matrix), by possibly adding some stats from "global_stats" and/or "speaker_stats", controlled by the config.
 

Private Attributes

OnlineCmvnOptions opts_
 
std::vector< int32 > skip_dims_
 
OnlineCmvnState orig_state_
 
Matrix< double > frozen_state_
 
std::vector< Matrix< double > * > cached_stats_modulo_
 
std::vector< std::pair< int32, Matrix< double > > > cached_stats_ring_
 
Matrix< double > temp_stats_
 
Vector< BaseFloat > temp_feats_
 
Vector< double > temp_feats_dbl_
 
OnlineFeatureInterface * src_
 

Detailed Description

This class performs an online version of cepstral mean and (optionally) variance normalization; note that this is not equivalent to the offline version.

This is necessarily so, as the offline computation involves looking into the future. If you plan to use features normalized with this type of CMVN, you need to train in a "matched" way, i.e. with the same type of features. We normally only do so in the "online" GMM-based decoding, e.g. in online2bin/online2-wav-gmm-latgen-faster.cc; see also the scripts steps/online/prepare_online_decoding.sh and steps/online/decode.sh.

In the steady state (in the middle of a long utterance), this class accumulates CMVN statistics from the previous "cmn_window" frames (default 600 frames, or 6 seconds), and uses these to normalize the mean and possibly variance of the current frame.
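
The steady-state behaviour described above can be sketched in standalone C++. This is an illustration only, not the Kaldi implementation: the class name SlidingCmvn, its Process() method and the variance floor are assumptions made for the example, and the sketch keeps its own copy of past frames, whereas OnlineCmvn re-reads the frame that leaves the window from its source.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper (not part of Kaldi): normalizes each frame using
// statistics from the most recent `window` frames, including the current one.
class SlidingCmvn {
 public:
  SlidingCmvn(std::size_t dim, std::size_t window, bool normalize_variance)
      : window_(window), normalize_variance_(normalize_variance),
        sum_(dim, 0.0), sumsq_(dim, 0.0) {
    assert(window > 0);
  }

  // Adds `frame` to the window, drops the oldest frame if the window is
  // over-full, and returns the normalized copy of `frame`.
  std::vector<double> Process(const std::vector<double> &frame) {
    assert(frame.size() == sum_.size());
    Update(frame, 1.0);
    history_.push_back(frame);
    if (history_.size() > window_) {
      Update(history_.front(), -1.0);
      history_.pop_front();
    }
    const double count = static_cast<double>(history_.size());
    std::vector<double> out(frame.size());
    for (std::size_t d = 0; d < frame.size(); d++) {
      const double mean = sum_[d] / count;
      out[d] = frame[d] - mean;
      if (normalize_variance_) {
        // The floor value is an assumption of this sketch.
        const double var = std::max(sumsq_[d] / count - mean * mean, 1.0e-20);
        out[d] /= std::sqrt(var);
      }
    }
    return out;
  }

 private:
  // Adds (sign = +1) or subtracts (sign = -1) a frame from the running sums.
  void Update(const std::vector<double> &frame, double sign) {
    for (std::size_t d = 0; d < frame.size(); d++) {
      sum_[d] += sign * frame[d];
      sumsq_[d] += sign * frame[d] * frame[d];
    }
  }

  std::size_t window_;
  bool normalize_variance_;
  std::vector<double> sum_, sumsq_;
  std::deque<std::vector<double> > history_;
};
```

With a window of 2 and one-dimensional frames 1, 3, 5, the third call sees only the frames 3 and 5, so the mean used for normalization is 4.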

The config variables "speaker_frames" and "global_frames" relate to what happens at the beginning of the utterance when we have seen fewer than "cmn_window" frames of context, and so might not have very good stats to normalize with. Basically, we first augment any existing stats with up to "speaker_frames" frames of stats from previous utterances of the current speaker, and if this doesn't take us up to the required "cmn_window" frame count, we further augment with up to "global_frames" frames of global stats. The global stats are CMVN stats accumulated from training or testing data, that give us a reasonable source of mean and variance for "typical" data.
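
The augmentation order described above can be made concrete with a small standalone sketch. It computes only how many frames' worth of stats each source would contribute; the struct SmoothingCounts and the function ComputeSmoothingCounts are hypothetical names introduced for this example, and the sketch assumes that the global stats are always available.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of how many frames each source contributes.
struct SmoothingCounts {
  double from_speaker;
  double from_global;
};

// cur_count:     frames already in the current window.
// speaker_count: frames available from earlier utterances of this speaker
//                (0 if there are none).
// The remaining arguments mirror the config variables named in the text.
inline SmoothingCounts ComputeSmoothingCounts(double cur_count,
                                              double speaker_count,
                                              double cmn_window,
                                              double speaker_frames,
                                              double global_frames) {
  SmoothingCounts c = {0.0, 0.0};
  if (cur_count >= cmn_window) return c;  // window already full
  // First take speaker stats: limited by the shortfall, by speaker_frames,
  // and by how much speaker data actually exists.
  c.from_speaker =
      std::min({cmn_window - cur_count, speaker_frames, speaker_count});
  cur_count += c.from_speaker;
  if (cur_count >= cmn_window) return c;
  // Then fill what is left with global stats, up to global_frames.
  c.from_global = std::min(cmn_window - cur_count, global_frames);
  return c;
}
```

For example, with illustrative values cmn_window = 600, speaker_frames = 600 and global_frames = 200, a window holding 100 frames and 300 frames of speaker stats would take 300 frames from the speaker and 200 from the global stats.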

Definition at line 321 of file online-feature.h.

Constructor & Destructor Documentation

◆ OnlineCmvn() [1/2]

OnlineCmvn ( const OnlineCmvnOptions &  opts,
const OnlineCmvnState &  cmvn_state,
OnlineFeatureInterface *  src 
)

Initializer that sets the cmvn state.

If you don't have previous utterances from the same speaker, you should initialize the CMVN state from some global CMVN stats, which you can get by summing all the CMVN stats in your training data using "sum-matrix". This just gives it a reasonable starting point at the start of the file. If you do have previous utterances from the same speaker, or at least from a similar environment, you should initialize it with the state obtained by calling GetState() on the previous utterance.

Definition at line 238 of file online-feature.cc.

References KALDI_ERR, OnlineCmvn::SetState(), OnlineCmvnOptions::skip_dims, OnlineCmvn::skip_dims_, and kaldi::SplitStringToIntegers().

OnlineCmvn::OnlineCmvn(const OnlineCmvnOptions &opts,
                       const OnlineCmvnState &cmvn_state,
                       OnlineFeatureInterface *src):
    opts_(opts), temp_stats_(2, src->Dim() + 1),
    temp_feats_(src->Dim()), temp_feats_dbl_(src->Dim()),
    src_(src) {
  SetState(cmvn_state);
  if (!SplitStringToIntegers(opts.skip_dims, ":", false, &skip_dims_))
    KALDI_ERR << "Bad --skip-dims option (should be colon-separated list of "
              << "integers)";
}

◆ OnlineCmvn() [2/2]

OnlineCmvn ( const OnlineCmvnOptions &  opts,
OnlineFeatureInterface *  src 
)

Initializer that does not set the cmvn state: after calling this, you should call SetState().

Definition at line 250 of file online-feature.cc.

References KALDI_ERR, OnlineCmvnOptions::skip_dims, OnlineCmvn::skip_dims_, and kaldi::SplitStringToIntegers().

OnlineCmvn::OnlineCmvn(const OnlineCmvnOptions &opts,
                       OnlineFeatureInterface *src):
    opts_(opts), temp_stats_(2, src->Dim() + 1),
    temp_feats_(src->Dim()), temp_feats_dbl_(src->Dim()),
    src_(src) {
  if (!SplitStringToIntegers(opts.skip_dims, ":", false, &skip_dims_))
    KALDI_ERR << "Bad --skip-dims option (should be colon-separated list of "
              << "integers)";
}

◆ ~OnlineCmvn()

~OnlineCmvn ( )
virtual

Definition at line 331 of file online-feature.cc.

References OnlineCmvn::cached_stats_modulo_.

OnlineCmvn::~OnlineCmvn() {
  for (size_t i = 0; i < cached_stats_modulo_.size(); i++)
    delete cached_stats_modulo_[i];
  cached_stats_modulo_.clear();
}

Member Function Documentation

◆ CacheFrame()

void CacheFrame ( int32  frame,
const MatrixBase< double > &  stats 
)
private

Cache this frame of stats.

Definition at line 305 of file online-feature.cc.

References OnlineCmvn::cached_stats_modulo_, OnlineCmvn::cached_stats_ring_, OnlineCmvn::InitRingBufferIfNeeded(), KALDI_ASSERT, KALDI_WARN, OnlineCmvnOptions::modulus, and OnlineCmvn::opts_.

Referenced by OnlineCmvn::ComputeStatsForFrame().

void OnlineCmvn::CacheFrame(int32 frame, const MatrixBase<double> &stats) {
  KALDI_ASSERT(frame >= 0);
  if (frame % opts_.modulus == 0) {  // store in cached_stats_modulo_.
    int32 n = frame / opts_.modulus;
    if (n >= cached_stats_modulo_.size()) {
      // The following assert is a limitation on the order in which you can
      // call CacheFrame.  Fortunately the calling code always calls it in
      // sequence, which it has to because you need a previous frame to
      // compute the current one.
      KALDI_ASSERT(n == cached_stats_modulo_.size());
      cached_stats_modulo_.push_back(new Matrix<double>(stats));
    } else {
      KALDI_WARN << "Did not expect to reach this part of code.";
      // do what seems right, but we shouldn't get here.
      cached_stats_modulo_[n]->CopyFromMat(stats);
    }
  } else {  // store in the ring buffer.
    InitRingBufferIfNeeded();
    if (!cached_stats_ring_.empty()) {
      int32 index = frame % cached_stats_ring_.size();
      cached_stats_ring_[index].first = frame;
      cached_stats_ring_[index].second.CopyFromMat(stats);
    }
  }
}

◆ ComputeStatsForFrame()

void ComputeStatsForFrame ( int32  frame,
MatrixBase< double > *  stats 
)
private

Computes the raw CMVN stats for this frame, making use of (and updating if necessary) the cached statistics in raw_stats_.

This means the (x, x^2, count) stats for the last up to opts_.cmn_window frames.
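
As a rough illustration of this layout, the standalone sketch below stores the stats as a 2 x (dim+1) array in which row 0 holds the sum of x in its first dim columns and the frame count in its last column, and row 1 holds the sum of x^2. The type alias Stats and the helper functions are hypothetical and use std::vector rather than Kaldi's Matrix class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2 x (dim+1) stats matrix.
typedef std::vector<std::vector<double> > Stats;

// Returns zeroed stats for feature dimension `dim`.
inline Stats MakeStats(std::size_t dim) {
  return Stats(2, std::vector<double>(dim + 1, 0.0));
}

// Adds one frame to the stats (sign = +1.0), or removes a frame that has
// left the window (sign = -1.0).
inline void AccumulateFrame(const std::vector<double> &x, double sign,
                            Stats *stats) {
  const std::size_t dim = x.size();
  assert(stats->size() == 2 && (*stats)[0].size() == dim + 1);
  for (std::size_t d = 0; d < dim; d++) {
    (*stats)[0][d] += sign * x[d];         // sum of x
    (*stats)[1][d] += sign * x[d] * x[d];  // sum of x^2
  }
  (*stats)[0][dim] += sign;                // frame count
}

// Mean of dimension d implied by the stats.
inline double MeanOf(const Stats &stats, std::size_t d) {
  const double count = stats[0].back();
  assert(count > 0.0);
  return stats[0][d] / count;
}

// Variance of dimension d implied by the stats.
inline double VarianceOf(const Stats &stats, std::size_t d) {
  const double count = stats[0].back();
  const double mean = MeanOf(stats, d);
  return stats[1][d] / count - mean * mean;
}
```

Because frames can be both added and subtracted, the same structure supports a window that slides forward one frame at a time.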

Definition at line 337 of file online-feature.cc.

References OnlineCmvn::CacheFrame(), OnlineCmvnOptions::cmn_window, VectorBase< Real >::CopyFromVec(), OnlineCmvn::Dim(), OnlineFeatureInterface::GetFrame(), OnlineCmvn::GetMostRecentCachedFrame(), KALDI_ASSERT, OnlineCmvnOptions::normalize_variance, OnlineCmvn::NumFramesReady(), OnlineCmvn::opts_, MatrixBase< Real >::Row(), OnlineCmvn::src_, OnlineCmvn::temp_feats_, and OnlineCmvn::temp_feats_dbl_.

Referenced by OnlineCmvn::Freeze(), and OnlineCmvn::GetFrame().

void OnlineCmvn::ComputeStatsForFrame(int32 frame,
                                      MatrixBase<double> *stats_out) {
  KALDI_ASSERT(frame >= 0 && frame < src_->NumFramesReady());

  int32 dim = this->Dim(), cur_frame;
  GetMostRecentCachedFrame(frame, &cur_frame, stats_out);

  Vector<BaseFloat> &feats(temp_feats_);
  Vector<double> &feats_dbl(temp_feats_dbl_);
  while (cur_frame < frame) {
    cur_frame++;
    src_->GetFrame(cur_frame, &feats);
    feats_dbl.CopyFromVec(feats);
    stats_out->Row(0).Range(0, dim).AddVec(1.0, feats_dbl);
    if (opts_.normalize_variance)
      stats_out->Row(1).Range(0, dim).AddVec2(1.0, feats_dbl);
    (*stats_out)(0, dim) += 1.0;
    // it's a sliding buffer; a frame at the back may be
    // leaving the buffer so we have to subtract that.
    int32 prev_frame = cur_frame - opts_.cmn_window;
    if (prev_frame >= 0) {
      // we need to subtract frame prev_frame from the stats.
      src_->GetFrame(prev_frame, &feats);
      feats_dbl.CopyFromVec(feats);
      stats_out->Row(0).Range(0, dim).AddVec(-1.0, feats_dbl);
      if (opts_.normalize_variance)
        stats_out->Row(1).Range(0, dim).AddVec2(-1.0, feats_dbl);
      (*stats_out)(0, dim) -= 1.0;
    }
    CacheFrame(cur_frame, (*stats_out));
  }
}

◆ Dim()

virtual int32 Dim ( ) const
inlinevirtual

◆ FrameShiftInSeconds()

virtual BaseFloat FrameShiftInSeconds ( ) const
inlinevirtual

Implements OnlineFeatureInterface.

Definition at line 332 of file online-feature.h.

virtual BaseFloat FrameShiftInSeconds() const {
  return src_->FrameShiftInSeconds();
}

◆ Freeze()

void Freeze ( int32  cur_frame)

Definition at line 454 of file online-feature.cc.

References OnlineCmvn::ComputeStatsForFrame(), OnlineCmvn::Dim(), OnlineCmvn::frozen_state_, OnlineCmvnState::global_cmvn_stats, OnlineCmvn::opts_, OnlineCmvn::orig_state_, OnlineCmvn::SmoothOnlineCmvnStats(), and OnlineCmvnState::speaker_cmvn_stats.

Referenced by OnlineFeaturePipeline::FreezeCmvn().

void OnlineCmvn::Freeze(int32 cur_frame) {
  int32 dim = this->Dim();
  Matrix<double> stats(2, dim + 1);
  // get the raw CMVN stats
  this->ComputeStatsForFrame(cur_frame, &stats);
  // now smooth them.
  SmoothOnlineCmvnStats(orig_state_.speaker_cmvn_stats,
                        orig_state_.global_cmvn_stats,
                        opts_,
                        &stats);
  this->frozen_state_ = stats;
}

◆ GetFrame()

void GetFrame ( int32  frame,
VectorBase< BaseFloat > *  feat 
)
virtual

Gets the feature vector for this frame.

Before calling this for a given frame, it is assumed that you called NumFramesReady() and it returned a number greater than "frame". Otherwise this call will likely crash with an assert failure. This function is not declared const, in case there is some kind of caching going on, but most of the time it shouldn't modify the class.

Implements OnlineFeatureInterface.

Definition at line 421 of file online-feature.cc.

References kaldi::ApplyCmvn(), OnlineCmvn::ComputeStatsForFrame(), MatrixBase< Real >::CopyFromMat(), VectorBase< Real >::Data(), VectorBase< Real >::Dim(), OnlineCmvn::Dim(), kaldi::FakeStatsForSomeDims(), OnlineCmvn::frozen_state_, OnlineFeatureInterface::GetFrame(), OnlineCmvnState::global_cmvn_stats, KALDI_ASSERT, kaldi::kUndefined, OnlineCmvnOptions::normalize_mean, OnlineCmvnOptions::normalize_variance, MatrixBase< Real >::NumRows(), OnlineCmvn::opts_, OnlineCmvn::orig_state_, Matrix< Real >::Resize(), OnlineCmvn::skip_dims_, OnlineCmvn::SmoothOnlineCmvnStats(), OnlineCmvnState::speaker_cmvn_stats, OnlineCmvn::src_, and OnlineCmvn::temp_stats_.

Referenced by main().

void OnlineCmvn::GetFrame(int32 frame,
                          VectorBase<BaseFloat> *feat) {
  src_->GetFrame(frame, feat);
  KALDI_ASSERT(feat->Dim() == this->Dim());
  int32 dim = feat->Dim();
  Matrix<double> &stats(temp_stats_);
  stats.Resize(2, dim + 1, kUndefined);  // Will do nothing if size was correct.
  if (frozen_state_.NumRows() != 0) {  // the CMVN state has been frozen.
    stats.CopyFromMat(frozen_state_);
  } else {
    // first get the raw CMVN stats (this involves caching..)
    this->ComputeStatsForFrame(frame, &stats);
    // now smooth them.
    SmoothOnlineCmvnStats(orig_state_.speaker_cmvn_stats,
                          orig_state_.global_cmvn_stats,
                          opts_,
                          &stats);
  }

  if (!skip_dims_.empty())
    FakeStatsForSomeDims(skip_dims_, &stats);

  // call the function ApplyCmvn declared in ../transform/cmvn.h, which
  // requires a matrix.
  // 1 row; num-cols == dim; stride == dim.
  SubMatrix<BaseFloat> feat_mat(feat->Data(), 1, dim, dim);
  // the function ApplyCmvn takes a matrix, so form a one-row matrix to give it.
  if (opts_.normalize_mean)
    ApplyCmvn(stats, opts_.normalize_variance, &feat_mat);
  else
    KALDI_ASSERT(!opts_.normalize_variance);
}

◆ GetMostRecentCachedFrame()

void GetMostRecentCachedFrame ( int32  frame,
int32 *  cached_frame,
MatrixBase< double > *  stats 
)
private

Get the most recent cached frame of CMVN stats.

[If no frames were cached, sets up empty stats for frame zero and returns that].
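
The lookup order (ring buffer of recent frames first, then the permanent cache of every modulus-th frame) can be sketched without the stats themselves. The struct StatsCacheIndex and the function MostRecentCachedFrame below are hypothetical and track only frame indices; the return value of -1 plays the role of "no frames cached", matching the value the class writes to cached_frame in that case.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the two caches; it records only which frame
// indices are cached, not the stats matrices.
struct StatsCacheIndex {
  int modulus;            // frames 0, modulus, 2*modulus, ... are kept for good
  int num_modulo_cached;  // how many of those frames have been cached so far
  std::vector<int> ring;  // ring[t % ring.size()] == t if frame t is cached
};

// Returns the most recent cached frame at or before `frame`, or -1 if
// nothing has been cached yet.
inline int MostRecentCachedFrame(const StatsCacheIndex &c, int frame) {
  assert(frame >= 0 && c.modulus > 0);
  const int ring_size = static_cast<int>(c.ring.size());
  // Walk backwards over the frames the ring buffer could still hold.
  for (int t = frame; t >= 0 && t >= frame - ring_size; t--) {
    if (t % c.modulus == 0) break;  // this frame lives in the permanent cache
    if (ring_size > 0 && c.ring[t % ring_size] == t) return t;
  }
  // Fall back to the permanent cache, using its last entry if `frame`
  // is beyond what has been cached.
  int n = frame / c.modulus;
  if (n >= c.num_modulo_cached) {
    if (c.num_modulo_cached == 0) return -1;
    n = c.num_modulo_cached - 1;
  }
  return n * c.modulus;
}
```

The caller then only has to accumulate the frames between the returned index and the requested frame, which is what keeps out-of-order or repeated requests cheap.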

Definition at line 261 of file online-feature.cc.

References OnlineCmvn::cached_stats_modulo_, OnlineCmvn::cached_stats_ring_, MatrixBase< Real >::CopyFromMat(), OnlineCmvn::InitRingBufferIfNeeded(), KALDI_ASSERT, OnlineCmvnOptions::modulus, OnlineCmvn::opts_, OnlineCmvnOptions::ring_buffer_size, and MatrixBase< Real >::SetZero().

Referenced by OnlineCmvn::ComputeStatsForFrame().

void OnlineCmvn::GetMostRecentCachedFrame(int32 frame,
                                          int32 *cached_frame,
                                          MatrixBase<double> *stats) {
  KALDI_ASSERT(frame >= 0);
  InitRingBufferIfNeeded();
  // look for a cached frame on a previous frame as close as possible in time
  // to "frame".  Return if we get one.
  for (int32 t = frame; t >= 0 && t >= frame - opts_.ring_buffer_size; t--) {
    if (t % opts_.modulus == 0) {
      // if this frame should be cached in cached_stats_modulo_, then
      // we'll look there, and we won't go back any further in time.
      break;
    }
    int32 index = t % opts_.ring_buffer_size;
    if (cached_stats_ring_[index].first == t) {
      *cached_frame = t;
      stats->CopyFromMat(cached_stats_ring_[index].second);
      return;
    }
  }
  int32 n = frame / opts_.modulus;
  if (n >= cached_stats_modulo_.size()) {
    if (cached_stats_modulo_.size() == 0) {
      *cached_frame = -1;
      stats->SetZero();
      return;
    } else {
      n = static_cast<int32>(cached_stats_modulo_.size() - 1);
    }
  }
  *cached_frame = n * opts_.modulus;
  stats->CopyFromMat(*(cached_stats_modulo_[n]));
}

◆ GetState()

void GetState ( int32  cur_frame,
OnlineCmvnState *  cmvn_state 
)

Definition at line 467 of file online-feature.cc.

References VectorBase< Real >::CopyFromVec(), OnlineCmvn::Dim(), OnlineCmvnState::frozen_state, OnlineCmvn::frozen_state_, OnlineFeatureInterface::GetFrame(), MatrixBase< Real >::NumRows(), OnlineCmvn::orig_state_, Matrix< Real >::Resize(), MatrixBase< Real >::Row(), OnlineCmvnState::speaker_cmvn_stats, and OnlineCmvn::src_.

Referenced by OnlineFeaturePipeline::GetCmvnState(), OnlineNnet2FeaturePipeline::GetCmvnState(), and main().

void OnlineCmvn::GetState(int32 cur_frame,
                          OnlineCmvnState *state_out) {
  *state_out = this->orig_state_;
  {  // This block updates state_out->speaker_cmvn_stats
    int32 dim = this->Dim();
    if (state_out->speaker_cmvn_stats.NumRows() == 0)
      state_out->speaker_cmvn_stats.Resize(2, dim + 1);
    Vector<BaseFloat> feat(dim);
    Vector<double> feat_dbl(dim);
    for (int32 t = 0; t <= cur_frame; t++) {
      src_->GetFrame(t, &feat);
      feat_dbl.CopyFromVec(feat);
      state_out->speaker_cmvn_stats(0, dim) += 1.0;
      state_out->speaker_cmvn_stats.Row(0).Range(0, dim).AddVec(1.0, feat_dbl);
      state_out->speaker_cmvn_stats.Row(1).Range(0, dim).AddVec2(1.0, feat_dbl);
    }
  }
  // Store any frozen state (the effect of the user possibly
  // having called Freeze()).
  state_out->frozen_state = frozen_state_;
}

◆ InitRingBufferIfNeeded()

void InitRingBufferIfNeeded ( )
inlineprivate

Initialize ring buffer for caching stats.

Definition at line 297 of file online-feature.cc.

References OnlineCmvn::cached_stats_ring_, OnlineCmvn::Dim(), OnlineCmvn::opts_, and OnlineCmvnOptions::ring_buffer_size.

Referenced by OnlineCmvn::CacheFrame(), and OnlineCmvn::GetMostRecentCachedFrame().

inline void OnlineCmvn::InitRingBufferIfNeeded() {
  if (cached_stats_ring_.empty() && opts_.ring_buffer_size > 0) {
    Matrix<double> temp(2, this->Dim() + 1);
    cached_stats_ring_.resize(opts_.ring_buffer_size,
        std::pair<int32, Matrix<double> >(-1, temp));
  }
}

◆ IsLastFrame()

virtual bool IsLastFrame ( int32  frame) const
inlinevirtual

Returns true if this is the last frame.

Frame indices are zero-based, so the first frame is zero. IsLastFrame(-1) will return false, unless the file is empty (which is a case that I'm not sure all the code will handle, so be careful). This function may return false for some frame if we haven't yet decided to terminate decoding, but later true if we decide to terminate decoding. This function exists mainly to correctly handle end effects in feature extraction, and is not a mechanism to determine how many frames are in the decodable object (as it used to be, and for backward compatibility, still is, in the Decodable interface).

Implements OnlineFeatureInterface.

Definition at line 329 of file online-feature.h.

virtual bool IsLastFrame(int32 frame) const {
  return src_->IsLastFrame(frame);
}

◆ NumFramesReady()

virtual int32 NumFramesReady ( ) const
inlinevirtual

Returns the number of frames that are ready.

Returns the total number of frames, since the start of the utterance, that are now available. In an online-decoding context, this will likely increase with time as more data becomes available.

Implements OnlineFeatureInterface.

Definition at line 337 of file online-feature.h.

Referenced by OnlineCmvn::ComputeStatsForFrame(), OnlineFeaturePipeline::FreezeCmvn(), OnlineFeaturePipeline::GetCmvnState(), OnlineNnet2FeaturePipeline::GetCmvnState(), and OnlineSpliceFrames::GetFrame().

virtual int32 NumFramesReady() const { return src_->NumFramesReady(); }

◆ SetState()

void SetState ( const OnlineCmvnState &  cmvn_state)

Definition at line 489 of file online-feature.cc.

References OnlineCmvn::cached_stats_modulo_, OnlineCmvnState::frozen_state, OnlineCmvn::frozen_state_, KALDI_ASSERT, and OnlineCmvn::orig_state_.

Referenced by OnlineCmvn::OnlineCmvn(), OnlineIvectorFeature::SetAdaptationState(), OnlineFeaturePipeline::SetCmvnState(), and OnlineNnet2FeaturePipeline::SetCmvnState().

void OnlineCmvn::SetState(const OnlineCmvnState &cmvn_state) {
  KALDI_ASSERT(cached_stats_modulo_.empty() &&
               "You cannot call SetState() after processing data.");
  orig_state_ = cmvn_state;
  frozen_state_ = cmvn_state.frozen_state;
}

◆ SmoothOnlineCmvnStats()

void SmoothOnlineCmvnStats ( const MatrixBase< double > &  speaker_stats,
const MatrixBase< double > &  global_stats,
const OnlineCmvnOptions &  opts,
MatrixBase< double > *  stats 
)
staticprivate

Smooth the CMVN stats "stats" (which are stored in the normal format as a 2 x (dim+1) matrix), by possibly adding some stats from "global_stats" and/or "speaker_stats", controlled by the config.

The best way to understand the smoothing rule we use is just to look at the code.

Definition at line 372 of file online-feature.cc.

References MatrixBase< Real >::AddMat(), OnlineCmvnOptions::cmn_window, OnlineCmvnOptions::global_frames, KALDI_ASSERT, KALDI_ERR, OnlineCmvnOptions::normalize_variance, MatrixBase< Real >::NumCols(), MatrixBase< Real >::NumRows(), MatrixBase< Real >::RowRange(), and OnlineCmvnOptions::speaker_frames.

Referenced by OnlineCmvn::Freeze(), and OnlineCmvn::GetFrame().

void OnlineCmvn::SmoothOnlineCmvnStats(const MatrixBase<double> &speaker_stats,
                                       const MatrixBase<double> &global_stats,
                                       const OnlineCmvnOptions &opts,
                                       MatrixBase<double> *stats) {
  if (speaker_stats.NumRows() == 2 && !opts.normalize_variance) {
    // this is just for efficiency: don't operate on the variance if it's not
    // needed.
    int32 cols = speaker_stats.NumCols();  // dim + 1
    SubMatrix<double> stats_temp(*stats, 0, 1, 0, cols);
    SmoothOnlineCmvnStats(speaker_stats.RowRange(0, 1),
                          global_stats.RowRange(0, 1),
                          opts, &stats_temp);
    return;
  }
  int32 dim = stats->NumCols() - 1;
  double cur_count = (*stats)(0, dim);
  // If count exceeded cmn_window it would be an error in how "window_stats"
  // was accumulated.
  KALDI_ASSERT(cur_count <= 1.001 * opts.cmn_window);
  if (cur_count >= opts.cmn_window)
    return;
  if (speaker_stats.NumRows() != 0) {  // if we have speaker stats..
    double count_from_speaker = opts.cmn_window - cur_count,
        speaker_count = speaker_stats(0, dim);
    if (count_from_speaker > opts.speaker_frames)
      count_from_speaker = opts.speaker_frames;
    if (count_from_speaker > speaker_count)
      count_from_speaker = speaker_count;
    if (count_from_speaker > 0.0)
      stats->AddMat(count_from_speaker / speaker_count,
                    speaker_stats);
    cur_count = (*stats)(0, dim);
  }
  if (cur_count >= opts.cmn_window)
    return;
  if (global_stats.NumRows() != 0) {
    double count_from_global = opts.cmn_window - cur_count,
        global_count = global_stats(0, dim);
    KALDI_ASSERT(global_count > 0.0);
    if (count_from_global > opts.global_frames)
      count_from_global = opts.global_frames;
    if (count_from_global > 0.0)
      stats->AddMat(count_from_global / global_count,
                    global_stats);
  } else {
    KALDI_ERR << "Global CMN stats are required";
  }
}

Member Data Documentation

◆ cached_stats_modulo_

std::vector<Matrix<double>*> cached_stats_modulo_
private

◆ cached_stats_ring_

std::vector<std::pair<int32, Matrix<double> > > cached_stats_ring_
private

◆ frozen_state_

Matrix<double> frozen_state_
private

◆ opts_

◆ orig_state_

◆ skip_dims_

std::vector<int32> skip_dims_
private

Definition at line 420 of file online-feature.h.

Referenced by OnlineCmvn::GetFrame(), and OnlineCmvn::OnlineCmvn().

◆ src_

◆ temp_feats_

Vector<BaseFloat> temp_feats_
private

Definition at line 439 of file online-feature.h.

Referenced by OnlineCmvn::ComputeStatsForFrame().

◆ temp_feats_dbl_

Vector<double> temp_feats_dbl_
private

Definition at line 440 of file online-feature.h.

Referenced by OnlineCmvn::ComputeStatsForFrame().

◆ temp_stats_

Matrix<double> temp_stats_
private

Definition at line 438 of file online-feature.h.

Referenced by OnlineCmvn::GetFrame().


The documentation for this class was generated from the following files: