DecodableAmDiagGmmUnmapped Class Reference

DecodableAmDiagGmmUnmapped is a decodable object that takes indices that correspond to pdf-ids plus one.

#include <decodable-am-diag-gmm.h>


Classes

struct  LikelihoodCacheRecord
 Defines a cache record for a state.
 

Public Member Functions

 DecodableAmDiagGmmUnmapped (const AmDiagGmm &am, const Matrix< BaseFloat > &feats, BaseFloat log_sum_exp_prune=-1.0)
 If you set log_sum_exp_prune to a value greater than 0 it will prune in the LogSumExp operation (larger = more exact); I suggest 5.
 
virtual BaseFloat LogLikelihood (int32 frame, int32 state_index)
 Returns the log likelihood, which will be negated in the decoder.
 
virtual int32 NumFramesReady () const
 The call NumFramesReady() will return the number of frames currently available for this decodable object.
 
virtual int32 NumIndices () const
 Returns the number of states in the acoustic model (they will be indexed one-based, i.e. from 1 to NumIndices(); this is for compatibility with OpenFst).
 
virtual bool IsLastFrame (int32 frame) const
 Returns true if this is the last frame.
 
Public Member Functions inherited from DecodableInterface
virtual ~DecodableInterface ()
 

Protected Member Functions

void ResetLogLikeCache ()
 
virtual BaseFloat LogLikelihoodZeroBased (int32 frame, int32 state_index)
 

Protected Attributes

const AmDiagGmm & acoustic_model_
 
const Matrix< BaseFloat > & feature_matrix_
 
int32 previous_frame_
 
BaseFloat log_sum_exp_prune_
 
std::vector< LikelihoodCacheRecord > log_like_cache_
 

Private Member Functions

 KALDI_DISALLOW_COPY_AND_ASSIGN (DecodableAmDiagGmmUnmapped)
 

Private Attributes

Vector< BaseFloat > data_squared_
 Cache for fast likelihood calculation.
 

Detailed Description

DecodableAmDiagGmmUnmapped is a decodable object that takes indices that correspond to pdf-ids plus one.

This may be used in future in a decoder that doesn't need to output alignments, if we create FSTs that have the pdf-ids plus one as the input labels. We couldn't use the pdf-ids themselves because they start from zero, and label zero is reserved for epsilon in OpenFst, so a pdf-id of zero would be indistinguishable from an epsilon transition in the graph.
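As an illustration of the indexing convention described above, the following minimal sketch maps between zero-based pdf-ids and one-based input labels, leaving label 0 free for epsilon. The names `kEpsilonLabel`, `PdfIdToLabel` and `LabelToPdfId` are hypothetical and are not part of Kaldi; they exist only to make the offset explicit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of Kaldi: they only illustrate the
// "pdf-id plus one" convention used for input labels.
constexpr int32_t kEpsilonLabel = 0;  // OpenFst reserves label 0 for epsilon.

// Maps a zero-based pdf-id to a one-based input label.
inline int32_t PdfIdToLabel(int32_t pdf_id) {
  assert(pdf_id >= 0);
  return pdf_id + 1;
}

// Maps a one-based input label back to a zero-based pdf-id.
// Epsilon arcs carry no acoustic observation, so they must not be passed in.
inline int32_t LabelToPdfId(int32_t label) {
  assert(label != kEpsilonLabel);
  return label - 1;
}
```

With this offset, the first pdf (pdf-id 0) gets label 1, so it can never be confused with an epsilon arc.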

Definition at line 45 of file decodable-am-diag-gmm.h.

Constructor & Destructor Documentation

◆ DecodableAmDiagGmmUnmapped()

DecodableAmDiagGmmUnmapped ( const AmDiagGmm &  am,
const Matrix< BaseFloat > &  feats,
BaseFloat  log_sum_exp_prune = -1.0 
)
inline

If you set log_sum_exp_prune to a value greater than 0 it will prune in the LogSumExp operation (larger = more exact); I suggest 5.

This is advisable if it's spending a long time doing exp operations.
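To make the effect of the pruning threshold concrete, here is a minimal sketch of a pruned log-sum-exp. It is an assumption about how such pruning can work, not a copy of Kaldi's LogSumExp: terms that lie more than `prune` below the maximum are skipped, so fewer `exp` calls are made, and a larger threshold keeps more terms and is therefore more exact. The function name `PrunedLogSumExp` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Illustrative pruned log-sum-exp (hypothetical helper, not Kaldi's code).
// Terms more than `prune` below the maximum are skipped; a value of
// prune <= 0 disables pruning, mirroring the default of -1.0 above.
inline float PrunedLogSumExp(const std::vector<float> &v, float prune) {
  assert(!v.empty());
  const float max_elem = *std::max_element(v.begin(), v.end());
  double sum = 0.0;
  for (float x : v) {
    if (prune > 0.0f && x < max_elem - prune) continue;  // skip tiny terms
    sum += std::exp(static_cast<double>(x) - max_elem);
  }
  return max_elem + static_cast<float>(std::log(sum));
}
```

Because skipped terms can only reduce the sum, the pruned result in this sketch is never larger than the unpruned one.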

Definition at line 51 of file decodable-am-diag-gmm.h.

References DecodableAmDiagGmmUnmapped::ResetLogLikeCache().

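51  DecodableAmDiagGmmUnmapped(const AmDiagGmm &am,
52  const Matrix<BaseFloat> &feats,
53  BaseFloat log_sum_exp_prune = -1.0):
54  acoustic_model_(am), feature_matrix_(feats),
55  previous_frame_(-1), log_sum_exp_prune_(log_sum_exp_prune),
56  data_squared_(feats.NumCols()) {
57  ResetLogLikeCache();
58  }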

Member Function Documentation

◆ IsLastFrame()

virtual bool IsLastFrame ( int32  frame) const
inline virtual

Returns true if this is the last frame.

Frames are zero-based, so the first frame is zero. IsLastFrame(-1) returns false unless the file is empty (a case that not all of the code may handle, so be careful).

Caution: the behavior of this function in an online setting is being changed somewhat. In future it may return false while we have not yet decided to terminate decoding, and true later once we do. The longer-term plan is to rely more on NumFramesReady(): IsLastFrame() would then always return false in an online-decoding setting, and would return true only in a decoding-from-matrix setting where we want to allow the last delta or LDA features to be flushed out for compatibility with the baseline setup.

Implements DecodableInterface.

Definition at line 70 of file decodable-am-diag-gmm.h.

References KALDI_ASSERT, and DecodableAmDiagGmmUnmapped::NumFramesReady().

70  virtual bool IsLastFrame(int32 frame) const {
71  KALDI_ASSERT(frame < NumFramesReady());
72  return (frame == NumFramesReady() - 1);
73  }

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( DecodableAmDiagGmmUnmapped  )
private

◆ LogLikelihood()

virtual BaseFloat LogLikelihood ( int32  frame,
int32  state_index 
)
inline virtual

Returns the log likelihood, which will be negated in the decoder.

The "frame" starts from zero. You should verify that NumFramesReady() > frame before calling this.

Implements DecodableInterface.

Reimplemented in DecodableAmDiagGmmScaled, DecodableAmDiagGmm, DecodableAmDiagGmmRegtreeMllr, and DecodableAmDiagGmmRegtreeFmllr.

Definition at line 62 of file decodable-am-diag-gmm.h.

References DecodableAmDiagGmmUnmapped::LogLikelihoodZeroBased().

62  virtual BaseFloat LogLikelihood(int32 frame, int32 state_index) {
63  return LogLikelihoodZeroBased(frame, state_index - 1);
64  }

◆ LogLikelihoodZeroBased()

BaseFloat LogLikelihoodZeroBased ( int32  frame,
int32  state_index 
)
protected virtual

Reimplemented in DecodableAmDiagGmmRegtreeMllr, and DecodableAmDiagGmmRegtreeFmllr.

Definition at line 28 of file decodable-am-diag-gmm.cc.

References DecodableAmDiagGmmUnmapped::acoustic_model_, VectorBase< Real >::AddMatVec(), DecodableAmDiagGmmUnmapped::data_squared_, VectorBase< Real >::Dim(), DiagGmm::Dim(), DecodableAmDiagGmmUnmapped::feature_matrix_, DiagGmm::gconsts(), AmDiagGmm::GetPdf(), DiagGmm::inv_vars(), KALDI_ASSERT, KALDI_ERR, KALDI_ISINF, KALDI_ISNAN, kaldi::kNoTrans, DecodableAmDiagGmmUnmapped::log_like_cache_, DecodableAmDiagGmmUnmapped::log_sum_exp_prune_, DiagGmm::means_invvars(), DecodableAmDiagGmmUnmapped::NumFramesReady(), DecodableAmDiagGmmUnmapped::NumIndices(), DecodableAmDiagGmmUnmapped::previous_frame_, MatrixBase< Real >::Row(), and DiagGmm::valid_gconsts().

Referenced by DecodableAmDiagGmmUnmapped::LogLikelihood(), DecodableAmDiagGmm::LogLikelihood(), and DecodableAmDiagGmmScaled::LogLikelihood().

28 BaseFloat DecodableAmDiagGmmUnmapped::LogLikelihoodZeroBased(
29  int32 frame, int32 state) {
30  KALDI_ASSERT(static_cast<size_t>(frame) <
31  static_cast<size_t>(NumFramesReady()));
32  KALDI_ASSERT(static_cast<size_t>(state) < static_cast<size_t>(NumIndices()) &&
33  "Likely graph/model mismatch, e.g. using wrong HCLG.fst");
34 
35  if (log_like_cache_[state].hit_time == frame) {
36  return log_like_cache_[state].log_like; // return cached value, if found
37  }
38 
39  if (frame != previous_frame_) { // cache the squared stats.
40  data_squared_.CopyFromVec(feature_matrix_.Row(frame));
41  data_squared_.ApplyPow(2.0);
42  previous_frame_ = frame;
43  }
44 
45  const DiagGmm &pdf = acoustic_model_.GetPdf(state);
46  const VectorBase<BaseFloat> &data = feature_matrix_.Row(frame);
47 
48  // check if everything is in order
49  if (pdf.Dim() != data.Dim()) {
50  KALDI_ERR << "Dim mismatch: data dim = " << data.Dim()
51  << " vs. model dim = " << pdf.Dim();
52  }
53  if (!pdf.valid_gconsts()) {
54  KALDI_ERR << "State " << (state) << ": Must call ComputeGconsts() "
55  "before computing likelihood.";
56  }
57 
58  Vector<BaseFloat> loglikes(pdf.gconsts()); // need to recreate for each pdf
59  // loglikes += means * inv(vars) * data.
60  loglikes.AddMatVec(1.0, pdf.means_invvars(), kNoTrans, data, 1.0);
61  // loglikes += -0.5 * inv(vars) * data_sq.
62  loglikes.AddMatVec(-0.5, pdf.inv_vars(), kNoTrans, data_squared_, 1.0);
63 
64  BaseFloat log_sum = loglikes.LogSumExp(log_sum_exp_prune_);
65  if (KALDI_ISNAN(log_sum) || KALDI_ISINF(log_sum))
66  KALDI_ERR << "Invalid answer (overflow or invalid variances/features?)";
67 
68  log_like_cache_[state].log_like = log_sum;
69  log_like_cache_[state].hit_time = frame;
70 
71  return log_sum;
72 }
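The arithmetic in the listing above can be sketched without Kaldi types. For each mixture component the log-likelihood is the component's constant term, plus the dot product of mean/variance with the data, minus half the dot product of inverse variance with the squared data; the components are then combined by log-sum-exp. In the sketch below, `SimpleDiagGmm` and `DiagGmmLogLike` are hypothetical stand-ins, and the `gconsts` values are assumed to already hold every data-independent term (including the log mixture weight).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a diagonal-covariance GMM (not Kaldi's DiagGmm).
struct SimpleDiagGmm {
  std::vector<float> gconsts;                     // one per component
  std::vector<std::vector<float>> means_invvars;  // [component][dim]: mean / var
  std::vector<std::vector<float>> inv_vars;       // [component][dim]: 1 / var
};

// Returns log sum_m exp(loglike_m(x)), where
// loglike_m = gconst_m + means_invvars_m . x - 0.5 * inv_vars_m . (x * x).
inline float DiagGmmLogLike(const SimpleDiagGmm &gmm,
                            const std::vector<float> &x) {
  assert(!gmm.gconsts.empty());
  std::vector<double> loglikes(gmm.gconsts.begin(), gmm.gconsts.end());
  for (std::size_t m = 0; m < loglikes.size(); ++m) {
    assert(gmm.means_invvars[m].size() == x.size());  // dimension check
    assert(gmm.inv_vars[m].size() == x.size());
    for (std::size_t d = 0; d < x.size(); ++d) {
      loglikes[m] += gmm.means_invvars[m][d] * x[d]
                   - 0.5 * gmm.inv_vars[m][d] * x[d] * x[d];
    }
  }
  // Log-sum-exp over components, shifted by the maximum for stability.
  const double max_ll = *std::max_element(loglikes.begin(), loglikes.end());
  double sum = 0.0;
  for (double ll : loglikes) sum += std::exp(ll - max_ll);
  return static_cast<float>(max_ll + std::log(sum));
}
```

For a single one-dimensional component with zero mean and unit variance, the constant term is -0.5 * log(2 * pi), and the function reduces to the standard normal log-density.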

◆ NumFramesReady()

virtual int32 NumFramesReady ( ) const
inline virtual

The call NumFramesReady() will return the number of frames currently available for this decodable object.

This is for use in setups where you don't want the decoder to block while waiting for input. This is newly added as of Jan 2014, and I hope, going forward, to rely on this mechanism more than IsLastFrame to know when to stop decoding.
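A frame loop driven by NumFramesReady() rather than IsLastFrame() might look like the sketch below. `ToyDecodable` and `CountDecodedFrames` are hypothetical: the class mirrors only the frame-counting behaviour of a matrix-backed decodable, and the loop body stands in for whatever work a real decoder does per frame.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical minimal decodable over a fixed feature matrix; it mirrors
// only the frame-counting behaviour shown on this page.
class ToyDecodable {
 public:
  explicit ToyDecodable(std::vector<std::vector<float>> feats)
      : feats_(std::move(feats)) {}
  int32_t NumFramesReady() const { return static_cast<int32_t>(feats_.size()); }
  bool IsLastFrame(int32_t frame) const {
    assert(frame < NumFramesReady());
    return frame == NumFramesReady() - 1;
  }
 private:
  std::vector<std::vector<float>> feats_;
};

// Counts the frames a decoder would visit when it stops on NumFramesReady().
inline int32_t CountDecodedFrames(const ToyDecodable &decodable) {
  int32_t num_decoded = 0;
  while (num_decoded < decodable.NumFramesReady()) {
    // A real decoder would process one frame of arcs here.
    ++num_decoded;
  }
  return num_decoded;
}
```

In this sketch an empty feature matrix simply yields zero iterations, which avoids the IsLastFrame(-1) special case.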

Reimplemented from DecodableInterface.

Reimplemented in DecodableAmDiagGmmRegtreeMllr, and DecodableAmDiagGmmRegtreeFmllr.

Definition at line 65 of file decodable-am-diag-gmm.h.

References DecodableAmDiagGmmUnmapped::feature_matrix_, and MatrixBase< Real >::NumRows().

Referenced by DecodableAmDiagGmmUnmapped::IsLastFrame(), DecodableAmDiagGmmUnmapped::LogLikelihoodZeroBased(), and main().

65  virtual int32 NumFramesReady() const { return feature_matrix_.NumRows(); }

◆ NumIndices()

virtual int32 NumIndices ( ) const
inline virtual

Returns the number of states in the acoustic model (they will be indexed one-based, i.e. from 1 to NumIndices(); this is for compatibility with OpenFst).

Implements DecodableInterface.

Reimplemented in DecodableAmDiagGmmScaled, DecodableAmDiagGmm, DecodableAmDiagGmmRegtreeMllr, and DecodableAmDiagGmmRegtreeFmllr.

Definition at line 68 of file decodable-am-diag-gmm.h.

References DecodableAmDiagGmmUnmapped::acoustic_model_, and AmDiagGmm::NumPdfs().

Referenced by DecodableAmDiagGmmUnmapped::LogLikelihoodZeroBased().

68  virtual int32 NumIndices() const { return acoustic_model_.NumPdfs(); }

◆ ResetLogLikeCache()

void ResetLogLikeCache ( )
protected

Definition at line 74 of file decodable-am-diag-gmm.cc.

References DecodableAmDiagGmmUnmapped::acoustic_model_, DecodableAmDiagGmmUnmapped::log_like_cache_, and AmDiagGmm::NumPdfs().

Referenced by DecodableAmDiagGmmUnmapped::DecodableAmDiagGmmUnmapped(), and DecodableAmDiagGmmRegtreeMllr::InitCache().

74 void DecodableAmDiagGmmUnmapped::ResetLogLikeCache() {
75  if (static_cast<int32>(log_like_cache_.size()) != acoustic_model_.NumPdfs()) {
76  log_like_cache_.resize(acoustic_model_.NumPdfs());
77  }
78  vector<LikelihoodCacheRecord>::iterator it = log_like_cache_.begin(),
79  end = log_like_cache_.end();
80  for (; it != end; ++it) { it->hit_time = -1; }
81 }
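The caching scheme used here can be sketched as follows: each state keeps the last computed value together with the frame it was computed for, and a lookup succeeds only when the stored frame matches the requested one. `CacheRecord` and `FrameCache` are hypothetical names for this illustration and are not the Kaldi types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-state cache mirroring the hit_time idea in the listing.
struct CacheRecord {
  float log_like = 0.0f;
  int32_t hit_time = -1;  // frame for which log_like is valid; -1 = never
};

class FrameCache {
 public:
  explicit FrameCache(std::size_t num_states) : records_(num_states) {}

  // Marks every entry as stale without freeing memory.
  void Reset() {
    for (CacheRecord &r : records_) r.hit_time = -1;
  }

  // Returns true and fills *log_like if a value is cached for this frame.
  bool Lookup(int32_t frame, std::size_t state, float *log_like) const {
    assert(state < records_.size());
    if (records_[state].hit_time != frame) return false;
    *log_like = records_[state].log_like;
    return true;
  }

  // Records the value computed for this state on this frame.
  void Store(int32_t frame, std::size_t state, float log_like) {
    assert(state < records_.size());
    records_[state].log_like = log_like;
    records_[state].hit_time = frame;
  }

 private:
  std::vector<CacheRecord> records_;
};
```

Storing the frame alongside the value means that moving to a new frame implicitly invalidates old entries, so no per-frame clearing is needed; an explicit reset is only required when frame numbers may be reused.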

Member Data Documentation

◆ acoustic_model_

◆ data_squared_

Vector<BaseFloat> data_squared_
private

Cache for fast likelihood calculation.

Definition at line 91 of file decodable-am-diag-gmm.h.

Referenced by DecodableAmDiagGmmUnmapped::LogLikelihoodZeroBased(), and DecodableAmDiagGmmRegtreeMllr::LogLikelihoodZeroBased().

◆ feature_matrix_

◆ log_like_cache_

◆ log_sum_exp_prune_

◆ previous_frame_


The documentation for this class was generated from the following files: