SingleUtteranceGmmDecoder Class Reference

You will instantiate this class when you want to decode a single utterance using the online-decoding setup. More...

#include <online-gmm-decoding.h>

Collaboration diagram for SingleUtteranceGmmDecoder:

Public Member Functions

 SingleUtteranceGmmDecoder (const OnlineGmmDecodingConfig &config, const OnlineGmmDecodingModels &models, const OnlineFeaturePipeline &feature_prototype, const fst::Fst< fst::StdArc > &fst, const OnlineGmmAdaptationState &adaptation_state)
 
OnlineFeaturePipeline &  FeaturePipeline ()
 
void AdvanceDecoding ()
 Advance the decoding as far as we can. More...
 
void FinalizeDecoding ()
 Finalize the decoding. More...
 
bool HaveTransform () const
 Returns true if we already have an fMLLR transform. More...
 
void EstimateFmllr (bool end_of_utterance)
 Estimate the [basis-]fMLLR transform and apply it to the features. More...
 
void GetAdaptationState (OnlineGmmAdaptationState *adaptation_state) const
 
void GetLattice (bool rescore_if_needed, bool end_of_utterance, CompactLattice *clat) const
 Gets the lattice. More...
 
void GetBestPath (bool end_of_utterance, Lattice *best_path) const
 Outputs an FST corresponding to the single best path through the current lattice. More...
 
BaseFloat FinalRelativeCost ()
 Returns a number >= 0 that will be close to zero if the final-probs were close to the best probs active on the final frame. More...
 
bool EndpointDetected (const OnlineEndpointConfig &config)
 This function calls EndpointDetected from online-endpoint.h, with the required arguments. More...
 
 ~SingleUtteranceGmmDecoder ()
 

Private Member Functions

bool GetGaussianPosteriors (bool end_of_utterance, GaussPost *gpost)
 
bool RescoringIsNeeded () const
 Returns true if doing a lattice rescoring pass would have any point, i.e. More...
 

Private Attributes

OnlineGmmDecodingConfig config_
 
std::vector< int32 > silence_phones_
 
const OnlineGmmDecodingModels & models_
 
OnlineFeaturePipeline * feature_pipeline_
 
const OnlineGmmAdaptationState & orig_adaptation_state_
 
OnlineGmmAdaptationState adaptation_state_
 
LatticeFasterOnlineDecoder decoder_
 

Detailed Description

You will instantiate this class when you want to decode a single utterance using the online-decoding setup.

This is an alternative to manually putting things together yourself.

Definition at line 216 of file online-gmm-decoding.h.
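
The sketch below shows one way this class is typically driven, loosely following the online2 example programs (e.g. online2-wav-gmm-latgen-faster.cc). It is only an illustration: MoreUtterances(), GetNextChunk(), the sampling rate, and the way the configs are filled in are placeholders and assumptions, not part of this class's API.

// A minimal usage sketch (assumptions noted above): decode successive
// utterances of one speaker, carrying the adaptation state forward.
void DecodeSpeakerSketch(const OnlineGmmDecodingConfig &decode_config,
                         const OnlineFeaturePipelineConfig &feature_config,
                         const OnlineEndpointConfig &endpoint_config,
                         const std::string &fst_rxfilename,
                         BaseFloat sampling_rate) {
  OnlineGmmDecodingModels models(decode_config);
  OnlineFeaturePipeline feature_prototype(feature_config);
  fst::Fst<fst::StdArc> *decode_fst = fst::ReadFstKaldiGeneric(fst_rxfilename);
  OnlineGmmAdaptationState adaptation_state;  // carried across utterances.

  Vector<BaseFloat> chunk;
  while (MoreUtterances()) {                  // hypothetical helper.
    SingleUtteranceGmmDecoder decoder(decode_config, models, feature_prototype,
                                      *decode_fst, adaptation_state);
    while (GetNextChunk(&chunk)) {            // hypothetical audio source.
      decoder.FeaturePipeline().AcceptWaveform(sampling_rate, chunk);
      decoder.AdvanceDecoding();
      if (decoder.EndpointDetected(endpoint_config))
        break;
    }
    decoder.FeaturePipeline().InputFinished();  // flush out the last frames.
    decoder.AdvanceDecoding();
    decoder.FinalizeDecoding();

    bool end_of_utterance = true;
    decoder.EstimateFmllr(end_of_utterance);    // update the speaker transform.
    CompactLattice clat;
    decoder.GetLattice(true /* rescore_if_needed */, end_of_utterance, &clat);
    // ... write or post-process clat here ...
    decoder.GetAdaptationState(&adaptation_state);  // reuse for the next utterance.
  }
  delete decode_fst;
}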

Constructor & Destructor Documentation

◆ SingleUtteranceGmmDecoder()

SingleUtteranceGmmDecoder ( const OnlineGmmDecodingConfig &  config,
                            const OnlineGmmDecodingModels &  models,
                            const OnlineFeaturePipeline &  feature_prototype,
                            const fst::Fst< fst::StdArc > &  fst,
                            const OnlineGmmAdaptationState &  adaptation_state 
                          )

Definition at line 48 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::adaptation_state_, SingleUtteranceGmmDecoder::config_, SingleUtteranceGmmDecoder::decoder_, SingleUtteranceGmmDecoder::feature_pipeline_, LatticeFasterDecoderTpl< FST, Token >::InitDecoding(), KALDI_ERR, OnlineFeaturePipeline::SetTransform(), OnlineGmmDecodingConfig::silence_phones, SingleUtteranceGmmDecoder::silence_phones_, kaldi::SortAndUniq(), kaldi::SplitStringToIntegers(), and OnlineGmmAdaptationState::transform.

    : config_(config), models_(models),
      feature_pipeline_(feature_prototype.New()),
      orig_adaptation_state_(adaptation_state),
      adaptation_state_(adaptation_state),
      decoder_(fst, config.faster_decoder_opts) {
  if (!SplitStringToIntegers(config_.silence_phones, ":", false,
                             &silence_phones_))
    KALDI_ERR << "Bad --silence-phones option '"
              << config_.silence_phones << "'";
  SortAndUniq(&silence_phones_);
  feature_pipeline_->SetTransform(adaptation_state_.transform);
  decoder_.InitDecoding();
}

◆ ~SingleUtteranceGmmDecoder()

Definition at line 305 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::feature_pipeline_.

SingleUtteranceGmmDecoder::~SingleUtteranceGmmDecoder() {
  delete feature_pipeline_;
}

Member Function Documentation

◆ AdvanceDecoding()

void AdvanceDecoding ( )

Advance the decoding as far as we can.

May also estimate fMLLR after advancing the decoding, depending on the configuration values in config_.adaptation_policy_opts. [Note: we expect the user will also call EstimateFmllr() at utterance end, which should generally improve the quality of the estimated transforms, although we don't rely on this].

Definition at line 69 of file online-gmm-decoding.cc.

References OnlineGmmDecodingConfig::acoustic_scale, OnlineGmmDecodingConfig::adaptation_policy_opts, LatticeFasterDecoderTpl< FST, Token >::AdvanceDecoding(), SingleUtteranceGmmDecoder::config_, SingleUtteranceGmmDecoder::decoder_, OnlineGmmDecodingAdaptationPolicyConfig::DoAdapt(), SingleUtteranceGmmDecoder::EstimateFmllr(), SingleUtteranceGmmDecoder::feature_pipeline_, OnlineFeaturePipeline::FrameShiftInSeconds(), OnlineGmmDecodingModels::GetModel(), OnlineGmmDecodingModels::GetOnlineAlignmentModel(), OnlineGmmDecodingModels::GetTransitionModel(), SingleUtteranceGmmDecoder::HaveTransform(), SingleUtteranceGmmDecoder::models_, LatticeFasterDecoderTpl< FST, Token >::NumFramesDecoded(), MatrixBase< Real >::NumRows(), SingleUtteranceGmmDecoder::orig_adaptation_state_, and OnlineGmmAdaptationState::transform.

void SingleUtteranceGmmDecoder::AdvanceDecoding() {

  const AmDiagGmm &am_gmm = (HaveTransform() ? models_.GetModel() :
                             models_.GetOnlineAlignmentModel());

  // The decodable object is lightweight, we lose nothing
  // from constructing it each time we want to decode more of the
  // input.
  DecodableDiagGmmScaledOnline decodable(am_gmm,
                                         models_.GetTransitionModel(),
                                         config_.acoustic_scale,
                                         feature_pipeline_);

  int32 old_frames = decoder_.NumFramesDecoded();

  // This will decode as many frames as are currently available.
  decoder_.AdvanceDecoding(&decodable);

  { // possibly estimate fMLLR.
    int32 new_frames = decoder_.NumFramesDecoded();
    BaseFloat frame_shift = feature_pipeline_->FrameShiftInSeconds();
    // if the original adaptation state (at utterance-start) had no transform,
    // then this means it's the first utt of the speaker... even if not, if we
    // don't have a transform it probably makes sense to treat it as the 1st utt
    // of the speaker, i.e. to do fMLLR adaptation sooner.
    bool is_first_utterance_of_speaker =
        (orig_adaptation_state_.transform.NumRows() == 0);
    bool end_of_utterance = false;
    if (config_.adaptation_policy_opts.DoAdapt(old_frames * frame_shift,
                                               new_frames * frame_shift,
                                               is_first_utterance_of_speaker))
      this->EstimateFmllr(end_of_utterance);
  }
}

◆ EndpointDetected()

bool EndpointDetected ( const OnlineEndpointConfig &  config )

This function calls EndpointDetected from online-endpoint.h, with the required arguments.
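
As a sketch (using the same chunk loop as in the class description above; the silence-phone string is illustrative and the config would normally be registered from the command line):

// Sketch: check for an endpoint after each call to AdvanceDecoding().
OnlineEndpointConfig endpoint_config;
endpoint_config.silence_phones = "1:2:3:4";  // must match your phone set.

while (GetNextChunk(&chunk)) {               // hypothetical audio source.
  decoder.FeaturePipeline().AcceptWaveform(sampling_rate, chunk);
  decoder.AdvanceDecoding();
  if (decoder.EndpointDetected(endpoint_config))
    break;                                   // stop feeding audio; finalize the utterance.
}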

Definition at line 310 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::decoder_, kaldi::EndpointDetected(), SingleUtteranceGmmDecoder::feature_pipeline_, OnlineFeaturePipeline::FrameShiftInSeconds(), OnlineGmmDecodingModels::GetTransitionModel(), and SingleUtteranceGmmDecoder::models_.

bool SingleUtteranceGmmDecoder::EndpointDetected(
    const OnlineEndpointConfig &config) {
  const TransitionModel &tmodel = models_.GetTransitionModel();
  return kaldi::EndpointDetected(config, tmodel,
                                 feature_pipeline_->FrameShiftInSeconds(),
                                 decoder_);
}

◆ EstimateFmllr()

void EstimateFmllr ( bool  end_of_utterance)

Estimate the [basis-]fMLLR transform and apply it to the features.

This will get used if you call RescoreLattice() or if you just continue decoding; however to get it applied retroactively you'd have to call RescoreLattice(). "end_of_utterance" just affects how we interpret the final-probs in the lattice. This should generally be true if you think you've reached the end of the grammar, and false otherwise.
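
For example (a sketch using only calls documented on this page), estimating the transform at utterance end and then asking GetLattice() to rescore lets the new transform take effect retroactively:

// Sketch: end-of-utterance adaptation followed by a rescored lattice.
bool end_of_utterance = true;
decoder.EstimateFmllr(end_of_utterance);
CompactLattice clat;
decoder.GetLattice(true /* rescore_if_needed */, end_of_utterance, &clat);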

Definition at line 208 of file online-gmm-decoding.cc.

References FmllrDiagGmmAccs::AccumulateFromPosteriors(), SingleUtteranceGmmDecoder::adaptation_state_, OnlineGmmDecodingConfig::basis_opts, AffineXformStats::beta_, BasisFmllrEstimate::ComputeTransform(), SingleUtteranceGmmDecoder::config_, SingleUtteranceGmmDecoder::decoder_, AffineXformStats::Dim(), VectorBase< Real >::Dim(), BasisFmllrEstimate::Dim(), OnlineFeaturePipeline::Dim(), SingleUtteranceGmmDecoder::feature_pipeline_, OnlineFeaturePipeline::FreezeCmvn(), OnlineFeaturePipeline::GetAsMatrix(), OnlineGmmDecodingModels::GetFmllrBasis(), OnlineFeaturePipeline::GetFrame(), SingleUtteranceGmmDecoder::GetGaussianPosteriors(), OnlineGmmDecodingModels::GetModel(), AmDiagGmm::GetPdf(), kaldi::GetVerboseLevel(), rnnlm::i, FmllrDiagGmmAccs::Init(), rnnlm::j, KALDI_ERR, KALDI_VLOG, KALDI_WARN, SingleUtteranceGmmDecoder::models_, LatticeFasterDecoderTpl< FST, Token >::NumFramesDecoded(), MatrixBase< Real >::NumRows(), SingleUtteranceGmmDecoder::orig_adaptation_state_, OnlineFeaturePipeline::SetTransform(), OnlineGmmAdaptationState::spk_stats, and OnlineGmmAdaptationState::transform.

Referenced by SingleUtteranceGmmDecoder::AdvanceDecoding().

void SingleUtteranceGmmDecoder::EstimateFmllr(bool end_of_utterance) {
  if (decoder_.NumFramesDecoded() == 0) {
    KALDI_WARN << "You have decoded no data so cannot estimate fMLLR.";
  }

  if (GetVerboseLevel() >= 2) {
    Matrix<BaseFloat> feats;
    feature_pipeline_->GetAsMatrix(&feats);
    KALDI_VLOG(2) << "Features are " << feats;
  }

  GaussPost gpost;
  GetGaussianPosteriors(end_of_utterance, &gpost);

  FmllrDiagGmmAccs &spk_stats = adaptation_state_.spk_stats;

  if (spk_stats.beta_ !=
      orig_adaptation_state_.spk_stats.beta_) {
    // This could happen if the user called EstimateFmllr() twice on the
    // same utterance... we don't want to count any stats twice so we
    // have to reset the stats to what they were before this utterance
    // (possibly empty).
    spk_stats = orig_adaptation_state_.spk_stats;
  }

  int32 dim = feature_pipeline_->Dim();
  if (spk_stats.Dim() == 0)
    spk_stats.Init(dim);

  Matrix<BaseFloat> empty_transform;
  feature_pipeline_->SetTransform(empty_transform);
  Vector<BaseFloat> feat(dim);

  if (adaptation_state_.transform.NumRows() == 0) {
    // If this is the first time we're estimating fMLLR, freeze the CMVN to its
    // current value.  It doesn't matter too much what value this is, since we
    // have already computed the Gaussian-level alignments (it may have a small
    // effect if the basis is very small and doesn't include an offset as part
    // of the transform).
    feature_pipeline_->FreezeCmvn();
  }

  // GetModel() returns the model to be used for estimating
  // transforms.
  const AmDiagGmm &am_gmm = models_.GetModel();

  for (size_t i = 0; i < gpost.size(); i++) {
    feature_pipeline_->GetFrame(i, &feat);
    for (size_t j = 0; j < gpost[i].size(); j++) {
      int32 pdf_id = gpost[i][j].first;  // caution: this gpost has pdf-id
                                         // instead of transition-id, which is
                                         // unusual.
      const Vector<BaseFloat> &posterior(gpost[i][j].second);
      spk_stats.AccumulateFromPosteriors(am_gmm.GetPdf(pdf_id),
                                         feat, posterior);
    }
  }

  const BasisFmllrEstimate &basis = models_.GetFmllrBasis();
  if (basis.Dim() == 0)
    KALDI_ERR << "In order to estimate fMLLR, you need to supply the "
              << "--fmllr-basis option.";
  Vector<BaseFloat> basis_coeffs;
  BaseFloat impr = basis.ComputeTransform(spk_stats,
                                          &adaptation_state_.transform,
                                          &basis_coeffs, config_.basis_opts);
  KALDI_VLOG(3) << "Objective function improvement from basis-fMLLR is "
                << (impr / spk_stats.beta_) << " per frame, over "
                << spk_stats.beta_ << " frames, #params estimated is "
                << basis_coeffs.Dim();
  feature_pipeline_->SetTransform(adaptation_state_.transform);
}

◆ FeaturePipeline()

OnlineFeaturePipeline& FeaturePipeline ( )
inline

Definition at line 224 of file online-gmm-decoding.h.

OnlineFeaturePipeline &FeaturePipeline() { return *feature_pipeline_; }

◆ FinalizeDecoding()

void FinalizeDecoding ( )

Finalize the decoding.

Cleanups and prunes remaining tokens, so the final result is faster to obtain.

Definition at line 105 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::decoder_, and LatticeFasterDecoderTpl< FST, Token >::FinalizeDecoding().

void SingleUtteranceGmmDecoder::FinalizeDecoding() {
  decoder_.FinalizeDecoding();
}

◆ FinalRelativeCost()

BaseFloat FinalRelativeCost ( )
inline

This function returns a number >= 0 that will be close to zero if the final-probs were close to the best probs active on the final frame.

(The value is based on the first-pass decoding.) If it's close to zero (e.g. < 5, as a guess), it means you reached the end of the grammar with good probability, which can be taken as a good sign that the input was OK.
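
For instance (a sketch; the 5.0 threshold is just the guess mentioned above, not a tuned value):

// Sketch: crude sanity check on the first-pass decoding result.
BaseFloat final_relative_cost = decoder.FinalRelativeCost();
if (final_relative_cost > 5.0)
  KALDI_WARN << "Utterance may not have reached the end of the grammar; "
             << "final relative cost = " << final_relative_cost;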

Definition at line 276 of file online-gmm-decoding.h.

References SingleUtteranceGmmDecoder::decoder_.

BaseFloat FinalRelativeCost() { return decoder_.FinalRelativeCost(); }

◆ GetAdaptationState()

void GetAdaptationState ( OnlineGmmAdaptationState *  adaptation_state ) const

Definition at line 287 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::adaptation_state_, OnlineGmmAdaptationState::cmvn_state, SingleUtteranceGmmDecoder::feature_pipeline_, and OnlineFeaturePipeline::GetCmvnState().

void SingleUtteranceGmmDecoder::GetAdaptationState(
    OnlineGmmAdaptationState *adaptation_state) const {
  *adaptation_state = adaptation_state_;
  feature_pipeline_->GetCmvnState(&adaptation_state->cmvn_state);
}

◆ GetBestPath()

void GetBestPath ( bool  end_of_utterance,
                   Lattice *  best_path 
                 ) const

Outputs an FST corresponding to the single best path through the current lattice.

If "use_final_probs" is true AND we reached the final-state of the graph then it will include those as final-probs, else it will treat all final-probs as one.

Definition at line 341 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::decoder_, and LatticeFasterOnlineDecoderTpl< FST >::GetBestPath().

void SingleUtteranceGmmDecoder::GetBestPath(bool end_of_utterance,
                                            Lattice *best_path) const {
  decoder_.GetBestPath(best_path, end_of_utterance);
}

◆ GetGaussianPosteriors()

bool GetGaussianPosteriors ( bool  end_of_utterance,
                             GaussPost *  gpost 
                           )
private

Definition at line 111 of file online-gmm-decoding.cc.

References DiagGmm::ComponentPosteriors(), SingleUtteranceGmmDecoder::config_, kaldi::ConvertPosteriorToPdfs(), SingleUtteranceGmmDecoder::decoder_, fst::DeterminizeLatticePruned(), OnlineFeaturePipeline::Dim(), SingleUtteranceGmmDecoder::feature_pipeline_, OnlineGmmDecodingConfig::fmllr_lattice_beam, OnlineFeaturePipeline::GetFrame(), OnlineGmmDecodingModels::GetModel(), OnlineGmmDecodingModels::GetOnlineAlignmentModel(), AmDiagGmm::GetPdf(), LatticeFasterOnlineDecoderTpl< FST >::GetRawLatticePruned(), OnlineGmmDecodingModels::GetTransitionModel(), SingleUtteranceGmmDecoder::HaveTransform(), rnnlm::i, rnnlm::j, KALDI_ASSERT, KALDI_VLOG, KALDI_WARN, kaldi::LatticeForwardBackward(), SingleUtteranceGmmDecoder::models_, LatticeFasterDecoderTpl< FST, Token >::NumFramesDecoded(), kaldi::PruneLattice(), VectorBase< Real >::Scale(), SingleUtteranceGmmDecoder::silence_phones_, OnlineGmmDecodingConfig::silence_weight, kaldi::TopSortLatticeIfNeeded(), and kaldi::WeightSilencePost().

Referenced by SingleUtteranceGmmDecoder::EstimateFmllr().

bool SingleUtteranceGmmDecoder::GetGaussianPosteriors(bool end_of_utterance,
                                                      GaussPost *gpost) {
  // Gets the Gaussian-level posteriors for this utterance, using whatever
  // features and model we are currently decoding with.  We'll use these
  // to estimate basis-fMLLR with.
  if (decoder_.NumFramesDecoded() == 0) {
    KALDI_WARN << "You have decoded no data so cannot estimate fMLLR.";
    return false;
  }

  KALDI_ASSERT(config_.fmllr_lattice_beam > 0.0);

  // Note: we'll just use whatever acoustic scaling factor we were decoding
  // with.  This is in the lattice that we get from decoder_.GetRawLattice().
  Lattice raw_lat;
  decoder_.GetRawLatticePruned(&raw_lat, end_of_utterance,
                               config_.fmllr_lattice_beam);

  // At this point we could rescore the lattice if we wanted, and
  // this might improve the accuracy on long utterances that were
  // the first utterance of that speaker, if we had already
  // estimated the fMLLR by the time we reach this code (e.g. this
  // was the second call).  We don't do this right now.

  PruneLattice(config_.fmllr_lattice_beam, &raw_lat);

#if 1 // Do determinization.
  Lattice det_lat; // lattice-determinized lattice-- represent this as Lattice
                   // not CompactLattice, as LatticeForwardBackward() does not
                   // accept CompactLattice.


  fst::Invert(&raw_lat); // want to determinize on words.
  fst::ILabelCompare<kaldi::LatticeArc> ilabel_comp;
  fst::ArcSort(&raw_lat, ilabel_comp); // improves efficiency of determinization

  fst::DeterminizeLatticePruned(raw_lat,
                                double(config_.fmllr_lattice_beam),
                                &det_lat);

  fst::Invert(&det_lat); // invert back.

  if (det_lat.NumStates() == 0) {
    // Do nothing if the lattice is empty.  This should not happen.
    KALDI_WARN << "Got empty lattice.  Not estimating fMLLR.";
    return false;
  }
#else
  Lattice &det_lat = raw_lat; // Don't determinize.
#endif
  TopSortLatticeIfNeeded(&det_lat);

  // Note: the acoustic scale we use here is whatever we decoded with.
  Posterior post;
  BaseFloat tot_fb_like = LatticeForwardBackward(det_lat, &post);

  KALDI_VLOG(3) << "Lattice forward-backward likelihood was "
                << (tot_fb_like / post.size()) << " per frame over " << post.size()
                << " frames.";

  ConstIntegerSet<int32> silence_set(silence_phones_); // faster lookup
  const TransitionModel &trans_model = models_.GetTransitionModel();
  WeightSilencePost(trans_model, silence_set,
                    config_.silence_weight, &post);

  const AmDiagGmm &am_gmm = (HaveTransform() ? models_.GetModel() :
                             models_.GetOnlineAlignmentModel());


  Posterior pdf_post;
  ConvertPosteriorToPdfs(trans_model, post, &pdf_post);

  Vector<BaseFloat> feat(feature_pipeline_->Dim());

  double tot_like = 0.0, tot_weight = 0.0;
  gpost->resize(pdf_post.size());
  for (size_t i = 0; i < pdf_post.size(); i++) {
    feature_pipeline_->GetFrame(i, &feat);
    for (size_t j = 0; j < pdf_post[i].size(); j++) {
      int32 pdf_id = pdf_post[i][j].first;
      BaseFloat weight = pdf_post[i][j].second;
      const DiagGmm &gmm = am_gmm.GetPdf(pdf_id);
      Vector<BaseFloat> this_post_vec;
      BaseFloat like = gmm.ComponentPosteriors(feat, &this_post_vec);
      this_post_vec.Scale(weight);
      tot_like += like * weight;
      tot_weight += weight;
      (*gpost)[i].push_back(std::make_pair(pdf_id, this_post_vec));
    }
  }
  KALDI_VLOG(3) << "Average likelihood weighted by posterior was "
                << (tot_like / tot_weight) << " over " << tot_weight
                << " frames (after downweighting silence).";
  return true;
}

◆ GetLattice()

void GetLattice ( bool  rescore_if_needed,
                  bool  end_of_utterance,
                  CompactLattice *  clat 
                ) const

Gets the lattice.

If rescore_if_needed is true, and if there is any point in rescoring the state-level lattice (see RescoringIsNeeded()), it will rescore the lattice. The output lattice has any acoustic scaling in it (which will typically be desirable in an online-decoding context); if you want an un-scaled lattice, scale it using ScaleLattice() with the inverse of the acoustic weight. "end_of_utterance" will be true if you want the final-probs to be included.
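
For example, undoing the acoustic scaling as suggested above (a sketch; decode_config is assumed to be the OnlineGmmDecodingConfig the decoder was constructed with):

// Sketch: obtain the lattice and divide out the acoustic scale.
CompactLattice clat;
decoder.GetLattice(true /* rescore_if_needed */, true /* end_of_utterance */, &clat);
if (decode_config.acoustic_scale != 0.0)
  fst::ScaleLattice(fst::AcousticLatticeScale(1.0 / decode_config.acoustic_scale),
                    &clat);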

Definition at line 318 of file online-gmm-decoding.cc.

References OnlineGmmDecodingConfig::acoustic_scale, SingleUtteranceGmmDecoder::config_, SingleUtteranceGmmDecoder::decoder_, LatticeFasterDecoderConfig::det_opts, fst::DeterminizeLatticePhonePrunedWrapper(), OnlineGmmDecodingConfig::faster_decoder_opts, SingleUtteranceGmmDecoder::feature_pipeline_, OnlineGmmDecodingModels::GetFinalModel(), LatticeFasterDecoderTpl< FST, Token >::GetRawLattice(), OnlineGmmDecodingModels::GetTransitionModel(), KALDI_WARN, LatticeFasterDecoderConfig::lattice_beam, SingleUtteranceGmmDecoder::models_, kaldi::PruneLattice(), kaldi::RescoreLattice(), and SingleUtteranceGmmDecoder::RescoringIsNeeded().

void SingleUtteranceGmmDecoder::GetLattice(bool rescore_if_needed,
                                           bool end_of_utterance,
                                           CompactLattice *clat) const {
  Lattice lat;
  double lat_beam = config_.faster_decoder_opts.lattice_beam;
  decoder_.GetRawLattice(&lat, end_of_utterance);
  if (rescore_if_needed && RescoringIsNeeded()) {
    DecodableDiagGmmScaledOnline decodable(models_.GetFinalModel(),
                                           models_.GetTransitionModel(),
                                           config_.acoustic_scale,
                                           feature_pipeline_);

    if (!kaldi::RescoreLattice(&decodable, &lat))
      KALDI_WARN << "Error rescoring lattice";
  }
  PruneLattice(lat_beam, &lat);

  DeterminizeLatticePhonePrunedWrapper(models_.GetTransitionModel(),
                                       &lat, lat_beam, clat,
                                       config_.faster_decoder_opts.det_opts);
}

◆ HaveTransform()

bool HaveTransform ( ) const

Returns true if we already have an fMLLR transform.

The user will already know this; the call is for convenience.

Definition at line 283 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::feature_pipeline_, and OnlineFeaturePipeline::HaveFmllrTransform().

Referenced by SingleUtteranceGmmDecoder::AdvanceDecoding(), and SingleUtteranceGmmDecoder::GetGaussianPosteriors().

bool SingleUtteranceGmmDecoder::HaveTransform() const {
  return (feature_pipeline_->HaveFmllrTransform());
}

◆ RescoringIsNeeded()

bool RescoringIsNeeded ( ) const
private

Returns true if doing a lattice rescoring pass would have any point, i.e.

if we have estimated fMLLR during this utterance, or if we have a discriminative model that differs from the fMLLR model *and* we currently have fMLLR features.

Definition at line 293 of file online-gmm-decoding.cc.

References SingleUtteranceGmmDecoder::adaptation_state_, MatrixBase< Real >::ApproxEqual(), OnlineGmmDecodingModels::GetFinalModel(), OnlineGmmDecodingModels::GetModel(), SingleUtteranceGmmDecoder::models_, MatrixBase< Real >::NumRows(), SingleUtteranceGmmDecoder::orig_adaptation_state_, and OnlineGmmAdaptationState::transform.

Referenced by SingleUtteranceGmmDecoder::GetLattice().

bool SingleUtteranceGmmDecoder::RescoringIsNeeded() const {
  if (orig_adaptation_state_.transform.NumRows() !=
      adaptation_state_.transform.NumRows()) return true;  // fMLLR was estimated
  if (!orig_adaptation_state_.transform.ApproxEqual(
          adaptation_state_.transform)) return true;  // fMLLR was re-estimated
  if (adaptation_state_.transform.NumRows() != 0 &&
      &models_.GetModel() != &models_.GetFinalModel())
    return true; // we have an fMLLR transform, and a discriminatively estimated
                 // model which differs from the one used to estimate fMLLR.
  return false;
}

Member Data Documentation

◆ adaptation_state_

OnlineGmmAdaptationState adaptation_state_
private

◆ config_

OnlineGmmDecodingConfig config_
private

◆ decoder_

LatticeFasterOnlineDecoder decoder_
private

◆ feature_pipeline_

OnlineFeaturePipeline * feature_pipeline_
private

◆ models_

const OnlineGmmDecodingModels & models_
private

◆ orig_adaptation_state_

const OnlineGmmAdaptationState & orig_adaptation_state_
private

◆ silence_phones_

std::vector< int32 > silence_phones_
private
The documentation for this class was generated from the following files:

online-gmm-decoding.h
online-gmm-decoding.cc