All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Modules Pages
LatticeBiglmFasterDecoder Class Reference

This is as LatticeFasterDecoder, but does online composition between HCLG and the "difference language model", which is a deterministic FST that represents the difference between the language model you want and the language model you compiled HCLG with. More...

#include <lattice-biglm-faster-decoder.h>

Collaboration diagram for LatticeBiglmFasterDecoder:

Classes

struct  ForwardLink
 
struct  Token
 
struct  TokenList
 

Public Types

typedef fst::StdArc Arc
 
typedef Arc::Label Label
 
typedef Arc::StateId StateId
 
typedef uint64 PairId
 
typedef Arc::Weight Weight
 

Public Member Functions

 LatticeBiglmFasterDecoder (const fst::Fst< fst::StdArc > &fst, const LatticeBiglmFasterDecoderConfig &config, fst::DeterministicOnDemandFst< fst::StdArc > *lm_diff_fst)
 
void SetOptions (const LatticeBiglmFasterDecoderConfig &config)
 
LatticeBiglmFasterDecoderConfig GetOptions ()
 
 ~LatticeBiglmFasterDecoder ()
 
bool Decode (DecodableInterface *decodable)
 
bool ReachedFinal () const
 says whether a final-state was active on the last frame. More...
 
bool GetBestPath (fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const
 
bool GetRawLattice (fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const
 
bool GetLattice (fst::MutableFst< CompactLatticeArc > *ofst, bool use_final_probs=true) const
 

Private Types

typedef HashList< PairId,
Token * >::Elem 
Elem
 

Private Member Functions

PairId ConstructPair (StateId fst_state, StateId lm_state)
 
void PossiblyResizeHash (size_t num_toks)
 
TokenFindOrAddToken (PairId state_pair, int32 frame, BaseFloat tot_cost, bool emitting, bool *changed)
 
void PruneForwardLinks (int32 frame, bool *extra_costs_changed, bool *links_pruned, BaseFloat delta)
 
void PruneForwardLinksFinal (int32 frame)
 
void PruneTokensForFrame (int32 frame)
 
void PruneActiveTokens (int32 cur_frame, BaseFloat delta)
 
void PruneActiveTokensFinal (int32 cur_frame)
 
BaseFloat GetCutoff (Elem *list_head, size_t *tok_count, BaseFloat *adaptive_beam, Elem **best_elem)
 Gets the weight cutoff. Also counts the active tokens. More...
 
StateId PropagateLm (StateId lm_state, Arc *arc)
 
void ProcessEmitting (DecodableInterface *decodable, int32 frame)
 
void ProcessNonemitting (int32 frame)
 
void DeleteElems (Elem *list)
 
void ClearActiveTokens ()
 

Static Private Member Functions

static StateId PairToState (PairId state_pair)
 
static StateId PairToLmState (PairId state_pair)
 

Private Attributes

HashList< PairId, Token * > toks_
 
std::vector< TokenListactive_toks_
 
std::vector< PairIdqueue_
 
std::vector< BaseFloattmp_array_
 
const fst::Fst< fst::StdArc > & fst_
 
fst::DeterministicOnDemandFst
< fst::StdArc > * 
lm_diff_fst_
 
LatticeBiglmFasterDecoderConfig config_
 
bool warned_noarc_
 
int32 num_toks_
 
bool warned_
 
bool final_active_
 
std::map< Token *, BaseFloatfinal_costs_
 

Detailed Description

This is as LatticeFasterDecoder, but does online composition between HCLG and the "difference language model", which is a deterministic FST that represents the difference between the language model you want and the language model you compiled HCLG with.

The class DeterministicOnDemandFst follows through the epsilons in G for you (assuming G is a standard backoff language model) and makes it look like a determinized FST.

Definition at line 48 of file lattice-biglm-faster-decoder.h.

Member Typedef Documentation

typedef fst::StdArc Arc

Definition at line 50 of file lattice-biglm-faster-decoder.h.

typedef HashList<PairId, Token*>::Elem Elem
private

Definition at line 300 of file lattice-biglm-faster-decoder.h.

typedef Arc::Label Label

Definition at line 51 of file lattice-biglm-faster-decoder.h.

typedef uint64 PairId

Definition at line 54 of file lattice-biglm-faster-decoder.h.

typedef Arc::StateId StateId

Definition at line 52 of file lattice-biglm-faster-decoder.h.

typedef Arc::Weight Weight

Definition at line 55 of file lattice-biglm-faster-decoder.h.

Constructor & Destructor Documentation

LatticeBiglmFasterDecoder ( const fst::Fst< fst::StdArc > &  fst,
const LatticeBiglmFasterDecoderConfig config,
fst::DeterministicOnDemandFst< fst::StdArc > *  lm_diff_fst 
)
inline

Definition at line 57 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::Check(), KALDI_ASSERT, DeterministicOnDemandFst< Arc >::Start(), and LatticeBiglmFasterDecoder::toks_.

60  :
61  fst_(fst), lm_diff_fst_(lm_diff_fst), config_(config),
62  warned_noarc_(false), num_toks_(0) {
63  config.Check();
64  KALDI_ASSERT(fst.Start() != fst::kNoStateId &&
65  lm_diff_fst->Start() != fst::kNoStateId);
66  toks_.SetSize(1000); // just so on the first frame we do something reasonable.
67  }
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
Definition: graph.dox:21
const fst::Fst< fst::StdArc > & fst_
virtual StateId Start()=0
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169

Member Function Documentation

void ClearActiveTokens ( )
inlineprivate

Definition at line 866 of file lattice-biglm-faster-decoder.h.

References rnnlm::i, KALDI_ASSERT, and LatticeBiglmFasterDecoder::Token::next.

Referenced by LatticeBiglmFasterDecoder::Decode(), and LatticeBiglmFasterDecoder::~LatticeBiglmFasterDecoder().

866  { // a cleanup routine, at utt end/begin
867  for (size_t i = 0; i < active_toks_.size(); i++) {
868  // Delete all tokens alive on this frame, and any forward
869  // links they may have.
870  for (Token *tok = active_toks_[i].toks; tok != NULL; ) {
871  tok->DeleteForwardLinks();
872  Token *next_tok = tok->next;
873  delete tok;
874  num_toks_--;
875  tok = next_tok;
876  }
877  }
878  active_toks_.clear();
879  KALDI_ASSERT(num_toks_ == 0);
880  }
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
PairId ConstructPair ( StateId  fst_state,
StateId  lm_state 
)
inlineprivate

Definition at line 229 of file lattice-biglm-faster-decoder.h.

Referenced by LatticeBiglmFasterDecoder::Decode(), LatticeBiglmFasterDecoder::ProcessEmitting(), and LatticeBiglmFasterDecoder::ProcessNonemitting().

229  {
230  return static_cast<PairId>(fst_state) + (static_cast<PairId>(lm_state) << 32);
231  }
bool Decode ( DecodableInterface decodable)
inline

Definition at line 77 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::ClearActiveTokens(), LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::ConstructPair(), LatticeBiglmFasterDecoder::DeleteElems(), LatticeBiglmFasterDecoder::final_active_, LatticeBiglmFasterDecoder::final_costs_, LatticeBiglmFasterDecoder::fst_, DecodableInterface::IsLastFrame(), LatticeFasterDecoderConfig::lattice_beam, LatticeBiglmFasterDecoder::lm_diff_fst_, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::ProcessEmitting(), LatticeBiglmFasterDecoder::ProcessNonemitting(), LatticeFasterDecoderConfig::prune_interval, LatticeBiglmFasterDecoder::PruneActiveTokens(), LatticeBiglmFasterDecoder::PruneActiveTokensFinal(), DeterministicOnDemandFst< Arc >::Start(), LatticeBiglmFasterDecoder::toks_, and LatticeBiglmFasterDecoder::warned_.

Referenced by kaldi::DecodeUtterance().

77  {
78  // clean up from last time:
79  DeleteElems(toks_.Clear());
81  warned_ = false;
82  final_active_ = false;
83  final_costs_.clear();
84  num_toks_ = 0;
85  PairId start_pair = ConstructPair(fst_.Start(), lm_diff_fst_->Start());
86  active_toks_.resize(1);
87  Token *start_tok = new Token(0.0, 0.0, NULL, NULL);
88  active_toks_[0].toks = start_tok;
89  toks_.Insert(start_pair, start_tok);
90  num_toks_++;
92 
93  // We use 1-based indexing for frames in this decoder (if you view it in
94  // terms of features), but note that the decodable object uses zero-based
95  // numbering, which we have to correct for when we call it.
96  for (int32 frame = 1; !decodable->IsLastFrame(frame-2); frame++) {
97  active_toks_.resize(frame+1); // new column
98 
99  ProcessEmitting(decodable, frame);
100 
101  ProcessNonemitting(frame);
102 
103  if (decodable->IsLastFrame(frame-1))
104  PruneActiveTokensFinal(frame);
105  else if (frame % config_.prune_interval == 0)
106  PruneActiveTokens(frame, config_.lattice_beam * 0.1); // use larger delta.
107  }
108  // Returns true if we have any kind of traceback available (not necessarily
109  // to the end state; query ReachedFinal() for that).
110  return !final_costs_.empty();
111  }
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
PairId ConstructPair(StateId fst_state, StateId lm_state)
const fst::Fst< fst::StdArc > & fst_
std::map< Token *, BaseFloat > final_costs_
virtual StateId Start()=0
LatticeBiglmFasterDecoderConfig config_
void PruneActiveTokens(int32 cur_frame, BaseFloat delta)
void ProcessEmitting(DecodableInterface *decodable, int32 frame)
void DeleteElems ( Elem list)
inlineprivate

Definition at line 858 of file lattice-biglm-faster-decoder.h.

References HashList< I, T >::Clear(), and HashList< I, T >::Delete().

Referenced by LatticeBiglmFasterDecoder::Decode(), and LatticeBiglmFasterDecoder::~LatticeBiglmFasterDecoder().

858  {
859  for (Elem *e = list, *e_tail; e != NULL; e = e_tail) {
860  e_tail = e->tail;
861  toks_.Delete(e);
862  }
863  toks_.Clear();
864  }
HashList< PairId, Token * >::Elem Elem
Token* FindOrAddToken ( PairId  state_pair,
int32  frame,
BaseFloat  tot_cost,
bool  emitting,
bool *  changed 
)
inlineprivate

Definition at line 315 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, KALDI_ASSERT, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::toks_, LatticeBiglmFasterDecoder::Token::tot_cost, and HashList< I, T >::Elem::val.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting(), and LatticeBiglmFasterDecoder::ProcessNonemitting().

316  {
317  // Returns the Token pointer. Sets "changed" (if non-NULL) to true
318  // if the token was newly created or the cost changed.
319  KALDI_ASSERT(frame < active_toks_.size());
320  Token *&toks = active_toks_[frame].toks;
321  Elem *e_found = toks_.Find(state_pair);
322  if (e_found == NULL) { // no such token presently.
323  const BaseFloat extra_cost = 0.0;
324  // tokens on the currently final frame have zero extra_cost
325  // as any of them could end up
326  // on the winning path.
327  Token *new_tok = new Token (tot_cost, extra_cost, NULL, toks);
328  // NULL: no forward links yet
329  toks = new_tok;
330  num_toks_++;
331  toks_.Insert(state_pair, new_tok);
332  if (changed) *changed = true;
333  return new_tok;
334  } else {
335  Token *tok = e_found->val; // There is an existing Token for this state.
336  if (tok->tot_cost > tot_cost) { // replace old token
337  tok->tot_cost = tot_cost;
338  // we don't allocate a new token, the old stays linked in active_toks_
339  // we only replace the tot_cost
340  // in the current frame, there are no forward links (and no extra_cost)
341  // only in ProcessNonemitting we have to delete forward links
342  // in case we visit a state for the second time
343  // those forward links, that lead to this replaced token before:
344  // they remain and will hopefully be pruned later (PruneForwardLinks...)
345  if (changed) *changed = true;
346  } else {
347  if (changed) *changed = false;
348  }
349  return tok;
350  }
351  }
HashList< PairId, Token * >::Elem Elem
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
bool GetBestPath ( fst::MutableFst< LatticeArc > *  ofst,
bool  use_final_probs = true 
) const
inline

Definition at line 120 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::GetRawLattice().

Referenced by kaldi::DecodeUtterance().

121  {
122  fst::VectorFst<LatticeArc> fst;
123  if (!GetRawLattice(&fst, use_final_probs)) return false;
124  // std::cout << "Raw lattice is:\n";
125  // fst::FstPrinter<LatticeArc> fstprinter(fst, NULL, NULL, NULL, false, true);
126  // fstprinter.Print(&std::cout, "standard output");
127  ShortestPath(fst, ofst);
128  return true;
129  }
bool GetRawLattice(fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const
Definition: graph.dox:21
BaseFloat GetCutoff ( Elem list_head,
size_t *  tok_count,
BaseFloat adaptive_beam,
Elem **  best_elem 
)
inlineprivate

Gets the weight cutoff. Also counts the active tokens.

Definition at line 616 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::beam, LatticeFasterDecoderConfig::beam_delta, LatticeBiglmFasterDecoder::config_, count, LatticeFasterDecoderConfig::max_active, HashList< I, T >::Elem::tail, and LatticeBiglmFasterDecoder::tmp_array_.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting().

617  {
618  BaseFloat best_weight = std::numeric_limits<BaseFloat>::infinity();
619  // positive == high cost == bad.
620  size_t count = 0;
621  if (config_.max_active == std::numeric_limits<int32>::max()) {
622  for (Elem *e = list_head; e != NULL; e = e->tail, count++) {
623  BaseFloat w = static_cast<BaseFloat>(e->val->tot_cost);
624  if (w < best_weight) {
625  best_weight = w;
626  if (best_elem) *best_elem = e;
627  }
628  }
629  if (tok_count != NULL) *tok_count = count;
630  if (adaptive_beam != NULL) *adaptive_beam = config_.beam;
631  return best_weight + config_.beam;
632  } else {
633  tmp_array_.clear();
634  for (Elem *e = list_head; e != NULL; e = e->tail, count++) {
635  BaseFloat w = e->val->tot_cost;
636  tmp_array_.push_back(w);
637  if (w < best_weight) {
638  best_weight = w;
639  if (best_elem) *best_elem = e;
640  }
641  }
642  if (tok_count != NULL) *tok_count = count;
643  if (tmp_array_.size() <= static_cast<size_t>(config_.max_active)) {
644  if (adaptive_beam) *adaptive_beam = config_.beam;
645  return best_weight + config_.beam;
646  } else {
647  // the lowest elements (lowest costs, highest likes)
648  // will be put in the left part of tmp_array.
649  std::nth_element(tmp_array_.begin(),
650  tmp_array_.begin()+config_.max_active,
651  tmp_array_.end());
652  // return the tighter of the two beams.
653  BaseFloat ans = std::min(best_weight + config_.beam,
654  *(tmp_array_.begin()+config_.max_active));
655  if (adaptive_beam)
656  *adaptive_beam = std::min(config_.beam,
657  ans - best_weight + config_.beam_delta);
658  return ans;
659  }
660  }
661  }
HashList< PairId, Token * >::Elem Elem
const size_t count
float BaseFloat
Definition: kaldi-types.h:29
LatticeBiglmFasterDecoderConfig config_
bool GetLattice ( fst::MutableFst< CompactLatticeArc > *  ofst,
bool  use_final_probs = true 
) const
inline

Definition at line 204 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_, LatticeFasterDecoderConfig::det_opts, fst::DeterminizeLatticePruned(), LatticeBiglmFasterDecoder::GetRawLattice(), KALDI_WARN, LatticeFasterDecoderConfig::lattice_beam, DeterminizeLatticePrunedOptions::max_mem, and DeterminizeLatticePhonePrunedOptions::max_mem.

205  {
206  Lattice raw_fst;
207  if (!GetRawLattice(&raw_fst, use_final_probs)) return false;
208  Invert(&raw_fst); // make it so word labels are on the input.
209  if (!TopSort(&raw_fst)) // topological sort makes lattice-determinization more efficient
210  KALDI_WARN << "Topological sorting of state-level lattice failed "
211  "(probably your lexicon has empty words or your LM has epsilon cycles; this "
212  " is a bad idea.)";
213  // (in phase where we get backward-costs).
214  fst::ILabelCompare<LatticeArc> ilabel_comp;
215  ArcSort(&raw_fst, ilabel_comp); // sort on ilabel; makes
216  // lattice-determinization more efficient.
217 
219  lat_opts.max_mem = config_.det_opts.max_mem;
220 
221  DeterminizeLatticePruned(raw_fst, config_.lattice_beam, ofst, lat_opts);
222  raw_fst.DeleteStates(); // Free memory-- raw_fst no longer needed.
223  Connect(ofst); // Remove unreachable states... there might be
224  // a small number of these, in some cases.
225  return true;
226  }
bool GetRawLattice(fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const
bool DeterminizeLatticePruned(const ExpandedFst< ArcTpl< Weight > > &ifst, double beam, MutableFst< ArcTpl< CompactLatticeWeightTpl< Weight, IntType > > > *ofst, DeterminizeLatticePrunedOptions opts)
fst::VectorFst< LatticeArc > Lattice
Definition: kaldi-lattice.h:44
#define KALDI_WARN
Definition: kaldi-error.h:130
LatticeBiglmFasterDecoderConfig config_
fst::DeterminizeLatticePhonePrunedOptions det_opts
LatticeBiglmFasterDecoderConfig GetOptions ( )
inline

Definition at line 69 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_.

Referenced by kaldi::DecodeUtterance().

69 { return config_; }
LatticeBiglmFasterDecoderConfig config_
bool GetRawLattice ( fst::MutableFst< LatticeArc > *  ofst,
bool  use_final_probs = true 
) const
inline

Definition at line 133 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::final_costs_, KALDI_ASSERT, KALDI_VLOG, KALDI_WARN, LatticeBiglmFasterDecoder::ForwardLink::next, LatticeBiglmFasterDecoder::num_toks_, and LatticeWeightTpl< BaseFloat >::One().

Referenced by kaldi::DecodeUtterance(), LatticeBiglmFasterDecoder::GetBestPath(), and LatticeBiglmFasterDecoder::GetLattice().

134  {
135  typedef LatticeArc Arc;
136  typedef Arc::StateId StateId;
137  // A PairId will be constructed as: (StateId in fst) + (StateId in lm_diff_fst) << 32;
138  typedef uint64 PairId;
139  typedef Arc::Weight Weight;
140  typedef Arc::Label Label;
141  ofst->DeleteStates();
142  // num-frames plus one (since frames are one-based, and we have
143  // an extra frame for the start-state).
144  int32 num_frames = active_toks_.size() - 1;
145  KALDI_ASSERT(num_frames > 0);
146  unordered_map<Token*, StateId> tok_map(num_toks_/2 + 3); // bucket count
147  // First create all states.
148  for (int32 f = 0; f <= num_frames; f++) {
149  if (active_toks_[f].toks == NULL) {
150  KALDI_WARN << "GetRawLattice: no tokens active on frame " << f
151  << ": not producing lattice.\n";
152  return false;
153  }
154  for (Token *tok = active_toks_[f].toks; tok != NULL; tok = tok->next)
155  tok_map[tok] = ofst->AddState();
156  // The next statement sets the start state of the output FST.
157  // Because we always add new states to the head of the list
158  // active_toks_[f].toks, and the start state was the first one
159  // added, it will be the last one added to ofst.
160  if (f == 0 && ofst->NumStates() > 0)
161  ofst->SetStart(ofst->NumStates()-1);
162  }
163  KALDI_VLOG(3) << "init:" << num_toks_/2 + 3 << " buckets:"
164  << tok_map.bucket_count() << " load:" << tok_map.load_factor()
165  << " max:" << tok_map.max_load_factor();
166  // Now create all arcs.
167  StateId cur_state = 0; // we rely on the fact that we numbered these
168  // consecutively (AddState() returns the numbers in order..)
169  for (int32 f = 0; f <= num_frames; f++) {
170  for (Token *tok = active_toks_[f].toks; tok != NULL; tok = tok->next,
171  cur_state++) {
172  for (ForwardLink *l = tok->links;
173  l != NULL;
174  l = l->next) {
175  unordered_map<Token*, StateId>::const_iterator iter =
176  tok_map.find(l->next_tok);
177  StateId nextstate = iter->second;
178  KALDI_ASSERT(iter != tok_map.end());
179  Arc arc(l->ilabel, l->olabel,
180  Weight(l->graph_cost, l->acoustic_cost),
181  nextstate);
182  ofst->AddArc(cur_state, arc);
183  }
184  if (f == num_frames) {
185  if (use_final_probs && !final_costs_.empty()) {
186  std::map<Token*, BaseFloat>::const_iterator iter =
187  final_costs_.find(tok);
188  if (iter != final_costs_.end())
189  ofst->SetFinal(cur_state, LatticeWeight(iter->second, 0));
190  } else {
191  ofst->SetFinal(cur_state, LatticeWeight::One());
192  }
193  }
194  }
195  }
196  KALDI_ASSERT(cur_state == ofst->NumStates());
197  return (cur_state != 0);
198  }
fst::StdArc::StateId StateId
fst::ArcTpl< LatticeWeight > LatticeArc
Definition: kaldi-lattice.h:40
std::map< Token *, BaseFloat > final_costs_
fst::LatticeWeightTpl< BaseFloat > LatticeWeight
Definition: kaldi-lattice.h:32
static const LatticeWeightTpl One()
#define KALDI_WARN
Definition: kaldi-error.h:130
fst::StdArc::Label Label
fst::StdArc::Weight Weight
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
#define KALDI_VLOG(v)
Definition: kaldi-error.h:136
static StateId PairToLmState ( PairId  state_pair)
inlinestaticprivate

Definition at line 236 of file lattice-biglm-faster-decoder.h.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting(), LatticeBiglmFasterDecoder::ProcessNonemitting(), and LatticeBiglmFasterDecoder::PruneForwardLinksFinal().

236  {
237  return static_cast<StateId>(static_cast<uint32>(state_pair >> 32));
238  }
static StateId PairToState ( PairId  state_pair)
inlinestaticprivate
void PossiblyResizeHash ( size_t  num_toks)
inlineprivate

Definition at line 302 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_, LatticeFasterDecoderConfig::hash_ratio, and LatticeBiglmFasterDecoder::toks_.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting().

302  {
303  size_t new_sz = static_cast<size_t>(static_cast<BaseFloat>(num_toks)
304  * config_.hash_ratio);
305  if (new_sz > toks_.Size()) {
306  toks_.SetSize(new_sz);
307  }
308  }
float BaseFloat
Definition: kaldi-types.h:29
LatticeBiglmFasterDecoderConfig config_
void ProcessEmitting ( DecodableInterface decodable,
int32  frame 
)
inlineprivate

Definition at line 687 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::beam, LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::ConstructPair(), LatticeBiglmFasterDecoder::FindOrAddToken(), LatticeBiglmFasterDecoder::fst_, LatticeBiglmFasterDecoder::GetCutoff(), HashList< I, T >::Elem::key, LatticeBiglmFasterDecoder::Token::links, DecodableInterface::LogLikelihood(), LatticeBiglmFasterDecoder::PairToLmState(), LatticeBiglmFasterDecoder::PairToState(), LatticeBiglmFasterDecoder::PossiblyResizeHash(), LatticeBiglmFasterDecoder::PropagateLm(), fst::Times(), LatticeBiglmFasterDecoder::toks_, LatticeBiglmFasterDecoder::Token::tot_cost, and HashList< I, T >::Elem::val.

Referenced by LatticeBiglmFasterDecoder::Decode().

687  {
688  // Processes emitting arcs for one frame. Propagates from prev_toks_ to cur_toks_.
689  Elem *last_toks = toks_.Clear(); // swapping prev_toks_ / cur_toks_
690  Elem *best_elem = NULL;
691  BaseFloat adaptive_beam;
692  size_t tok_cnt;
693  BaseFloat cur_cutoff = GetCutoff(last_toks, &tok_cnt, &adaptive_beam, &best_elem);
694  PossiblyResizeHash(tok_cnt); // This makes sure the hash is always big enough.
695 
696  BaseFloat next_cutoff = std::numeric_limits<BaseFloat>::infinity();
697  // pruning "online" before having seen all tokens
698 
699  // First process the best token to get a hopefully
700  // reasonably tight bound on the next cutoff.
701  if (best_elem) {
702  PairId state_pair = best_elem->key;
703  StateId state = PairToState(state_pair), // state in "fst"
704  lm_state = PairToLmState(state_pair);
705  Token *tok = best_elem->val;
706  for (fst::ArcIterator<fst::Fst<Arc> > aiter(fst_, state);
707  !aiter.Done();
708  aiter.Next()) {
709  Arc arc = aiter.Value();
710  if (arc.ilabel != 0) { // propagate..
711  PropagateLm(lm_state, &arc); // may affect "arc.weight".
712  // We don't need the return value (the new LM state).
713  arc.weight = Times(arc.weight,
714  Weight(-decodable->LogLikelihood(frame-1, arc.ilabel)));
715  BaseFloat new_weight = arc.weight.Value() + tok->tot_cost;
716  if (new_weight + adaptive_beam < next_cutoff)
717  next_cutoff = new_weight + adaptive_beam;
718  }
719  }
720  }
721 
722  // the tokens are now owned here, in last_toks, and the hash is empty.
723  // 'owned' is a complex thing here; the point is we need to call DeleteElem
724  // on each elem 'e' to let toks_ know we're done with them.
725  for (Elem *e = last_toks, *e_tail; e != NULL; e = e_tail) {
726  // loop this way because we delete "e" as we go.
727  PairId state_pair = e->key;
728  StateId state = PairToState(state_pair),
729  lm_state = PairToLmState(state_pair);
730  Token *tok = e->val;
731  if (tok->tot_cost <= cur_cutoff) {
732  for (fst::ArcIterator<fst::Fst<Arc> > aiter(fst_, state);
733  !aiter.Done();
734  aiter.Next()) {
735  const Arc &arc_ref = aiter.Value();
736  if (arc_ref.ilabel != 0) { // propagate..
737  Arc arc(arc_ref);
738  StateId next_lm_state = PropagateLm(lm_state, &arc);
739  BaseFloat ac_cost = -decodable->LogLikelihood(frame-1, arc.ilabel),
740  graph_cost = arc.weight.Value(),
741  cur_cost = tok->tot_cost,
742  tot_cost = cur_cost + ac_cost + graph_cost;
743  if (tot_cost > next_cutoff) continue;
744  else if (tot_cost + config_.beam < next_cutoff)
745  next_cutoff = tot_cost + config_.beam; // prune by best current token
746  PairId next_pair = ConstructPair(arc.nextstate, next_lm_state);
747  Token *next_tok = FindOrAddToken(next_pair, frame, tot_cost, true, NULL);
748  // true: emitting, NULL: no change indicator needed
749 
750  // Add ForwardLink from tok to next_tok (put on head of list tok->links)
751  tok->links = new ForwardLink(next_tok, arc.ilabel, arc.olabel,
752  graph_cost, ac_cost, tok->links);
753  }
754  } // for all arcs
755  }
756  e_tail = e->tail;
757  toks_.Delete(e); // delete Elem
758  }
759  }
StateId PropagateLm(StateId lm_state, Arc *arc)
PairId ConstructPair(StateId fst_state, StateId lm_state)
static StateId PairToLmState(PairId state_pair)
const fst::Fst< fst::StdArc > & fst_
HashList< PairId, Token * >::Elem Elem
Token * FindOrAddToken(PairId state_pair, int32 frame, BaseFloat tot_cost, bool emitting, bool *changed)
LatticeWeightTpl< FloatType > Times(const LatticeWeightTpl< FloatType > &w1, const LatticeWeightTpl< FloatType > &w2)
float BaseFloat
Definition: kaldi-types.h:29
LatticeBiglmFasterDecoderConfig config_
BaseFloat GetCutoff(Elem *list_head, size_t *tok_count, BaseFloat *adaptive_beam, Elem **best_elem)
Gets the weight cutoff. Also counts the active tokens.
static StateId PairToState(PairId state_pair)
void ProcessNonemitting ( int32  frame)
inlineprivate

Definition at line 761 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::beam, LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::ConstructPair(), LatticeBiglmFasterDecoder::Token::DeleteForwardLinks(), LatticeBiglmFasterDecoder::FindOrAddToken(), LatticeBiglmFasterDecoder::fst_, KALDI_ASSERT, KALDI_ERR, LatticeBiglmFasterDecoder::Token::links, LatticeBiglmFasterDecoder::PairToLmState(), LatticeBiglmFasterDecoder::PairToState(), LatticeBiglmFasterDecoder::PropagateLm(), LatticeBiglmFasterDecoder::queue_, LatticeBiglmFasterDecoder::toks_, LatticeBiglmFasterDecoder::Token::tot_cost, and LatticeBiglmFasterDecoder::warned_.

Referenced by LatticeBiglmFasterDecoder::Decode().

761  {
762  // note: "frame" is the same as emitting states just processed.
763 
764  // Processes nonemitting arcs for one frame. Propagates within toks_.
765  // Note-- this queue structure is is not very optimal as
766  // it may cause us to process states unnecessarily (e.g. more than once),
767  // but in the baseline code, turning this vector into a set to fix this
768  // problem did not improve overall speed.
769 
770  KALDI_ASSERT(queue_.empty());
771  BaseFloat best_cost = std::numeric_limits<BaseFloat>::infinity();
772  for (const Elem *e = toks_.GetList(); e != NULL; e = e->tail) {
773  queue_.push_back(e->key);
774  // for pruning with current best token
775  best_cost = std::min(best_cost, static_cast<BaseFloat>(e->val->tot_cost));
776  }
777  if (queue_.empty()) {
778  if (!warned_) {
779  KALDI_ERR << "Error in ProcessEmitting: no surviving tokens: frame is "
780  << frame;
781  warned_ = true;
782  }
783  }
784  BaseFloat cutoff = best_cost + config_.beam;
785 
786  while (!queue_.empty()) {
787  PairId state_pair = queue_.back();
788  queue_.pop_back();
789 
790  Token *tok = toks_.Find(state_pair)->val; // would segfault if state not in
791  // toks_ but this can't happen.
792  BaseFloat cur_cost = tok->tot_cost;
793  if (cur_cost > cutoff) // Don't bother processing successors.
794  continue;
795  StateId state = PairToState(state_pair),
796  lm_state = PairToLmState(state_pair);
797  // If "tok" has any existing forward links, delete them,
798  // because we're about to regenerate them. This is a kind
799  // of non-optimality (remember, this is the simple decoder),
800  // but since most states are emitting it's not a huge issue.
801  tok->DeleteForwardLinks(); // necessary when re-visiting
802  tok->links = NULL;
803  for (fst::ArcIterator<fst::Fst<Arc> > aiter(fst_, state);
804  !aiter.Done();
805  aiter.Next()) {
806  const Arc &arc_ref = aiter.Value();
807  if (arc_ref.ilabel == 0) { // propagate nonemitting only...
808  Arc arc(arc_ref);
809  StateId next_lm_state = PropagateLm(lm_state, &arc);
810  BaseFloat graph_cost = arc.weight.Value(),
811  tot_cost = cur_cost + graph_cost;
812  if (tot_cost < cutoff) {
813  bool changed;
814  PairId next_pair = ConstructPair(arc.nextstate, next_lm_state);
815  Token *new_tok = FindOrAddToken(next_pair, frame, tot_cost,
816  false, &changed); // false: non-emit
817 
818  tok->links = new ForwardLink(new_tok, 0, arc.olabel,
819  graph_cost, 0, tok->links);
820 
821  // "changed" tells us whether the new token has a different
822  // cost from before, or is new [if so, add into queue].
823  if (changed) queue_.push_back(next_pair);
824  }
825  }
826  } // for all arcs
827  } // while queue not empty
828  }
StateId PropagateLm(StateId lm_state, Arc *arc)
PairId ConstructPair(StateId fst_state, StateId lm_state)
static StateId PairToLmState(PairId state_pair)
const fst::Fst< fst::StdArc > & fst_
HashList< PairId, Token * >::Elem Elem
Token * FindOrAddToken(PairId state_pair, int32 frame, BaseFloat tot_cost, bool emitting, bool *changed)
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_ERR
Definition: kaldi-error.h:127
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
static StateId PairToState(PairId state_pair)
StateId PropagateLm ( StateId  lm_state,
Arc arc 
)
inlineprivate

Definition at line 663 of file lattice-biglm-faster-decoder.h.

References DeterministicOnDemandFst< Arc >::GetArc(), KALDI_WARN, LatticeBiglmFasterDecoder::lm_diff_fst_, fst::Times(), and LatticeBiglmFasterDecoder::warned_noarc_.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting(), and LatticeBiglmFasterDecoder::ProcessNonemitting().

664  { // returns new LM state.
665  if (arc->olabel == 0) {
666  return lm_state; // no change in LM state if no word crossed.
667  } else { // Propagate in the LM-diff FST.
668  Arc lm_arc;
669  bool ans = lm_diff_fst_->GetArc(lm_state, arc->olabel, &lm_arc);
670  if (!ans) { // this case is unexpected for statistical LMs.
671  if (!warned_noarc_) {
672  warned_noarc_ = true;
673  KALDI_WARN << "No arc available in LM (unlikely to be correct "
674  "if a statistical language model); will not warn again";
675  }
676  arc->weight = Weight::Zero();
677  return lm_state; // doesn't really matter what we return here; will
678  // be pruned.
679  } else {
680  arc->weight = Times(arc->weight, lm_arc.weight);
681  arc->olabel = lm_arc.olabel; // probably will be the same.
682  return lm_arc.nextstate; // return the new LM state.
683  }
684  }
685  }
virtual bool GetArc(StateId s, Label ilabel, Arc *oarc)=0
Note: ilabel must not be epsilon.
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
LatticeWeightTpl< FloatType > Times(const LatticeWeightTpl< FloatType > &w1, const LatticeWeightTpl< FloatType > &w2)
#define KALDI_WARN
Definition: kaldi-error.h:130
void PruneActiveTokens ( int32  cur_frame,
BaseFloat  delta 
)
inlineprivate

Definition at line 566 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, KALDI_VLOG, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::PruneForwardLinks(), and LatticeBiglmFasterDecoder::PruneTokensForFrame().

Referenced by LatticeBiglmFasterDecoder::Decode().

566  {
567  int32 num_toks_begin = num_toks_;
568  for (int32 frame = cur_frame-1; frame >= 0; frame--) {
569  // Reason why we need to prune forward links in this situation:
570  // (1) we have never pruned them (new TokenList)
571  // (2) we have not yet pruned the forward links to the next frame,
572  // after any of those tokens have changed their extra_cost.
573  if (active_toks_[frame].must_prune_forward_links) {
574  bool extra_costs_changed = false, links_pruned = false;
575  PruneForwardLinks(frame, &extra_costs_changed, &links_pruned, delta);
576  if (extra_costs_changed && frame > 0) // any token has changed extra_cost
577  active_toks_[frame-1].must_prune_forward_links = true;
578  if (links_pruned) // any link was pruned
579  active_toks_[frame].must_prune_tokens = true;
580  active_toks_[frame].must_prune_forward_links = false; // job done
581  }
582  if (frame+1 < cur_frame && // except for last frame (no forward links)
583  active_toks_[frame+1].must_prune_tokens) {
584  PruneTokensForFrame(frame+1);
585  active_toks_[frame+1].must_prune_tokens = false;
586  }
587  }
588  KALDI_VLOG(3) << "PruneActiveTokens: pruned tokens from " << num_toks_begin
589  << " to " << num_toks_;
590  }
void PruneForwardLinks(int32 frame, bool *extra_costs_changed, bool *links_pruned, BaseFloat delta)
#define KALDI_VLOG(v)
Definition: kaldi-error.h:136
void PruneActiveTokensFinal ( int32  cur_frame)
inlineprivate

Definition at line 594 of file lattice-biglm-faster-decoder.h.

References KALDI_VLOG, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::PruneForwardLinks(), LatticeBiglmFasterDecoder::PruneForwardLinksFinal(), and LatticeBiglmFasterDecoder::PruneTokensForFrame().

Referenced by LatticeBiglmFasterDecoder::Decode().

594  {
595  // returns true if there were final states active
596  // else returns false and treats all states as final while doing the pruning
597  // (this can be useful if you want partial lattice output,
598  // although it can be dangerous, depending what you want the lattices for).
599  // final_active_ and final_probs_ (a hash) are set internally
600  // by PruneForwardLinksFinal
601  int32 num_toks_begin = num_toks_;
602  PruneForwardLinksFinal(cur_frame); // prune final frame (with final-probs)
603  // sets final_active_ and final_probs_
604  for (int32 frame = cur_frame-1; frame >= 0; frame--) {
605  bool b1, b2; // values not used.
606  BaseFloat dontcare = 0.0; // delta of zero means we must always update
607  PruneForwardLinks(frame, &b1, &b2, dontcare);
608  PruneTokensForFrame(frame+1);
609  }
611  KALDI_VLOG(3) << "PruneActiveTokensFinal: pruned tokens from " << num_toks_begin
612  << " to " << num_toks_;
613  }
void PruneForwardLinks(int32 frame, bool *extra_costs_changed, bool *links_pruned, BaseFloat delta)
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_VLOG(v)
Definition: kaldi-error.h:136
void PruneForwardLinks ( int32  frame,
bool *  extra_costs_changed,
bool *  links_pruned,
BaseFloat  delta 
)
inlineprivate

Definition at line 356 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::ForwardLink::acoustic_cost, LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::Token::extra_cost, LatticeBiglmFasterDecoder::ForwardLink::graph_cost, KALDI_ASSERT, KALDI_WARN, LatticeFasterDecoderConfig::lattice_beam, LatticeBiglmFasterDecoder::ForwardLink::next, LatticeBiglmFasterDecoder::ForwardLink::next_tok, LatticeBiglmFasterDecoder::Token::tot_cost, and LatticeBiglmFasterDecoder::warned_.

Referenced by LatticeBiglmFasterDecoder::PruneActiveTokens(), and LatticeBiglmFasterDecoder::PruneActiveTokensFinal().

358  {
359  // delta is the amount by which the extra_costs must change
360  // If delta is larger, we'll tend to go back less far
361  // toward the beginning of the file.
362  // extra_costs_changed is set to true if extra_cost was changed for any token
363  // links_pruned is set to true if any link in any token was pruned
364 
365  *extra_costs_changed = false;
366  *links_pruned = false;
367  KALDI_ASSERT(frame >= 0 && frame < active_toks_.size());
368  if (active_toks_[frame].toks == NULL ) { // empty list; should not happen.
369  if (!warned_) {
370  KALDI_WARN << "No tokens alive [doing pruning].. warning first "
371  "time only for each utterance\n";
372  warned_ = true;
373  }
374  }
375 
376  // We have to iterate until there is no more change, because the links
377  // are not guaranteed to be in topological order.
378  bool changed = true; // difference new minus old extra cost >= delta ?
379  while (changed) {
380  changed = false;
381  for (Token *tok = active_toks_[frame].toks; tok != NULL; tok = tok->next) {
382  ForwardLink *link, *prev_link=NULL;
383  // will recompute tok_extra_cost for tok.
384  BaseFloat tok_extra_cost = std::numeric_limits<BaseFloat>::infinity();
385  // tok_extra_cost is the best (min) of link_extra_cost of outgoing links
386  for (link = tok->links; link != NULL; ) {
387  // See if we need to excise this link...
388  Token *next_tok = link->next_tok;
389  BaseFloat link_extra_cost = next_tok->extra_cost +
390  ((tok->tot_cost + link->acoustic_cost + link->graph_cost)
391  - next_tok->tot_cost); // difference in brackets is >= 0
392  // link_exta_cost is the difference in score between the best paths
393  // through link source state and through link destination state
394  KALDI_ASSERT(link_extra_cost == link_extra_cost); // check for NaN
395  if (link_extra_cost > config_.lattice_beam) { // excise link
396  ForwardLink *next_link = link->next;
397  if (prev_link != NULL) prev_link->next = next_link;
398  else tok->links = next_link;
399  delete link;
400  link = next_link; // advance link but leave prev_link the same.
401  *links_pruned = true;
402  } else { // keep the link and update the tok_extra_cost if needed.
403  if (link_extra_cost < 0.0) { // this is just a precaution.
404  if (link_extra_cost < -0.01)
405  KALDI_WARN << "Negative extra_cost: " << link_extra_cost;
406  link_extra_cost = 0.0;
407  }
408  if (link_extra_cost < tok_extra_cost)
409  tok_extra_cost = link_extra_cost;
410  prev_link = link; // move to next link
411  link = link->next;
412  }
413  } // for all outgoing links
414  if (fabs(tok_extra_cost - tok->extra_cost) > delta)
415  changed = true; // difference new minus old is bigger than delta
416  tok->extra_cost = tok_extra_cost;
417  // will be +infinity or <= lattice_beam_.
418  // infinity indicates, that no forward link survived pruning
419  } // for all Token on active_toks_[frame]
420  if (changed) *extra_costs_changed = true;
421 
422  // Note: it's theoretically possible that aggressive compiler
423  // optimizations could cause an infinite loop here for small delta and
424  // high-dynamic-range scores.
425  } // while changed
426  }
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_WARN
Definition: kaldi-error.h:130
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
void PruneForwardLinksFinal ( int32  frame)
inlineprivate

Definition at line 431 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::ForwardLink::acoustic_cost, LatticeBiglmFasterDecoder::active_toks_, kaldi::ApproxEqual(), LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::Token::extra_cost, DeterministicOnDemandFst< Arc >::Final(), LatticeBiglmFasterDecoder::final_active_, LatticeBiglmFasterDecoder::final_costs_, LatticeBiglmFasterDecoder::fst_, LatticeBiglmFasterDecoder::ForwardLink::graph_cost, KALDI_ASSERT, KALDI_WARN, LatticeFasterDecoderConfig::lattice_beam, LatticeBiglmFasterDecoder::lm_diff_fst_, LatticeBiglmFasterDecoder::ForwardLink::next, LatticeBiglmFasterDecoder::ForwardLink::next_tok, LatticeBiglmFasterDecoder::PairToLmState(), LatticeBiglmFasterDecoder::PairToState(), LatticeBiglmFasterDecoder::toks_, and LatticeBiglmFasterDecoder::Token::tot_cost.

Referenced by LatticeBiglmFasterDecoder::PruneActiveTokensFinal().

431  {
432  KALDI_ASSERT(static_cast<size_t>(frame+1) == active_toks_.size());
433  if (active_toks_[frame].toks == NULL ) // empty list; should not happen.
434  KALDI_WARN << "No tokens alive at end of file\n";
435 
436  // First go through, working out the best token (do it in parallel
437  // including final-probs and not including final-probs; we'll take
438  // the one with final-probs if it's valid).
439  const BaseFloat infinity = std::numeric_limits<BaseFloat>::infinity();
440  BaseFloat best_cost_final = infinity,
441  best_cost_nofinal = infinity;
442  unordered_map<Token*, BaseFloat> tok_to_final_cost;
443  Elem *cur_toks = toks_.Clear(); // swapping prev_toks_ / cur_toks_
444  for (Elem *e = cur_toks, *e_tail; e != NULL; e = e_tail) {
445  PairId state_pair = e->key;
446  StateId state = PairToState(state_pair),
447  lm_state = PairToLmState(state_pair);
448  Token *tok = e->val;
449  BaseFloat final_cost = fst_.Final(state).Value() +
450  lm_diff_fst_->Final(lm_state).Value();
451  tok_to_final_cost[tok] = final_cost;
452  best_cost_final = std::min(best_cost_final, tok->tot_cost + final_cost);
453  best_cost_nofinal = std::min(best_cost_nofinal, tok->tot_cost);
454  e_tail = e->tail;
455  toks_.Delete(e);
456  }
457  final_active_ = (best_cost_final != infinity);
458 
459  // Now go through tokens on this frame, pruning forward links... may have
460  // to iterate a few times until there is no more change, because the list is
461  // not in topological order.
462 
463  bool changed = true;
464  BaseFloat delta = 1.0e-05;
465  while (changed) {
466  changed = false;
467  for (Token *tok = active_toks_[frame].toks; tok != NULL; tok = tok->next) {
468  ForwardLink *link, *prev_link=NULL;
469  // will recompute tok_extra_cost. It has a term in it that corresponds
470  // to the "final-prob", so instead of initializing tok_extra_cost to infinity
471  // below we set it to the difference between the (score+final_prob) of this token,
472  // and the best such (score+final_prob).
473  BaseFloat tok_extra_cost;
474  if (final_active_) {
475  BaseFloat final_cost = tok_to_final_cost[tok];
476  tok_extra_cost = (tok->tot_cost + final_cost) - best_cost_final;
477  } else
478  tok_extra_cost = tok->tot_cost - best_cost_nofinal;
479 
480  for (link = tok->links; link != NULL; ) {
481  // See if we need to excise this link...
482  Token *next_tok = link->next_tok;
483  BaseFloat link_extra_cost = next_tok->extra_cost +
484  ((tok->tot_cost + link->acoustic_cost + link->graph_cost)
485  - next_tok->tot_cost);
486  if (link_extra_cost > config_.lattice_beam) { // excise link
487  ForwardLink *next_link = link->next;
488  if (prev_link != NULL) prev_link->next = next_link;
489  else tok->links = next_link;
490  delete link;
491  link = next_link; // advance link but leave prev_link the same.
492  } else { // keep the link and update the tok_extra_cost if needed.
493  if (link_extra_cost < 0.0) { // this is just a precaution.
494  if (link_extra_cost < -0.01)
495  KALDI_WARN << "Negative extra_cost: " << link_extra_cost;
496  link_extra_cost = 0.0;
497  }
498  if (link_extra_cost < tok_extra_cost)
499  tok_extra_cost = link_extra_cost;
500  prev_link = link;
501  link = link->next;
502  }
503  }
504  // prune away tokens worse than lattice_beam above best path. This step
505  // was not necessary in the non-final case because then, this case
506  // showed up as having no forward links. Here, the tok_extra_cost has
507  // an extra component relating to the final-prob.
508  if (tok_extra_cost > config_.lattice_beam)
509  tok_extra_cost = infinity;
510  // to be pruned in PruneTokensForFrame
511 
512  if (!ApproxEqual(tok->extra_cost, tok_extra_cost, delta))
513  changed = true;
514  tok->extra_cost = tok_extra_cost; // will be +infinity or <= lattice_beam_.
515  }
516  } // while changed
517 
518  // Now put surviving Tokens in the final_costs_ hash, which is a class
519  // member (unlike tok_to_final_costs).
520  for (Token *tok = active_toks_[frame].toks; tok != NULL; tok = tok->next) {
521  if (tok->extra_cost != infinity) {
522  // If the token was not pruned away,
523  if (final_active_) {
524  BaseFloat final_cost = tok_to_final_cost[tok];
525  if (final_cost != infinity)
526  final_costs_[tok] = final_cost;
527  } else {
528  final_costs_[tok] = 0;
529  }
530  }
531  }
532  }
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
virtual Weight Final(StateId s)=0
static StateId PairToLmState(PairId state_pair)
const fst::Fst< fst::StdArc > & fst_
HashList< PairId, Token * >::Elem Elem
std::map< Token *, BaseFloat > final_costs_
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_WARN
Definition: kaldi-error.h:130
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
static StateId PairToState(PairId state_pair)
static bool ApproxEqual(float a, float b, float relative_tolerance=0.001)
return abs(a - b) <= relative_tolerance * (abs(a)+abs(b)).
Definition: kaldi-math.h:262
void PruneTokensForFrame ( int32  frame)
inlineprivate

Definition at line 538 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::Token::extra_cost, KALDI_ASSERT, KALDI_WARN, LatticeBiglmFasterDecoder::Token::next, and LatticeBiglmFasterDecoder::num_toks_.

Referenced by LatticeBiglmFasterDecoder::PruneActiveTokens(), and LatticeBiglmFasterDecoder::PruneActiveTokensFinal().

538  {
539  KALDI_ASSERT(frame >= 0 && frame < active_toks_.size());
540  Token *&toks = active_toks_[frame].toks;
541  if (toks == NULL)
542  KALDI_WARN << "No tokens alive [doing pruning]\n";
543  Token *tok, *next_tok, *prev_tok = NULL;
544  for (tok = toks; tok != NULL; tok = next_tok) {
545  next_tok = tok->next;
546  if (tok->extra_cost == std::numeric_limits<BaseFloat>::infinity()) {
547  // token is unreachable from end of graph; (no forward links survived)
548  // excise tok from list and delete tok.
549  if (prev_tok != NULL) prev_tok->next = tok->next;
550  else toks = tok->next;
551  delete tok;
552  num_toks_--;
553  } else { // fetch next Token
554  prev_tok = tok;
555  }
556  }
557  }
#define KALDI_WARN
Definition: kaldi-error.h:130
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
bool ReachedFinal ( ) const
inline

says whether a final-state was active on the last frame.

If it was not, the lattice (or traceback) will end with states that are not final-states.

Definition at line 115 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::final_active_.

Referenced by kaldi::DecodeUtterance().

void SetOptions ( const LatticeBiglmFasterDecoderConfig config)
inline

Definition at line 68 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_.

68 { config_ = config; }
LatticeBiglmFasterDecoderConfig config_

Member Data Documentation

std::vector<PairId> queue_
private
std::vector<BaseFloat> tmp_array_
private
bool warned_noarc_
private

The documentation for this class was generated from the following file: