LatticeBiglmFasterDecoder Class Reference

This is as LatticeFasterDecoder, but does online composition between HCLG and the "difference language model", which is a deterministic FST that represents the difference between the language model you want and the language model you compiled HCLG with. More...

#include <lattice-biglm-faster-decoder.h>

Collaboration diagram for LatticeBiglmFasterDecoder:

Classes

struct  ForwardLink
 
struct  Token
 
struct  TokenList
 

Public Types

typedef fst::StdArc Arc
 
typedef Arc::Label Label
 
typedef Arc::StateId StateId
 
typedef uint64 PairId
 
typedef Arc::Weight Weight
 

Public Member Functions

 LatticeBiglmFasterDecoder (const fst::Fst< fst::StdArc > &fst, const LatticeBiglmFasterDecoderConfig &config, fst::DeterministicOnDemandFst< fst::StdArc > *lm_diff_fst)
 
void SetOptions (const LatticeBiglmFasterDecoderConfig &config)
 
LatticeBiglmFasterDecoderConfig GetOptions ()
 
 ~LatticeBiglmFasterDecoder ()
 
bool Decode (DecodableInterface *decodable)
 
bool ReachedFinal () const
 says whether a final-state was active on the last frame. More...
 
bool GetBestPath (fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const
 
bool GetRawLattice (fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const
 
bool GetLattice (fst::MutableFst< CompactLatticeArc > *ofst, bool use_final_probs=true) const
 

Private Types

typedef HashList< PairId, Token * >::Elem Elem
 

Private Member Functions

PairId ConstructPair (StateId fst_state, StateId lm_state)
 
void PossiblyResizeHash (size_t num_toks)
 
ElemFindOrAddToken (PairId state_pair, int32 frame, BaseFloat tot_cost, bool emitting, bool *changed)
 
void PruneForwardLinks (int32 frame, bool *extra_costs_changed, bool *links_pruned, BaseFloat delta)
 
void PruneForwardLinksFinal (int32 frame)
 
void PruneTokensForFrame (int32 frame)
 
void PruneActiveTokens (int32 cur_frame, BaseFloat delta)
 
void PruneActiveTokensFinal (int32 cur_frame)
 
BaseFloat GetCutoff (Elem *list_head, size_t *tok_count, BaseFloat *adaptive_beam, Elem **best_elem)
 Gets the weight cutoff. Also counts the active tokens. More...
 
StateId PropagateLm (StateId lm_state, Arc *arc)
 
void ProcessEmitting (DecodableInterface *decodable, int32 frame)
 
void ProcessNonemitting (int32 frame)
 
void DeleteElems (Elem *list)
 
void ClearActiveTokens ()
 

Static Private Member Functions

static StateId PairToState (PairId state_pair)
 
static StateId PairToLmState (PairId state_pair)
 

Private Attributes

HashList< PairId, Token * > toks_
 
std::vector< TokenListactive_toks_
 
std::vector< const Elem *> queue_
 
std::vector< BaseFloattmp_array_
 
const fst::Fst< fst::StdArc > & fst_
 
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
 
LatticeBiglmFasterDecoderConfig config_
 
bool warned_noarc_
 
int32 num_toks_
 
bool warned_
 
bool final_active_
 
std::map< Token *, BaseFloatfinal_costs_
 

Detailed Description

This is as LatticeFasterDecoder, but does online composition between HCLG and the "difference language model", which is a deterministic FST that represents the difference between the language model you want and the language model you compiled HCLG with.

The class DeterministicOnDemandFst follows through the epsilons in G for you (assuming G is a standard backoff language model) and makes it look like a determinized FST.

Definition at line 48 of file lattice-biglm-faster-decoder.h.

Member Typedef Documentation

◆ Arc

typedef fst::StdArc Arc

Definition at line 50 of file lattice-biglm-faster-decoder.h.

◆ Elem

typedef HashList<PairId, Token*>::Elem Elem
private

Definition at line 300 of file lattice-biglm-faster-decoder.h.

◆ Label

typedef Arc::Label Label

Definition at line 51 of file lattice-biglm-faster-decoder.h.

◆ PairId

typedef uint64 PairId

Definition at line 54 of file lattice-biglm-faster-decoder.h.

◆ StateId

typedef Arc::StateId StateId

Definition at line 52 of file lattice-biglm-faster-decoder.h.

◆ Weight

typedef Arc::Weight Weight

Definition at line 55 of file lattice-biglm-faster-decoder.h.

Constructor & Destructor Documentation

◆ LatticeBiglmFasterDecoder()

LatticeBiglmFasterDecoder ( const fst::Fst< fst::StdArc > &  fst,
const LatticeBiglmFasterDecoderConfig config,
fst::DeterministicOnDemandFst< fst::StdArc > *  lm_diff_fst 
)
inline

Definition at line 57 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::Check(), KALDI_ASSERT, DeterministicOnDemandFst< Arc >::Start(), and LatticeBiglmFasterDecoder::toks_.

60  :
61  fst_(fst), lm_diff_fst_(lm_diff_fst), config_(config),
62  warned_noarc_(false), num_toks_(0) {
63  config.Check();
64  KALDI_ASSERT(fst.Start() != fst::kNoStateId &&
65  lm_diff_fst->Start() != fst::kNoStateId);
66  toks_.SetSize(1000); // just so on the first frame we do something reasonable.
67  }
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
For an extended explanation of the framework of which grammar-fsts are a part, please see Support for...
Definition: graph.dox:21
const fst::Fst< fst::StdArc > & fst_
virtual StateId Start()=0
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185

◆ ~LatticeBiglmFasterDecoder()

Member Function Documentation

◆ ClearActiveTokens()

void ClearActiveTokens ( )
inlineprivate

Definition at line 867 of file lattice-biglm-faster-decoder.h.

References rnnlm::i, KALDI_ASSERT, LatticeBiglmFasterDecoder::Token::next, and LatticeBiglmFasterDecoder::ForwardLink::next_tok.

Referenced by LatticeBiglmFasterDecoder::Decode(), and LatticeBiglmFasterDecoder::~LatticeBiglmFasterDecoder().

867  { // a cleanup routine, at utt end/begin
868  for (size_t i = 0; i < active_toks_.size(); i++) {
869  // Delete all tokens alive on this frame, and any forward
870  // links they may have.
871  for (Token *tok = active_toks_[i].toks; tok != NULL; ) {
872  tok->DeleteForwardLinks();
873  Token *next_tok = tok->next;
874  delete tok;
875  num_toks_--;
876  tok = next_tok;
877  }
878  }
879  active_toks_.clear();
880  KALDI_ASSERT(num_toks_ == 0);
881  }
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185

◆ ConstructPair()

PairId ConstructPair ( StateId  fst_state,
StateId  lm_state 
)
inlineprivate

Definition at line 229 of file lattice-biglm-faster-decoder.h.

Referenced by LatticeBiglmFasterDecoder::Decode(), LatticeBiglmFasterDecoder::ProcessEmitting(), and LatticeBiglmFasterDecoder::ProcessNonemitting().

229  {
230  return static_cast<PairId>(fst_state) + (static_cast<PairId>(lm_state) << 32);
231  }

◆ Decode()

bool Decode ( DecodableInterface decodable)
inline

Definition at line 77 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::ClearActiveTokens(), LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::ConstructPair(), LatticeBiglmFasterDecoder::DeleteElems(), LatticeBiglmFasterDecoder::final_active_, LatticeBiglmFasterDecoder::final_costs_, LatticeBiglmFasterDecoder::fst_, DecodableInterface::IsLastFrame(), LatticeFasterDecoderConfig::lattice_beam, LatticeBiglmFasterDecoder::lm_diff_fst_, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::ProcessEmitting(), LatticeBiglmFasterDecoder::ProcessNonemitting(), LatticeFasterDecoderConfig::prune_interval, LatticeBiglmFasterDecoder::PruneActiveTokens(), LatticeBiglmFasterDecoder::PruneActiveTokensFinal(), DeterministicOnDemandFst< Arc >::Start(), LatticeBiglmFasterDecoder::toks_, and LatticeBiglmFasterDecoder::warned_.

Referenced by kaldi::DecodeUtterance().

77  {
78  // clean up from last time:
79  DeleteElems(toks_.Clear());
81  warned_ = false;
82  final_active_ = false;
83  final_costs_.clear();
84  num_toks_ = 0;
85  PairId start_pair = ConstructPair(fst_.Start(), lm_diff_fst_->Start());
86  active_toks_.resize(1);
87  Token *start_tok = new Token(0.0, 0.0, NULL, NULL);
88  active_toks_[0].toks = start_tok;
89  toks_.Insert(start_pair, start_tok);
90  num_toks_++;
92 
93  // We use 1-based indexing for frames in this decoder (if you view it in
94  // terms of features), but note that the decodable object uses zero-based
95  // numbering, which we have to correct for when we call it.
96  for (int32 frame = 1; !decodable->IsLastFrame(frame-2); frame++) {
97  active_toks_.resize(frame+1); // new column
98 
99  ProcessEmitting(decodable, frame);
100 
101  ProcessNonemitting(frame);
102 
103  if (decodable->IsLastFrame(frame-1))
104  PruneActiveTokensFinal(frame);
105  else if (frame % config_.prune_interval == 0)
106  PruneActiveTokens(frame, config_.lattice_beam * 0.1); // use larger delta.
107  }
108  // Returns true if we have any kind of traceback available (not necessarily
109  // to the end state; query ReachedFinal() for that).
110  return !final_costs_.empty();
111  }
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
PairId ConstructPair(StateId fst_state, StateId lm_state)
const fst::Fst< fst::StdArc > & fst_
kaldi::int32 int32
std::map< Token *, BaseFloat > final_costs_
virtual StateId Start()=0
LatticeBiglmFasterDecoderConfig config_
void PruneActiveTokens(int32 cur_frame, BaseFloat delta)
void ProcessEmitting(DecodableInterface *decodable, int32 frame)

◆ DeleteElems()

void DeleteElems ( Elem list)
inlineprivate

Definition at line 859 of file lattice-biglm-faster-decoder.h.

References HashList< I, T >::Clear(), and HashList< I, T >::Delete().

Referenced by LatticeBiglmFasterDecoder::Decode(), and LatticeBiglmFasterDecoder::~LatticeBiglmFasterDecoder().

859  {
860  for (Elem *e = list, *e_tail; e != NULL; e = e_tail) {
861  e_tail = e->tail;
862  toks_.Delete(e);
863  }
864  toks_.Clear();
865  }
HashList< PairId, Token * >::Elem Elem

◆ FindOrAddToken()

Elem* FindOrAddToken ( PairId  state_pair,
int32  frame,
BaseFloat  tot_cost,
bool  emitting,
bool changed 
)
inlineprivate

Definition at line 315 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, KALDI_ASSERT, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::toks_, LatticeBiglmFasterDecoder::Token::tot_cost, and HashList< I, T >::Elem::val.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting(), and LatticeBiglmFasterDecoder::ProcessNonemitting().

316  {
317  // Returns the Token pointer. Sets "changed" (if non-NULL) to true
318  // if the token was newly created or the cost changed.
319  KALDI_ASSERT(frame < active_toks_.size());
320  Token *&toks = active_toks_[frame].toks;
321  Elem *e_found = toks_.Insert(state_pair, NULL);
322  if (e_found->val == NULL) { // no such token presently.
323  const BaseFloat extra_cost = 0.0;
324  // tokens on the currently final frame have zero extra_cost
325  // as any of them could end up
326  // on the winning path.
327  Token *new_tok = new Token (tot_cost, extra_cost, NULL, toks);
328  // NULL: no forward links yet
329  toks = new_tok;
330  num_toks_++;
331  e_found->val = new_tok;
332  if (changed) *changed = true;
333  return e_found;
334  } else {
335  Token *tok = e_found->val; // There is an existing Token for this state.
336  if (tok->tot_cost > tot_cost) { // replace old token
337  tok->tot_cost = tot_cost;
338  // we don't allocate a new token, the old stays linked in active_toks_
339  // we only replace the tot_cost
340  // in the current frame, there are no forward links (and no extra_cost)
341  // only in ProcessNonemitting we have to delete forward links
342  // in case we visit a state for the second time
343  // those forward links, that lead to this replaced token before:
344  // they remain and will hopefully be pruned later (PruneForwardLinks...)
345  if (changed) *changed = true;
346  } else {
347  if (changed) *changed = false;
348  }
349  return e_found;
350  }
351  }
HashList< PairId, Token * >::Elem Elem
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185

◆ GetBestPath()

bool GetBestPath ( fst::MutableFst< LatticeArc > *  ofst,
bool  use_final_probs = true 
) const
inline

Definition at line 120 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::GetRawLattice().

Referenced by kaldi::DecodeUtterance().

121  {
122  fst::VectorFst<LatticeArc> fst;
123  if (!GetRawLattice(&fst, use_final_probs)) return false;
124  // std::cout << "Raw lattice is:\n";
125  // fst::FstPrinter<LatticeArc> fstprinter(fst, NULL, NULL, NULL, false, true);
126  // fstprinter.Print(&std::cout, "standard output");
127  ShortestPath(fst, ofst);
128  return true;
129  }
For an extended explanation of the framework of which grammar-fsts are a part, please see Support for...
Definition: graph.dox:21
bool GetRawLattice(fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const

◆ GetCutoff()

BaseFloat GetCutoff ( Elem list_head,
size_t tok_count,
BaseFloat adaptive_beam,
Elem **  best_elem 
)
inlineprivate

Gets the weight cutoff. Also counts the active tokens.

Definition at line 616 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::beam, LatticeFasterDecoderConfig::beam_delta, LatticeBiglmFasterDecoder::config_, count, LatticeFasterDecoderConfig::max_active, HashList< I, T >::Elem::tail, and LatticeBiglmFasterDecoder::tmp_array_.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting().

617  {
618  BaseFloat best_weight = std::numeric_limits<BaseFloat>::infinity();
619  // positive == high cost == bad.
620  size_t count = 0;
621  if (config_.max_active == std::numeric_limits<int32>::max()) {
622  for (Elem *e = list_head; e != NULL; e = e->tail, count++) {
623  BaseFloat w = static_cast<BaseFloat>(e->val->tot_cost);
624  if (w < best_weight) {
625  best_weight = w;
626  if (best_elem) *best_elem = e;
627  }
628  }
629  if (tok_count != NULL) *tok_count = count;
630  if (adaptive_beam != NULL) *adaptive_beam = config_.beam;
631  return best_weight + config_.beam;
632  } else {
633  tmp_array_.clear();
634  for (Elem *e = list_head; e != NULL; e = e->tail, count++) {
635  BaseFloat w = e->val->tot_cost;
636  tmp_array_.push_back(w);
637  if (w < best_weight) {
638  best_weight = w;
639  if (best_elem) *best_elem = e;
640  }
641  }
642  if (tok_count != NULL) *tok_count = count;
643  if (tmp_array_.size() <= static_cast<size_t>(config_.max_active)) {
644  if (adaptive_beam) *adaptive_beam = config_.beam;
645  return best_weight + config_.beam;
646  } else {
647  // the lowest elements (lowest costs, highest likes)
648  // will be put in the left part of tmp_array.
649  std::nth_element(tmp_array_.begin(),
650  tmp_array_.begin()+config_.max_active,
651  tmp_array_.end());
652  // return the tighter of the two beams.
653  BaseFloat ans = std::min(best_weight + config_.beam,
654  *(tmp_array_.begin()+config_.max_active));
655  if (adaptive_beam)
656  *adaptive_beam = std::min(config_.beam,
657  ans - best_weight + config_.beam_delta);
658  return ans;
659  }
660  }
661  }
HashList< PairId, Token * >::Elem Elem
const size_t count
float BaseFloat
Definition: kaldi-types.h:29
LatticeBiglmFasterDecoderConfig config_

◆ GetLattice()

bool GetLattice ( fst::MutableFst< CompactLatticeArc > *  ofst,
bool  use_final_probs = true 
) const
inline

Definition at line 204 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_, LatticeFasterDecoderConfig::det_opts, fst::DeterminizeLatticePruned(), LatticeBiglmFasterDecoder::GetRawLattice(), KALDI_WARN, LatticeFasterDecoderConfig::lattice_beam, DeterminizeLatticePrunedOptions::max_mem, and DeterminizeLatticePhonePrunedOptions::max_mem.

205  {
206  Lattice raw_fst;
207  if (!GetRawLattice(&raw_fst, use_final_probs)) return false;
208  Invert(&raw_fst); // make it so word labels are on the input.
209  if (!TopSort(&raw_fst)) // topological sort makes lattice-determinization more efficient
210  KALDI_WARN << "Topological sorting of state-level lattice failed "
211  "(probably your lexicon has empty words or your LM has epsilon cycles; this "
212  " is a bad idea.)";
213  // (in phase where we get backward-costs).
214  fst::ILabelCompare<LatticeArc> ilabel_comp;
215  ArcSort(&raw_fst, ilabel_comp); // sort on ilabel; makes
216  // lattice-determinization more efficient.
217 
219  lat_opts.max_mem = config_.det_opts.max_mem;
220 
221  DeterminizeLatticePruned(raw_fst, config_.lattice_beam, ofst, lat_opts);
222  raw_fst.DeleteStates(); // Free memory-- raw_fst no longer needed.
223  Connect(ofst); // Remove unreachable states... there might be
224  // a small number of these, in some cases.
225  return true;
226  }
bool DeterminizeLatticePruned(const ExpandedFst< ArcTpl< Weight > > &ifst, double beam, MutableFst< ArcTpl< CompactLatticeWeightTpl< Weight, IntType > > > *ofst, DeterminizeLatticePrunedOptions opts)
fst::VectorFst< LatticeArc > Lattice
Definition: kaldi-lattice.h:44
#define KALDI_WARN
Definition: kaldi-error.h:150
bool GetRawLattice(fst::MutableFst< LatticeArc > *ofst, bool use_final_probs=true) const
LatticeBiglmFasterDecoderConfig config_
fst::DeterminizeLatticePhonePrunedOptions det_opts

◆ GetOptions()

LatticeBiglmFasterDecoderConfig GetOptions ( )
inline

Definition at line 69 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_.

Referenced by kaldi::DecodeUtterance().

69 { return config_; }
LatticeBiglmFasterDecoderConfig config_

◆ GetRawLattice()

bool GetRawLattice ( fst::MutableFst< LatticeArc > *  ofst,
bool  use_final_probs = true 
) const
inline

Definition at line 133 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::final_costs_, KALDI_ASSERT, KALDI_VLOG, KALDI_WARN, LatticeBiglmFasterDecoder::ForwardLink::next, LatticeBiglmFasterDecoder::num_toks_, and LatticeWeightTpl< BaseFloat >::One().

Referenced by kaldi::DecodeUtterance(), LatticeBiglmFasterDecoder::GetBestPath(), and LatticeBiglmFasterDecoder::GetLattice().

134  {
135  typedef LatticeArc Arc;
136  typedef Arc::StateId StateId;
137  // A PairId will be constructed as: (StateId in fst) + (StateId in lm_diff_fst) << 32;
138  typedef uint64 PairId;
139  typedef Arc::Weight Weight;
140  typedef Arc::Label Label;
141  ofst->DeleteStates();
142  // num-frames plus one (since frames are one-based, and we have
143  // an extra frame for the start-state).
144  int32 num_frames = active_toks_.size() - 1;
145  KALDI_ASSERT(num_frames > 0);
146  unordered_map<Token*, StateId> tok_map(num_toks_/2 + 3); // bucket count
147  // First create all states.
148  for (int32 f = 0; f <= num_frames; f++) {
149  if (active_toks_[f].toks == NULL) {
150  KALDI_WARN << "GetRawLattice: no tokens active on frame " << f
151  << ": not producing lattice.\n";
152  return false;
153  }
154  for (Token *tok = active_toks_[f].toks; tok != NULL; tok = tok->next)
155  tok_map[tok] = ofst->AddState();
156  // The next statement sets the start state of the output FST.
157  // Because we always add new states to the head of the list
158  // active_toks_[f].toks, and the start state was the first one
159  // added, it will be the last one added to ofst.
160  if (f == 0 && ofst->NumStates() > 0)
161  ofst->SetStart(ofst->NumStates()-1);
162  }
163  KALDI_VLOG(3) << "init:" << num_toks_/2 + 3 << " buckets:"
164  << tok_map.bucket_count() << " load:" << tok_map.load_factor()
165  << " max:" << tok_map.max_load_factor();
166  // Now create all arcs.
167  StateId cur_state = 0; // we rely on the fact that we numbered these
168  // consecutively (AddState() returns the numbers in order..)
169  for (int32 f = 0; f <= num_frames; f++) {
170  for (Token *tok = active_toks_[f].toks; tok != NULL; tok = tok->next,
171  cur_state++) {
172  for (ForwardLink *l = tok->links;
173  l != NULL;
174  l = l->next) {
175  unordered_map<Token*, StateId>::const_iterator iter =
176  tok_map.find(l->next_tok);
177  StateId nextstate = iter->second;
178  KALDI_ASSERT(iter != tok_map.end());
179  Arc arc(l->ilabel, l->olabel,
180  Weight(l->graph_cost, l->acoustic_cost),
181  nextstate);
182  ofst->AddArc(cur_state, arc);
183  }
184  if (f == num_frames) {
185  if (use_final_probs && !final_costs_.empty()) {
186  std::map<Token*, BaseFloat>::const_iterator iter =
187  final_costs_.find(tok);
188  if (iter != final_costs_.end())
189  ofst->SetFinal(cur_state, LatticeWeight(iter->second, 0));
190  } else {
191  ofst->SetFinal(cur_state, LatticeWeight::One());
192  }
193  }
194  }
195  }
196  KALDI_ASSERT(cur_state == ofst->NumStates());
197  return (cur_state != 0);
198  }
fst::StdArc::StateId StateId
fst::ArcTpl< LatticeWeight > LatticeArc
Definition: kaldi-lattice.h:40
static const LatticeWeightTpl One()
kaldi::int32 int32
std::map< Token *, BaseFloat > final_costs_
fst::LatticeWeightTpl< BaseFloat > LatticeWeight
Definition: kaldi-lattice.h:32
#define KALDI_WARN
Definition: kaldi-error.h:150
fst::StdArc::Label Label
fst::StdArc::Weight Weight
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185
#define KALDI_VLOG(v)
Definition: kaldi-error.h:156

◆ PairToLmState()

static StateId PairToLmState ( PairId  state_pair)
inlinestaticprivate

◆ PairToState()

static StateId PairToState ( PairId  state_pair)
inlinestaticprivate

◆ PossiblyResizeHash()

void PossiblyResizeHash ( size_t  num_toks)
inlineprivate

Definition at line 302 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_, LatticeFasterDecoderConfig::hash_ratio, and LatticeBiglmFasterDecoder::toks_.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting().

302  {
303  size_t new_sz = static_cast<size_t>(static_cast<BaseFloat>(num_toks)
304  * config_.hash_ratio);
305  if (new_sz > toks_.Size()) {
306  toks_.SetSize(new_sz);
307  }
308  }
float BaseFloat
Definition: kaldi-types.h:29
LatticeBiglmFasterDecoderConfig config_

◆ ProcessEmitting()

void ProcessEmitting ( DecodableInterface decodable,
int32  frame 
)
inlineprivate

Definition at line 687 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::beam, LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::ConstructPair(), LatticeBiglmFasterDecoder::FindOrAddToken(), LatticeBiglmFasterDecoder::ForwardLink::ForwardLink(), LatticeBiglmFasterDecoder::fst_, LatticeBiglmFasterDecoder::GetCutoff(), LatticeBiglmFasterDecoder::ForwardLink::graph_cost, HashList< I, T >::Elem::key, LatticeBiglmFasterDecoder::Token::links, DecodableInterface::LogLikelihood(), LatticeBiglmFasterDecoder::PairToLmState(), LatticeBiglmFasterDecoder::PairToState(), LatticeBiglmFasterDecoder::PossiblyResizeHash(), LatticeBiglmFasterDecoder::PropagateLm(), fst::Times(), LatticeBiglmFasterDecoder::toks_, LatticeBiglmFasterDecoder::Token::tot_cost, and HashList< I, T >::Elem::val.

Referenced by LatticeBiglmFasterDecoder::Decode().

687  {
688  // Processes emitting arcs for one frame. Propagates from prev_toks_ to cur_toks_.
689  Elem *last_toks = toks_.Clear(); // swapping prev_toks_ / cur_toks_
690  Elem *best_elem = NULL;
691  BaseFloat adaptive_beam;
692  size_t tok_cnt;
693  BaseFloat cur_cutoff = GetCutoff(last_toks, &tok_cnt, &adaptive_beam, &best_elem);
694  PossiblyResizeHash(tok_cnt); // This makes sure the hash is always big enough.
695 
696  BaseFloat next_cutoff = std::numeric_limits<BaseFloat>::infinity();
697  // pruning "online" before having seen all tokens
698 
699  // First process the best token to get a hopefully
700  // reasonably tight bound on the next cutoff.
701  if (best_elem) {
702  PairId state_pair = best_elem->key;
703  StateId state = PairToState(state_pair), // state in "fst"
704  lm_state = PairToLmState(state_pair);
705  Token *tok = best_elem->val;
706  for (fst::ArcIterator<fst::Fst<Arc> > aiter(fst_, state);
707  !aiter.Done();
708  aiter.Next()) {
709  Arc arc = aiter.Value();
710  if (arc.ilabel != 0) { // propagate..
711  PropagateLm(lm_state, &arc); // may affect "arc.weight".
712  // We don't need the return value (the new LM state).
713  arc.weight = Times(arc.weight,
714  Weight(-decodable->LogLikelihood(frame-1, arc.ilabel)));
715  BaseFloat new_weight = arc.weight.Value() + tok->tot_cost;
716  if (new_weight + adaptive_beam < next_cutoff)
717  next_cutoff = new_weight + adaptive_beam;
718  }
719  }
720  }
721 
722  // the tokens are now owned here, in last_toks, and the hash is empty.
723  // 'owned' is a complex thing here; the point is we need to call DeleteElem
724  // on each elem 'e' to let toks_ know we're done with them.
725  for (Elem *e = last_toks, *e_tail; e != NULL; e = e_tail) {
726  // loop this way because we delete "e" as we go.
727  PairId state_pair = e->key;
728  StateId state = PairToState(state_pair),
729  lm_state = PairToLmState(state_pair);
730  Token *tok = e->val;
731  if (tok->tot_cost <= cur_cutoff) {
732  for (fst::ArcIterator<fst::Fst<Arc> > aiter(fst_, state);
733  !aiter.Done();
734  aiter.Next()) {
735  const Arc &arc_ref = aiter.Value();
736  if (arc_ref.ilabel != 0) { // propagate..
737  Arc arc(arc_ref);
738  StateId next_lm_state = PropagateLm(lm_state, &arc);
739  BaseFloat ac_cost = -decodable->LogLikelihood(frame-1, arc.ilabel),
740  graph_cost = arc.weight.Value(),
741  cur_cost = tok->tot_cost,
742  tot_cost = cur_cost + ac_cost + graph_cost;
743  if (tot_cost >= next_cutoff) continue;
744  else if (tot_cost + config_.beam < next_cutoff)
745  next_cutoff = tot_cost + config_.beam; // prune by best current token
746  PairId next_pair = ConstructPair(arc.nextstate, next_lm_state);
747  Elem *e_next = FindOrAddToken(next_pair, frame, tot_cost, true, NULL);
748  // true: emitting, NULL: no change indicator needed
749 
750  // Add ForwardLink from tok to next_tok (put on head of list tok->links)
751  tok->links = new ForwardLink(e_next->val, arc.ilabel, arc.olabel,
752  graph_cost, ac_cost, tok->links);
753  }
754  } // for all arcs
755  }
756  e_tail = e->tail;
757  toks_.Delete(e); // delete Elem
758  }
759  }
StateId PropagateLm(StateId lm_state, Arc *arc)
PairId ConstructPair(StateId fst_state, StateId lm_state)
static StateId PairToLmState(PairId state_pair)
const fst::Fst< fst::StdArc > & fst_
HashList< PairId, Token * >::Elem Elem
LatticeWeightTpl< FloatType > Times(const LatticeWeightTpl< FloatType > &w1, const LatticeWeightTpl< FloatType > &w2)
float BaseFloat
Definition: kaldi-types.h:29
Elem * FindOrAddToken(PairId state_pair, int32 frame, BaseFloat tot_cost, bool emitting, bool *changed)
LatticeBiglmFasterDecoderConfig config_
BaseFloat GetCutoff(Elem *list_head, size_t *tok_count, BaseFloat *adaptive_beam, Elem **best_elem)
Gets the weight cutoff. Also counts the active tokens.
static StateId PairToState(PairId state_pair)

◆ ProcessNonemitting()

void ProcessNonemitting ( int32  frame)
inlineprivate

Definition at line 761 of file lattice-biglm-faster-decoder.h.

References LatticeFasterDecoderConfig::beam, LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::ConstructPair(), LatticeBiglmFasterDecoder::Token::DeleteForwardLinks(), LatticeBiglmFasterDecoder::FindOrAddToken(), LatticeBiglmFasterDecoder::ForwardLink::ForwardLink(), LatticeBiglmFasterDecoder::fst_, LatticeBiglmFasterDecoder::ForwardLink::graph_cost, KALDI_ASSERT, KALDI_ERR, HashList< I, T >::Elem::key, LatticeBiglmFasterDecoder::Token::links, LatticeBiglmFasterDecoder::PairToLmState(), LatticeBiglmFasterDecoder::PairToState(), LatticeBiglmFasterDecoder::PropagateLm(), LatticeBiglmFasterDecoder::queue_, LatticeBiglmFasterDecoder::toks_, LatticeBiglmFasterDecoder::Token::tot_cost, HashList< I, T >::Elem::val, and LatticeBiglmFasterDecoder::warned_.

Referenced by LatticeBiglmFasterDecoder::Decode().

761  {
762  // note: "frame" is the same as emitting states just processed.
763 
764  // Processes nonemitting arcs for one frame. Propagates within toks_.
765  // Note-- this queue structure is is not very optimal as
766  // it may cause us to process states unnecessarily (e.g. more than once),
767  // but in the baseline code, turning this vector into a set to fix this
768  // problem did not improve overall speed.
769 
770  KALDI_ASSERT(queue_.empty());
771  BaseFloat best_cost = std::numeric_limits<BaseFloat>::infinity();
772  for (const Elem *e = toks_.GetList(); e != NULL; e = e->tail) {
773  queue_.push_back(e);
774  // for pruning with current best token
775  best_cost = std::min(best_cost, static_cast<BaseFloat>(e->val->tot_cost));
776  }
777  if (queue_.empty()) {
778  if (!warned_) {
779  KALDI_ERR << "Error in ProcessEmitting: no surviving tokens: frame is "
780  << frame;
781  warned_ = true;
782  }
783  }
784  BaseFloat cutoff = best_cost + config_.beam;
785 
786  while (!queue_.empty()) {
787  const Elem *e = queue_.back();
788  queue_.pop_back();
789 
790  PairId state_pair = e->key;
791  Token *tok = e->val; // would segfault if state not in
792  // toks_ but this can't happen.
793  BaseFloat cur_cost = tok->tot_cost;
794  if (cur_cost >= cutoff) // Don't bother processing successors.
795  continue;
796  StateId state = PairToState(state_pair),
797  lm_state = PairToLmState(state_pair);
798  // If "tok" has any existing forward links, delete them,
799  // because we're about to regenerate them. This is a kind
800  // of non-optimality (remember, this is the simple decoder),
801  // but since most states are emitting it's not a huge issue.
802  tok->DeleteForwardLinks(); // necessary when re-visiting
803  tok->links = NULL;
804  for (fst::ArcIterator<fst::Fst<Arc> > aiter(fst_, state);
805  !aiter.Done();
806  aiter.Next()) {
807  const Arc &arc_ref = aiter.Value();
808  if (arc_ref.ilabel == 0) { // propagate nonemitting only...
809  Arc arc(arc_ref);
810  StateId next_lm_state = PropagateLm(lm_state, &arc);
811  BaseFloat graph_cost = arc.weight.Value(),
812  tot_cost = cur_cost + graph_cost;
813  if (tot_cost < cutoff) {
814  bool changed;
815  PairId next_pair = ConstructPair(arc.nextstate, next_lm_state);
816  Elem *e_new = FindOrAddToken(next_pair, frame, tot_cost,
817  false, &changed); // false: non-emit
818 
819  tok->links = new ForwardLink(e_new->val, 0, arc.olabel,
820  graph_cost, 0, tok->links);
821 
822  // "changed" tells us whether the new token has a different
823  // cost from before, or is new [if so, add into queue].
824  if (changed) queue_.push_back(e_new);
825  }
826  }
827  } // for all arcs
828  } // while queue not empty
829  }
StateId PropagateLm(StateId lm_state, Arc *arc)
PairId ConstructPair(StateId fst_state, StateId lm_state)
static StateId PairToLmState(PairId state_pair)
const fst::Fst< fst::StdArc > & fst_
HashList< PairId, Token * >::Elem Elem
float BaseFloat
Definition: kaldi-types.h:29
Elem * FindOrAddToken(PairId state_pair, int32 frame, BaseFloat tot_cost, bool emitting, bool *changed)
#define KALDI_ERR
Definition: kaldi-error.h:147
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185
static StateId PairToState(PairId state_pair)

◆ PropagateLm()

StateId PropagateLm ( StateId  lm_state,
Arc arc 
)
inlineprivate

Definition at line 663 of file lattice-biglm-faster-decoder.h.

References DeterministicOnDemandFst< Arc >::GetArc(), KALDI_WARN, LatticeBiglmFasterDecoder::lm_diff_fst_, fst::Times(), and LatticeBiglmFasterDecoder::warned_noarc_.

Referenced by LatticeBiglmFasterDecoder::ProcessEmitting(), and LatticeBiglmFasterDecoder::ProcessNonemitting().

664  { // returns new LM state.
665  if (arc->olabel == 0) {
666  return lm_state; // no change in LM state if no word crossed.
667  } else { // Propagate in the LM-diff FST.
668  Arc lm_arc;
669  bool ans = lm_diff_fst_->GetArc(lm_state, arc->olabel, &lm_arc);
670  if (!ans) { // this case is unexpected for statistical LMs.
671  if (!warned_noarc_) {
672  warned_noarc_ = true;
673  KALDI_WARN << "No arc available in LM (unlikely to be correct "
674  "if a statistical language model); will not warn again";
675  }
676  arc->weight = Weight::Zero();
677  return lm_state; // doesn't really matter what we return here; will
678  // be pruned.
679  } else {
680  arc->weight = Times(arc->weight, lm_arc.weight);
681  arc->olabel = lm_arc.olabel; // probably will be the same.
682  return lm_arc.nextstate; // return the new LM state.
683  }
684  }
685  }
virtual bool GetArc(StateId s, Label ilabel, Arc *oarc)=0
Note: ilabel must not be epsilon.
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
LatticeWeightTpl< FloatType > Times(const LatticeWeightTpl< FloatType > &w1, const LatticeWeightTpl< FloatType > &w2)
#define KALDI_WARN
Definition: kaldi-error.h:150

◆ PruneActiveTokens()

void PruneActiveTokens ( int32  cur_frame,
BaseFloat  delta 
)
inlineprivate

Definition at line 566 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, KALDI_VLOG, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::PruneForwardLinks(), and LatticeBiglmFasterDecoder::PruneTokensForFrame().

Referenced by LatticeBiglmFasterDecoder::Decode().

566  {
567  int32 num_toks_begin = num_toks_;
568  for (int32 frame = cur_frame-1; frame >= 0; frame--) {
569  // Reason why we need to prune forward links in this situation:
570  // (1) we have never pruned them (new TokenList)
571  // (2) we have not yet pruned the forward links to the next frame,
572  // after any of those tokens have changed their extra_cost.
573  if (active_toks_[frame].must_prune_forward_links) {
574  bool extra_costs_changed = false, links_pruned = false;
575  PruneForwardLinks(frame, &extra_costs_changed, &links_pruned, delta);
576  if (extra_costs_changed && frame > 0) // any token has changed extra_cost
577  active_toks_[frame-1].must_prune_forward_links = true;
578  if (links_pruned) // any link was pruned
579  active_toks_[frame].must_prune_tokens = true;
580  active_toks_[frame].must_prune_forward_links = false; // job done
581  }
582  if (frame+1 < cur_frame && // except for last frame (no forward links)
583  active_toks_[frame+1].must_prune_tokens) {
584  PruneTokensForFrame(frame+1);
585  active_toks_[frame+1].must_prune_tokens = false;
586  }
587  }
588  KALDI_VLOG(3) << "PruneActiveTokens: pruned tokens from " << num_toks_begin
589  << " to " << num_toks_;
590  }
void PruneForwardLinks(int32 frame, bool *extra_costs_changed, bool *links_pruned, BaseFloat delta)
kaldi::int32 int32
#define KALDI_VLOG(v)
Definition: kaldi-error.h:156

◆ PruneActiveTokensFinal()

void PruneActiveTokensFinal ( int32  cur_frame)
inlineprivate

Definition at line 594 of file lattice-biglm-faster-decoder.h.

References KALDI_VLOG, LatticeBiglmFasterDecoder::num_toks_, LatticeBiglmFasterDecoder::PruneForwardLinks(), LatticeBiglmFasterDecoder::PruneForwardLinksFinal(), and LatticeBiglmFasterDecoder::PruneTokensForFrame().

Referenced by LatticeBiglmFasterDecoder::Decode().

594  {
595  // returns true if there were final states active
596  // else returns false and treats all states as final while doing the pruning
597  // (this can be useful if you want partial lattice output,
598  // although it can be dangerous, depending what you want the lattices for).
599  // final_active_ and final_probs_ (a hash) are set internally
600  // by PruneForwardLinksFinal
601  int32 num_toks_begin = num_toks_;
602  PruneForwardLinksFinal(cur_frame); // prune final frame (with final-probs)
603  // sets final_active_ and final_probs_
604  for (int32 frame = cur_frame-1; frame >= 0; frame--) {
605  bool b1, b2; // values not used.
606  BaseFloat dontcare = 0.0; // delta of zero means we must always update
607  PruneForwardLinks(frame, &b1, &b2, dontcare);
608  PruneTokensForFrame(frame+1);
609  }
611  KALDI_VLOG(3) << "PruneActiveTokensFinal: pruned tokens from " << num_toks_begin
612  << " to " << num_toks_;
613  }
void PruneForwardLinks(int32 frame, bool *extra_costs_changed, bool *links_pruned, BaseFloat delta)
kaldi::int32 int32
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_VLOG(v)
Definition: kaldi-error.h:156

◆ PruneForwardLinks()

void PruneForwardLinks ( int32  frame,
bool extra_costs_changed,
bool links_pruned,
BaseFloat  delta 
)
inlineprivate

Definition at line 356 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::ForwardLink::acoustic_cost, LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::Token::extra_cost, LatticeBiglmFasterDecoder::ForwardLink::graph_cost, KALDI_ASSERT, KALDI_WARN, LatticeFasterDecoderConfig::lattice_beam, LatticeBiglmFasterDecoder::ForwardLink::next, LatticeBiglmFasterDecoder::ForwardLink::next_tok, LatticeBiglmFasterDecoder::Token::tot_cost, and LatticeBiglmFasterDecoder::warned_.

Referenced by LatticeBiglmFasterDecoder::PruneActiveTokens(), and LatticeBiglmFasterDecoder::PruneActiveTokensFinal().

358  {
359  // delta is the amount by which the extra_costs must change
360  // If delta is larger, we'll tend to go back less far
361  // toward the beginning of the file.
362  // extra_costs_changed is set to true if extra_cost was changed for any token
363  // links_pruned is set to true if any link in any token was pruned
364 
365  *extra_costs_changed = false;
366  *links_pruned = false;
367  KALDI_ASSERT(frame >= 0 && frame < active_toks_.size());
368  if (active_toks_[frame].toks == NULL ) { // empty list; should not happen.
369  if (!warned_) {
370  KALDI_WARN << "No tokens alive [doing pruning].. warning first "
371  "time only for each utterance\n";
372  warned_ = true;
373  }
374  }
375 
376  // We have to iterate until there is no more change, because the links
377  // are not guaranteed to be in topological order.
378  bool changed = true; // difference new minus old extra cost >= delta ?
379  while (changed) {
380  changed = false;
381  for (Token *tok = active_toks_[frame].toks; tok != NULL; tok = tok->next) {
382  ForwardLink *link, *prev_link=NULL;
383  // will recompute tok_extra_cost for tok.
384  BaseFloat tok_extra_cost = std::numeric_limits<BaseFloat>::infinity();
385  // tok_extra_cost is the best (min) of link_extra_cost of outgoing links
386  for (link = tok->links; link != NULL; ) {
387  // See if we need to excise this link...
388  Token *next_tok = link->next_tok;
389  BaseFloat link_extra_cost = next_tok->extra_cost +
390  ((tok->tot_cost + link->acoustic_cost + link->graph_cost)
391  - next_tok->tot_cost); // difference in brackets is >= 0
392  // link_exta_cost is the difference in score between the best paths
393  // through link source state and through link destination state
394  KALDI_ASSERT(link_extra_cost == link_extra_cost); // check for NaN
395  if (link_extra_cost > config_.lattice_beam) { // excise link
396  ForwardLink *next_link = link->next;
397  if (prev_link != NULL) prev_link->next = next_link;
398  else tok->links = next_link;
399  delete link;
400  link = next_link; // advance link but leave prev_link the same.
401  *links_pruned = true;
402  } else { // keep the link and update the tok_extra_cost if needed.
403  if (link_extra_cost < 0.0) { // this is just a precaution.
404  if (link_extra_cost < -0.01)
405  KALDI_WARN << "Negative extra_cost: " << link_extra_cost;
406  link_extra_cost = 0.0;
407  }
408  if (link_extra_cost < tok_extra_cost)
409  tok_extra_cost = link_extra_cost;
410  prev_link = link; // move to next link
411  link = link->next;
412  }
413  } // for all outgoing links
414  if (fabs(tok_extra_cost - tok->extra_cost) > delta)
415  changed = true; // difference new minus old is bigger than delta
416  tok->extra_cost = tok_extra_cost;
417  // will be +infinity or <= lattice_beam_.
418  // infinity indicates, that no forward link survived pruning
419  } // for all Token on active_toks_[frame]
420  if (changed) *extra_costs_changed = true;
421 
422  // Note: it's theoretically possible that aggressive compiler
423  // optimizations could cause an infinite loop here for small delta and
424  // high-dynamic-range scores.
425  } // while changed
426  }
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_WARN
Definition: kaldi-error.h:150
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185

◆ PruneForwardLinksFinal()

void PruneForwardLinksFinal ( int32  frame)
inlineprivate

Definition at line 431 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::ForwardLink::acoustic_cost, LatticeBiglmFasterDecoder::active_toks_, kaldi::ApproxEqual(), LatticeBiglmFasterDecoder::config_, LatticeBiglmFasterDecoder::Token::extra_cost, DeterministicOnDemandFst< Arc >::Final(), LatticeBiglmFasterDecoder::final_active_, LatticeBiglmFasterDecoder::final_costs_, LatticeBiglmFasterDecoder::fst_, LatticeBiglmFasterDecoder::ForwardLink::graph_cost, KALDI_ASSERT, KALDI_WARN, LatticeFasterDecoderConfig::lattice_beam, LatticeBiglmFasterDecoder::lm_diff_fst_, LatticeBiglmFasterDecoder::ForwardLink::next, LatticeBiglmFasterDecoder::ForwardLink::next_tok, LatticeBiglmFasterDecoder::PairToLmState(), LatticeBiglmFasterDecoder::PairToState(), LatticeBiglmFasterDecoder::toks_, and LatticeBiglmFasterDecoder::Token::tot_cost.

Referenced by LatticeBiglmFasterDecoder::PruneActiveTokensFinal().

431  {
432  KALDI_ASSERT(static_cast<size_t>(frame+1) == active_toks_.size());
433  if (active_toks_[frame].toks == NULL ) // empty list; should not happen.
434  KALDI_WARN << "No tokens alive at end of file\n";
435 
436  // First go through, working out the best token (do it in parallel
437  // including final-probs and not including final-probs; we'll take
438  // the one with final-probs if it's valid).
439  const BaseFloat infinity = std::numeric_limits<BaseFloat>::infinity();
440  BaseFloat best_cost_final = infinity,
441  best_cost_nofinal = infinity;
442  unordered_map<Token*, BaseFloat> tok_to_final_cost;
443  Elem *cur_toks = toks_.Clear(); // swapping prev_toks_ / cur_toks_
444  for (Elem *e = cur_toks, *e_tail; e != NULL; e = e_tail) {
445  PairId state_pair = e->key;
446  StateId state = PairToState(state_pair),
447  lm_state = PairToLmState(state_pair);
448  Token *tok = e->val;
449  BaseFloat final_cost = fst_.Final(state).Value() +
450  lm_diff_fst_->Final(lm_state).Value();
451  tok_to_final_cost[tok] = final_cost;
452  best_cost_final = std::min(best_cost_final, tok->tot_cost + final_cost);
453  best_cost_nofinal = std::min(best_cost_nofinal, tok->tot_cost);
454  e_tail = e->tail;
455  toks_.Delete(e);
456  }
457  final_active_ = (best_cost_final != infinity);
458 
459  // Now go through tokens on this frame, pruning forward links... may have
460  // to iterate a few times until there is no more change, because the list is
461  // not in topological order.
462 
463  bool changed = true;
464  BaseFloat delta = 1.0e-05;
465  while (changed) {
466  changed = false;
467  for (Token *tok = active_toks_[frame].toks; tok != NULL; tok = tok->next) {
468  ForwardLink *link, *prev_link=NULL;
469  // will recompute tok_extra_cost. It has a term in it that corresponds
470  // to the "final-prob", so instead of initializing tok_extra_cost to infinity
471  // below we set it to the difference between the (score+final_prob) of this token,
472  // and the best such (score+final_prob).
473  BaseFloat tok_extra_cost;
474  if (final_active_) {
475  BaseFloat final_cost = tok_to_final_cost[tok];
476  tok_extra_cost = (tok->tot_cost + final_cost) - best_cost_final;
477  } else
478  tok_extra_cost = tok->tot_cost - best_cost_nofinal;
479 
480  for (link = tok->links; link != NULL; ) {
481  // See if we need to excise this link...
482  Token *next_tok = link->next_tok;
483  BaseFloat link_extra_cost = next_tok->extra_cost +
484  ((tok->tot_cost + link->acoustic_cost + link->graph_cost)
485  - next_tok->tot_cost);
486  if (link_extra_cost > config_.lattice_beam) { // excise link
487  ForwardLink *next_link = link->next;
488  if (prev_link != NULL) prev_link->next = next_link;
489  else tok->links = next_link;
490  delete link;
491  link = next_link; // advance link but leave prev_link the same.
492  } else { // keep the link and update the tok_extra_cost if needed.
493  if (link_extra_cost < 0.0) { // this is just a precaution.
494  if (link_extra_cost < -0.01)
495  KALDI_WARN << "Negative extra_cost: " << link_extra_cost;
496  link_extra_cost = 0.0;
497  }
498  if (link_extra_cost < tok_extra_cost)
499  tok_extra_cost = link_extra_cost;
500  prev_link = link;
501  link = link->next;
502  }
503  }
504  // prune away tokens worse than lattice_beam above best path. This step
505  // was not necessary in the non-final case because then, this case
506  // showed up as having no forward links. Here, the tok_extra_cost has
507  // an extra component relating to the final-prob.
508  if (tok_extra_cost > config_.lattice_beam)
509  tok_extra_cost = infinity;
510  // to be pruned in PruneTokensForFrame
511 
512  if (!ApproxEqual(tok->extra_cost, tok_extra_cost, delta))
513  changed = true;
514  tok->extra_cost = tok_extra_cost; // will be +infinity or <= lattice_beam_.
515  }
516  } // while changed
517 
518  // Now put surviving Tokens in the final_costs_ hash, which is a class
519  // member (unlike tok_to_final_costs).
520  for (Token *tok = active_toks_[frame].toks; tok != NULL; tok = tok->next) {
521  if (tok->extra_cost != infinity) {
522  // If the token was not pruned away,
523  if (final_active_) {
524  BaseFloat final_cost = tok_to_final_cost[tok];
525  if (final_cost != infinity)
526  final_costs_[tok] = final_cost;
527  } else {
528  final_costs_[tok] = 0;
529  }
530  }
531  }
532  }
fst::DeterministicOnDemandFst< fst::StdArc > * lm_diff_fst_
virtual Weight Final(StateId s)=0
static StateId PairToLmState(PairId state_pair)
const fst::Fst< fst::StdArc > & fst_
HashList< PairId, Token * >::Elem Elem
std::map< Token *, BaseFloat > final_costs_
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_WARN
Definition: kaldi-error.h:150
LatticeBiglmFasterDecoderConfig config_
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185
static StateId PairToState(PairId state_pair)
static bool ApproxEqual(float a, float b, float relative_tolerance=0.001)
return abs(a - b) <= relative_tolerance * (abs(a)+abs(b)).
Definition: kaldi-math.h:265

◆ PruneTokensForFrame()

void PruneTokensForFrame ( int32  frame)
inlineprivate

Definition at line 538 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::active_toks_, LatticeBiglmFasterDecoder::Token::extra_cost, KALDI_ASSERT, KALDI_WARN, LatticeBiglmFasterDecoder::Token::next, LatticeBiglmFasterDecoder::ForwardLink::next_tok, and LatticeBiglmFasterDecoder::num_toks_.

Referenced by LatticeBiglmFasterDecoder::PruneActiveTokens(), and LatticeBiglmFasterDecoder::PruneActiveTokensFinal().

538  {
539  KALDI_ASSERT(frame >= 0 && frame < active_toks_.size());
540  Token *&toks = active_toks_[frame].toks;
541  if (toks == NULL)
542  KALDI_WARN << "No tokens alive [doing pruning]\n";
543  Token *tok, *next_tok, *prev_tok = NULL;
544  for (tok = toks; tok != NULL; tok = next_tok) {
545  next_tok = tok->next;
546  if (tok->extra_cost == std::numeric_limits<BaseFloat>::infinity()) {
547  // token is unreachable from end of graph; (no forward links survived)
548  // excise tok from list and delete tok.
549  if (prev_tok != NULL) prev_tok->next = tok->next;
550  else toks = tok->next;
551  delete tok;
552  num_toks_--;
553  } else { // fetch next Token
554  prev_tok = tok;
555  }
556  }
557  }
#define KALDI_WARN
Definition: kaldi-error.h:150
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:185

◆ ReachedFinal()

bool ReachedFinal ( ) const
inline

says whether a final-state was active on the last frame.

If it was not, the lattice (or traceback) will end with states that are not final-states.

Definition at line 115 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::final_active_.

Referenced by kaldi::DecodeUtterance().

◆ SetOptions()

void SetOptions ( const LatticeBiglmFasterDecoderConfig config)
inline

Definition at line 68 of file lattice-biglm-faster-decoder.h.

References LatticeBiglmFasterDecoder::config_.

68 { config_ = config; }
LatticeBiglmFasterDecoderConfig config_

Member Data Documentation

◆ active_toks_

◆ config_

◆ final_active_

◆ final_costs_

◆ fst_

◆ lm_diff_fst_

◆ num_toks_

◆ queue_

std::vector<const Elem* > queue_
private

◆ tmp_array_

std::vector<BaseFloat> tmp_array_
private

◆ toks_

◆ warned_

◆ warned_noarc_

bool warned_noarc_
private

The documentation for this class was generated from the following file: