This class is used inside LatticeIncrementalDecoderTpl; it handles some of the details of incremental determinization.
#include <lattice-incremental-decoder.h>
Public Types | |
enum | { kStateLabelOffset = (int)1e8, kTokenLabelOffset = (int)2e8, kMaxTokenLabel = (int)3e8 } |
using | Label = typename LatticeArc::Label |
Public Member Functions | |
LatticeIncrementalDeterminizer (const TransitionModel &trans_model, const LatticeIncrementalDecoderConfig &config) | |
void | Init () |
const CompactLattice & | GetDeterminizedLattice () const |
void | InitializeRawLatticeChunk (Lattice *olat, unordered_map< Label, LatticeArc::StateId > *token_label2state) |
Starts the process of creating a raw lattice chunk. | |
bool | AcceptRawLatticeChunk (Lattice *raw_fst) |
This function accepts the raw FST (state-level lattice) corresponding to a single chunk of the lattice, determinizes it and appends it to this->clat_. | |
void | SetFinalCosts (const unordered_map< Label, BaseFloat > *token_label2final_cost=NULL) |
const CompactLattice & | GetLattice () |
Private Member Functions | |
void | GetRawLatticeFinalCosts (const Lattice &raw_fst, std::unordered_map< Label, BaseFloat > *old_final_costs) |
void | GetNonFinalRedetStates () |
bool | ProcessArcsFromChunkStartState (const CompactLattice &chunk_clat, std::unordered_map< CompactLattice::StateId, CompactLattice::StateId > *state_map) |
[called from AcceptRawLatticeChunk()] Processes arcs that leave the start-state of `chunk_clat` (if this is not the first chunk); does nothing if this is the first chunk. | |
void | TransferArcsToClat (const CompactLattice &chunk_clat, bool is_first_chunk, const std::unordered_map< CompactLattice::StateId, CompactLattice::StateId > &state_map, const std::unordered_map< CompactLattice::StateId, Label > &chunk_state_to_token, const std::unordered_map< Label, BaseFloat > &old_final_costs) |
This function, called from AcceptRawLatticeChunk(), transfers arcs from `chunk_clat` to clat_. | |
void | AddArcToClat (CompactLattice::StateId state, const CompactLatticeArc &arc) |
Adds one arc to `clat_`. | |
CompactLattice::StateId | AddStateToClat () |
void | IdentifyTokenFinalStates (const CompactLattice &chunk_clat, std::unordered_map< CompactLattice::StateId, CompactLatticeArc::Label > *token_map) const |
KALDI_DISALLOW_COPY_AND_ASSIGN (LatticeIncrementalDeterminizer) | |
Private Attributes | |
const TransitionModel & | trans_model_ |
const LatticeIncrementalDecoderConfig & | config_ |
std::unordered_set< CompactLattice::StateId > | non_final_redet_states_ |
CompactLattice | clat_ |
std::vector< std::vector< std::pair< CompactLattice::StateId, int32 > > > | arcs_in_ |
std::vector< CompactLatticeArc > | final_arcs_ |
std::vector< BaseFloat > | forward_costs_ |
std::unordered_set< int32 > | temp_ |
This class is used inside LatticeIncrementalDecoderTpl; it handles some of the details of incremental determinization.
See https://www.danielpovey.com/files/ *TBD*.pdf for the paper.
Definition at line 196 of file lattice-incremental-decoder.h.
using Label = typename LatticeArc::Label |
Definition at line 198 of file lattice-incremental-decoder.h.
anonymous enum |
Enumerator | |
---|---|
kStateLabelOffset | |
kTokenLabelOffset | |
kMaxTokenLabel |
Definition at line 290 of file lattice-incremental-decoder.h.
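The three enum values partition the label space. The sketch below shows one way to read them, assuming (this page does not state it) that state-labels occupy the half-open range [kStateLabelOffset, kTokenLabelOffset), that token-labels occupy [kTokenLabelOffset, kMaxTokenLabel), and that a state-label encodes a state-id plus kStateLabelOffset. `ClassifyLabel` and `StateFromStateLabel` are hypothetical helpers, not part of Kaldi.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the anonymous enum documented above.
enum {
  kStateLabelOffset = (int)1e8,
  kTokenLabelOffset = (int)2e8,
  kMaxTokenLabel = (int)3e8
};

enum class LabelKind { kOrdinary, kStateLabel, kTokenLabel };

// Hypothetical helper: classifies a label under the ASSUMED half-open ranges
// [kStateLabelOffset, kTokenLabelOffset) and [kTokenLabelOffset, kMaxTokenLabel).
inline LabelKind ClassifyLabel(int32_t label) {
  if (label >= kStateLabelOffset && label < kTokenLabelOffset)
    return LabelKind::kStateLabel;
  if (label >= kTokenLabelOffset && label < kMaxTokenLabel)
    return LabelKind::kTokenLabel;
  return LabelKind::kOrdinary;
}

// Hypothetical helper: ASSUMES a state-label is (state-id + kStateLabelOffset).
inline int32_t StateFromStateLabel(int32_t label) {
  assert(ClassifyLabel(label) == LabelKind::kStateLabel);
  return label - kStateLabelOffset;
}
```

Keeping the ranges disjoint from ordinary labels lets special arcs be recognised from the label alone, without a side table.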
|
inline |
Definition at line 203 of file lattice-incremental-decoder.h.
This function accepts the raw FST (state-level lattice) corresponding to a single chunk of the lattice, determinizes it and appends it to this->clat_.
Unless this was the
Note: final-probs in `raw_fst` are treated specially: they are used to guide the pruned determinization, but when you call GetLattice() the result will be (except for pruning effects) as if all nonzero final-probs in `raw_fst` were One() if final_costs == NULL, and otherwise the value present in `final_costs`.
[in] | raw_fst | (Consumed destructively). The input raw (state-level) lattice. Would correspond to the FST A in the paper if first_frame == 0, and B otherwise. |
NOTE: if this is not the final chunk, you will probably want to call SetFinalCosts() directly after calling this.
Definition at line 1547 of file lattice-incremental-decoder.cc.
References fst::DeterminizeLatticePhonePrunedWrapper(), KALDI_ASSERT, KALDI_WARN, and kaldi::TopSortCompactLatticeIfNeeded().
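The rule for final-probs in the note above can be sketched with plain scalar costs. `EffectiveFinalCost` is a hypothetical helper, not a Kaldi function; it assumes that One() corresponds to a cost of 0 and, as a further assumption not confirmed by this page, that a token-label missing from the map leaves its state non-final.

```cpp
#include <cassert>
#include <unordered_map>

using Label = int;        // stand-in for LatticeArc::Label
using BaseFloat = float;  // stand-in for Kaldi's BaseFloat

// Hypothetical helper. Returns true and writes the final cost to *cost if the
// state reached via `token_label` should be final; returns false otherwise.
inline bool EffectiveFinalCost(
    Label token_label,
    const std::unordered_map<Label, BaseFloat> *final_costs,
    BaseFloat *cost) {
  assert(cost != nullptr);
  if (final_costs == nullptr) {
    *cost = 0.0f;  // One() in a cost-based semiring has cost 0.
    return true;
  }
  auto iter = final_costs->find(token_label);
  if (iter == final_costs->end())
    return false;  // ASSUMPTION: labels absent from the map are not final.
  *cost = iter->second;
  return true;
}
```

With a null map every token-final state gets cost 0; with a map, only the listed token-labels are final, each with its own cost.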
void AddArcToClat (CompactLattice::StateId state, const CompactLatticeArc &arc)
private
Adds one arc to `clat_`.
It's like clat_.AddArc(state, arc), except it also modifies arcs_in_ and forward_costs_.
Definition at line 1150 of file lattice-incremental-decoder.cc.
References fst::ConvertToCost().
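A minimal sketch of that bookkeeping, using a toy graph in place of CompactLattice. `TinyClat` and `Arc` are hypothetical stand-ins; the sketch assumes a scalar cost per arc, that state 0 is the start state with forward cost 0, and that each entry of the incoming-arc index is a (source state, arc position) pair.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

struct Arc { int32_t label; float cost; int32_t nextstate; };

// Toy stand-in for clat_ plus its two side tables.
class TinyClat {
 public:
  int32_t AddState() {
    arcs_.emplace_back();
    arcs_in_.emplace_back();
    // ASSUMPTION: state 0 is the start state (forward cost 0); every other
    // state starts out unreachable (infinite forward cost).
    forward_costs_.push_back(arcs_.size() == 1
                                 ? 0.0f
                                 : std::numeric_limits<float>::infinity());
    return static_cast<int32_t>(arcs_.size()) - 1;
  }

  // Like a plain AddArc, but also updates arcs_in_ and forward_costs_.
  void AddArc(int32_t state, const Arc &arc) {
    assert(state >= 0 && state < NumStates());
    assert(arc.nextstate >= 0 && arc.nextstate < NumStates());
    forward_costs_[arc.nextstate] = std::min(
        forward_costs_[arc.nextstate], forward_costs_[state] + arc.cost);
    int32_t arc_idx = static_cast<int32_t>(arcs_[state].size());
    arcs_[state].push_back(arc);
    arcs_in_[arc.nextstate].push_back({state, arc_idx});
  }

  int32_t NumStates() const { return static_cast<int32_t>(arcs_.size()); }
  float ForwardCost(int32_t s) const { return forward_costs_[s]; }
  const std::vector<std::pair<int32_t, int32_t> > &ArcsIn(int32_t s) const {
    return arcs_in_[s];
  }

 private:
  std::vector<std::vector<Arc> > arcs_;  // outgoing arcs per state
  std::vector<std::vector<std::pair<int32_t, int32_t> > > arcs_in_;
  std::vector<float> forward_costs_;  // best known cost from the start state
};
```

The forward cost here is only a running minimum over the arcs added so far, so it is exact only if arcs are added in an order where each source state's forward cost is already final.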
|
private |
Definition at line 1142 of file lattice-incremental-decoder.cc.
References KALDI_ASSERT.
|
inline |
Definition at line 212 of file lattice-incremental-decoder.h.
|
inline |
Definition at line 284 of file lattice-incremental-decoder.h.
|
private |
Definition at line 1190 of file lattice-incremental-decoder.cc.
|
private |
Definition at line 1329 of file lattice-incremental-decoder.cc.
References KALDI_ERR, LatticeWeightTpl< FloatType >::Value1(), and LatticeWeightTpl< FloatType >::Value2().
|
private |
Definition at line 1165 of file lattice-incremental-decoder.cc.
References KALDI_ASSERT.
void Init ()
Definition at line 1134 of file lattice-incremental-decoder.cc.
void InitializeRawLatticeChunk (Lattice *olat, unordered_map< Label, LatticeArc::StateId > *token_label2state)
Starts the process of creating a raw lattice chunk.
(Search the glossary for "raw lattice chunk"). This just sets up the initial states and redeterminized-states in the chunk. Relates to sec. 5.2 in the paper, specifically the initial-state i and the redeterminized-states.
After calling this, the caller would add the remaining arcs and states to `olat` and then call AcceptRawLatticeChunk() with the result.
[out] | olat | The lattice to be (partially) created |
[out] | token_label2state | This function outputs to here a map from `token-label` to the state we created for it in *olat. See glossary for `token-label`. The keys actually correspond to the .nextstate fields in the arcs in final_arcs_; values are states in `olat`. See the last bullet point before Sec. 5.3 in the paper. |
Definition at line 1223 of file lattice-incremental-decoder.cc.
References kaldi::AddCompactLatticeArcToLattice(), and KALDI_ASSERT.
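The description of `token_label2state` can be illustrated with a simplified sketch: walk the stored final arcs, read the token-label from each arc's nextstate field, and allocate one new state per distinct token-label. `FinalArcSketch` and `AssignTokenStates` are hypothetical names, and the allocation order (first occurrence gets the next free state id) is an assumption for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Toy stand-in for an entry of final_arcs_: `nextstate` holds a token-label,
// not a real state-id.
struct FinalArcSketch { int32_t nextstate; float cost; };

// Hypothetical helper. Maps each distinct token-label to a fresh state id,
// starting at `num_states`; returns the state count after allocation.
inline int32_t AssignTokenStates(
    const std::vector<FinalArcSketch> &final_arcs,
    int32_t num_states,
    std::unordered_map<int32_t, int32_t> *token_label2state) {
  assert(token_label2state != nullptr && num_states >= 0);
  for (const FinalArcSketch &arc : final_arcs) {
    int32_t token_label = arc.nextstate;
    // insert() leaves the map unchanged if the label was already present.
    if (token_label2state->insert({token_label, num_states}).second)
      ++num_states;
  }
  return num_states;
}
```

Several final arcs may share a token-label, so the map ends up with one entry per token rather than one per arc.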
|
private |
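bool ProcessArcsFromChunkStartState (const CompactLattice &chunk_clat, std::unordered_map< CompactLattice::StateId, CompactLattice::StateId > *state_map)
private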
[called from AcceptRawLatticeChunk()] Processes arcs that leave the start-state of `chunk_clat` (if this is not the first chunk); does nothing if this is the first chunk.
This includes using the `state-labels` to work out which states in clat_ these states correspond to, and writing that mapping to `state_map`.
Also modifies forward_costs_, because it has to reweight the clat_ states that it puts as values in `state_map`, to take account of the probabilities on the arcs from the start state of chunk_clat to the states corresponding to those redeterminized-states (i.e. the states in clat_ corresponding to the values it puts in `*state_map`). It also modifies arcs_in_, mostly because there are rare cases when we end up merging sets of those redeterminized-states: the determinization process has mapped them to a single state, so the arcs into members of that set must be rerouted into one single member (which will appear as a value in `*state_map`).
[in] | chunk_clat | The determinized chunk of lattice we are processing |
[out] | state_map | Mapping from states in chunk_clat to the state in clat_ they correspond to. |
Definition at line 1363 of file lattice-incremental-decoder.cc.
References fst::ConvertToCost(), KALDI_ASSERT, CompactLatticeWeightTpl< WeightType, IntType >::SetWeight(), fst::Times(), and CompactLatticeWeightTpl< WeightType, IntType >::Weight().
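The state-mapping and merging described above can be sketched without any lattice types. Everything here is a simplification: `StartArc` and `BuildStateMap` are hypothetical, the two constants mirror kStateLabelOffset and kTokenLabelOffset, a state-label is assumed to encode a clat_ state-id plus the offset, and the reweighting of forward_costs_ is left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

const int32_t kStateOffsetSketch = 100000000;  // mirrors kStateLabelOffset
const int32_t kTokenOffsetSketch = 200000000;  // mirrors kTokenLabelOffset

// Toy arc leaving the start state of the determinized chunk.
struct StartArc { int32_t label; int32_t nextstate; };

// Hypothetical helper. Fills *state_map (chunk state -> clat_ state) and
// returns the merges as (redundant clat_ state, surviving clat_ state) pairs.
inline std::vector<std::pair<int32_t, int32_t> > BuildStateMap(
    const std::vector<StartArc> &start_arcs,
    std::unordered_map<int32_t, int32_t> *state_map) {
  assert(state_map != nullptr);
  std::vector<std::pair<int32_t, int32_t> > merges;
  for (const StartArc &arc : start_arcs) {
    if (arc.label < kStateOffsetSketch || arc.label >= kTokenOffsetSketch)
      continue;  // Not a state-label; this sketch simply skips it.
    int32_t clat_state = arc.label - kStateOffsetSketch;  // ASSUMED encoding
    auto result = state_map->insert({arc.nextstate, clat_state});
    // Two different state-labels reaching the same chunk state means
    // determinization merged the two redeterminized-states.
    if (!result.second && result.first->second != clat_state)
      merges.push_back({clat_state, result.first->second});
  }
  return merges;
}
```

Each returned pair names a state whose incoming arcs would need rerouting to the surviving state, which is the reason the real function touches arcs_in_.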
Definition at line 1645 of file lattice-incremental-decoder.cc.
References KALDI_WARN, fst::Plus(), and fst::Times().
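void TransferArcsToClat (const CompactLattice &chunk_clat, bool is_first_chunk, const std::unordered_map< CompactLattice::StateId, CompactLattice::StateId > &state_map, const std::unordered_map< CompactLattice::StateId, Label > &chunk_state_to_token, const std::unordered_map< Label, BaseFloat > &old_final_costs)
private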
This function, called from AcceptRawLatticeChunk(), transfers arcs from `chunk_clat` to clat_.
Arcs that have `token-labels` on them are not written to clat_; they are stored in the final_arcs_ array instead.
[in] | chunk_clat | The determinized lattice for the chunk we are processing; this is the source of the arcs we are moving. |
[in] | is_first_chunk | True if this is the first chunk in the utterance; it's needed because if it is, we will also transfer arcs from the start state of chunk_clat. |
[in] | state_map | Map from state-ids in chunk_clat to state-ids in clat_. |
[in] | chunk_state_to_token | Map from `token-final states` (see glossary) in chunk_clat, to the token-label on arcs entering those states. |
[in] | old_final_costs | Map from token-label to the final-costs that were on the corresponding token-final states in the undeterminized lattice; these final-costs need to be removed when we record the weights in final_arcs_, because they were just temporary. |
Definition at line 1464 of file lattice-incremental-decoder.cc.
References KALDI_ASSERT, and fst::Times().
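A simplified sketch of the split described by these parameters: arcs entering a token-final state are diverted to a separate list, with the temporary final-cost taken off, and all other arcs are kept for the main lattice. `ChunkArc` and `SplitChunkArcs` are hypothetical; the sketch assumes scalar costs, assumes the arc cost already includes the temporary final-cost, stores the token-label in nextstate as described for final_arcs_, and skips the remapping through `state_map`.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct ChunkArc { int32_t label; float cost; int32_t nextstate; };

// Hypothetical helper. Arcs into token-final states go to *final_arcs;
// everything else goes to *clat_arcs unchanged.
inline void SplitChunkArcs(
    const std::vector<ChunkArc> &arcs,
    const std::unordered_map<int32_t, int32_t> &chunk_state_to_token,
    const std::unordered_map<int32_t, float> &old_final_costs,
    std::vector<ChunkArc> *clat_arcs,
    std::vector<ChunkArc> *final_arcs) {
  assert(clat_arcs != nullptr && final_arcs != nullptr);
  for (ChunkArc arc : arcs) {  // by value: the copy is modified below
    auto tok = chunk_state_to_token.find(arc.nextstate);
    if (tok == chunk_state_to_token.end()) {
      clat_arcs->push_back(arc);  // ordinary arc
      continue;
    }
    int32_t token_label = tok->second;
    auto old = old_final_costs.find(token_label);
    if (old != old_final_costs.end())
      arc.cost -= old->second;    // remove the temporary final-cost
    arc.nextstate = token_label;  // nextstate now carries the token-label
    final_arcs->push_back(arc);
  }
}
```

Holding these arcs back is what allows the next chunk to reconnect them to the states created for the same token-labels.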
|
private |
Definition at line 420 of file lattice-incremental-decoder.h.
|
private |
Definition at line 411 of file lattice-incremental-decoder.h.
|
private |
Definition at line 395 of file lattice-incremental-decoder.h.
|
private |
Definition at line 429 of file lattice-incremental-decoder.h.
|
private |
Definition at line 435 of file lattice-incremental-decoder.h.
|
private |
Definition at line 403 of file lattice-incremental-decoder.h.
|
private |
Definition at line 438 of file lattice-incremental-decoder.h.
|
private |
Definition at line 392 of file lattice-incremental-decoder.h.