This class extracts some information from the lexicon and stores it in a suitable form for the word-alignment code to use.
#include <word-align-lattice-lexicon.h>
Public Member Functions | |
WordAlignLatticeLexiconInfo (const std::vector< std::vector< int32 > > &lexicon) | |
bool | IsValidEntry (const std::vector< int32 > &entry) const |
Returns true if this lexicon-entry can appear, interpreted as (output-word phone1 phone2 ...). | |
int32 | EquivalenceClassOf (int32 word) const |
Purely for the testing code, we map words into equivalence classes derived from the mappings in the first two fields of each line in the lexicon. | |
Protected Types | |
typedef unordered_map< std::vector< int32 >, std::vector< int32 >, VectorHasher< int32 > > | ViabilityMap |
The type ViabilityMap maps from a sequence of phones s (excluding the empty sequence) to the set of all word-labels [on the input lattice] that could correspond to phone sequences that start with s [but are longer than s]. | |
typedef unordered_map< std::vector< int32 >, int32, VectorHasher< int32 > > | LexiconMap |
This is a map from a vector (orig-word-symbol phone1 phone2 ...) to the new word-symbol. | |
typedef unordered_map< int32, std::pair< int32, int32 > > | NumPhonesMap |
This is a map from the word-id (as present in the original lattice) to the minimum and maximum #phones of lexicon entries for that word. | |
typedef unordered_map< int32, int32 > | EquivalenceMap |
This is used only in testing code; it defines a mapping from a word to the primary member of that word's equivalence-class. | |
Protected Member Functions | |
void | UpdateViabilityMap (const std::vector< int32 > &lexicon_entry) |
void | UpdateLexiconMap (const std::vector< int32 > &lexicon_entry) |
Update the map from a vector (orig-word-symbol phone1 phone2 ...) to the new word-symbol. | |
void | UpdateNumPhonesMap (const std::vector< int32 > &lexicon_entry) |
void | UpdateEquivalenceMap (const std::vector< std::vector< int32 > > &lexicon) |
void | FinalizeViabilityMap () |
Protected Attributes | |
LexiconMap | lexicon_map_ |
NumPhonesMap | num_phones_map_ |
ViabilityMap | viability_map_ |
LexiconMap | reverse_lexicon_map_ |
EquivalenceMap | equivalence_map_ |
Friends | |
class | LatticeLexiconWordAligner |
This class extracts some information from the lexicon and stores it in a suitable form for the word-alignment code to use.
Definition at line 56 of file word-align-lattice-lexicon.h.
typedef unordered_map< int32, int32 > | EquivalenceMap | protected |
This is used only in testing code; it defines a mapping from a word to the primary member of that word's equivalence-class.
Definition at line 101 of file word-align-lattice-lexicon.h.
typedef unordered_map< std::vector< int32 >, int32, VectorHasher< int32 > > | LexiconMap | protected |
This is a map from a vector (orig-word-symbol phone1 phone2 ...) to the new word-symbol. [todo: make sure the new word-symbol is always nonzero.]
Definition at line 92 of file word-align-lattice-lexicon.h.
typedef unordered_map< int32, std::pair< int32, int32 > > | NumPhonesMap | protected |
This is a map from the word-id (as present in the original lattice) to the minimum and maximum #phones of lexicon entries for that word.
It helps improve efficiency.
Definition at line 97 of file word-align-lattice-lexicon.h.
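As an illustration of the minimum/maximum bookkeeping described above, here is a minimal standalone sketch. It is not Kaldi's code: the names `NumPhonesMapSketch` and `UpdateNumPhonesSketch` are hypothetical, `int32_t` stands in for `int32`, and the entry layout (orig-word new-word phone1 phone2 ...) is an assumption inferred from the descriptions on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the NumPhonesMap typedef:
// word-id -> (minimum #phones, maximum #phones).
using NumPhonesMapSketch =
    std::unordered_map<int32_t, std::pair<int32_t, int32_t>>;

// Assumed entry layout: (orig-word new-word phone1 phone2 ...).
void UpdateNumPhonesSketch(const std::vector<int32_t> &entry,
                           NumPhonesMapSketch *map) {
  assert(entry.size() >= 2);
  int32_t word = entry[0];
  int32_t num_phones = static_cast<int32_t>(entry.size()) - 2;
  auto it = map->find(word);
  if (it == map->end()) {
    // First entry for this word: min and max are both num_phones.
    (*map)[word] = std::make_pair(num_phones, num_phones);
  } else {
    // Widen the stored range if this entry falls outside it.
    it->second.first = std::min(it->second.first, num_phones);
    it->second.second = std::max(it->second.second, num_phones);
  }
}
```

With such a range available, a caller could skip lookups for phone sequences that are shorter or longer than any pronunciation of the word, which is presumably the efficiency gain the description refers to.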
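typedef unordered_map< std::vector< int32 >, std::vector< int32 >, VectorHasher< int32 > > | ViabilityMap | protected |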
The type ViabilityMap maps from a sequence of phones s (excluding the empty sequence) to the set of all word-labels [on the input lattice] that could correspond to phone sequences that start with s [but are longer than s].
The sets of word-labels are represented as sorted vectors of int32. Note: the zero word-label is included here. This is used in a kind of co-accessibility test, to see whether it is worth extending this state by traversing arcs in the input lattice.
Definition at line 87 of file word-align-lattice-lexicon.h.
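The prefix-to-word-set idea can be sketched in standalone C++ as follows. This is an illustrative sketch, not the class's implementation: `VecHashSketch`, `ViabilityMapSketch`, `AddPrefixes`, `Finalize` and `IsViable` are hypothetical names, the hasher is a simple stand-in for `VectorHasher< int32 >`, and the entry layout (orig-word new-word phone1 phone2 ...) is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical hasher standing in for VectorHasher< int32 >.
struct VecHashSketch {
  std::size_t operator()(const std::vector<int32_t> &v) const {
    std::size_t h = 0;
    for (int32_t x : v) h = h * 7853 + static_cast<std::size_t>(x);
    return h;
  }
};

using ViabilityMapSketch =
    std::unordered_map<std::vector<int32_t>, std::vector<int32_t>,
                       VecHashSketch>;

// Assumed entry layout: (orig-word new-word phone1 phone2 ...).
// Every non-empty proper prefix of the phone sequence records the word label.
void AddPrefixes(const std::vector<int32_t> &entry, ViabilityMapSketch *map) {
  assert(entry.size() >= 2);
  int32_t word = entry[0];
  std::vector<int32_t> prefix;
  for (std::size_t i = 2; i + 1 < entry.size(); ++i) {
    prefix.push_back(entry[i]);
    (*map)[prefix].push_back(word);
  }
}

// Sort and de-duplicate each label set so it can be binary-searched.
void Finalize(ViabilityMapSketch *map) {
  for (auto &kv : *map) {
    std::vector<int32_t> &words = kv.second;
    std::sort(words.begin(), words.end());
    words.erase(std::unique(words.begin(), words.end()), words.end());
  }
}

// Co-accessibility test: can `phones` still be extended to an entry of `word`?
bool IsViable(const ViabilityMapSketch &map,
              const std::vector<int32_t> &phones, int32_t word) {
  auto it = map.find(phones);
  return it != map.end() &&
         std::binary_search(it->second.begin(), it->second.end(), word);
}
```

Because only proper prefixes are stored, a complete pronunciation is not itself a key unless it is also the start of a longer pronunciation; a `false` result therefore means there is no point in consuming further phones for that word.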
WordAlignLatticeLexiconInfo (const std::vector< std::vector< int32 > > &lexicon) | |
Definition at line 896 of file word-align-lattice-lexicon.cc.
References rnnlm::i, and KALDI_ASSERT.
int32 | EquivalenceClassOf (int32 word) const |
Purely for the testing code, we map words into equivalence classes derived from the mappings in the first two fields of each line in the lexicon.
This function maps from each word-id to the lowest member of its equivalence class.
Definition at line 866 of file word-align-lattice-lexicon.cc.
Referenced by kaldi::MapSymbols().
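One way to picture the mapping to the lowest class member is a small union-find in which the smaller id always becomes the root. This is a sketch using a swapped-in technique, not the method used in word-align-lattice-lexicon.cc. `EquivalenceSketch` and `ClassOf` are hypothetical names, and the rule that words absent from the lexicon map to themselves is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of Kaldi: maps each word-id to the lowest
// word-id in its equivalence class. Two words are treated as equivalent when
// they appear together in the first two fields of a lexicon line.
class EquivalenceSketch {
 public:
  explicit EquivalenceSketch(
      const std::vector<std::vector<int32_t>> &lexicon) {
    for (const auto &entry : lexicon) {
      assert(entry.size() >= 2);
      Union(entry[0], entry[1]);
    }
  }

  // Follows parent links to the root. Assumption: words never seen in the
  // lexicon map to themselves.
  int32_t ClassOf(int32_t word) const {
    auto it = parent_.find(word);
    while (it != parent_.end() && it->second != word) {
      word = it->second;
      it = parent_.find(word);
    }
    return word;
  }

 private:
  void Union(int32_t a, int32_t b) {
    parent_.emplace(a, a);  // no-op if already present
    parent_.emplace(b, b);
    int32_t ra = ClassOf(a), rb = ClassOf(b);
    // The smaller root wins, so every root is the minimum of its class.
    if (ra != rb) parent_[std::max(ra, rb)] = std::min(ra, rb);
  }

  std::unordered_map<int32_t, int32_t> parent_;
};
```

For example, lines beginning `3 2 ...` and `2 5 ...` put words 2, 3 and 5 in one class, and `ClassOf` returns 2 for each of them.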
void | FinalizeViabilityMap () | protected |
Definition at line 791 of file word-align-lattice-lexicon.cc.
References KALDI_ASSERT, kaldi::SortAndUniq(), and words.
bool | IsValidEntry (const std::vector< int32 > &entry) const |
Returns true if this lexicon-entry can appear, interpreted as (output-word phone1 phone2 ...).
The entry contains new-word-id phone1 phone2 ..., which is equivalent to all but the first field on a line of the input file.
This is used only in testing code.
Definition at line 853 of file word-align-lattice-lexicon.cc.
References KALDI_ASSERT.
Referenced by kaldi::IsPlausibleWord().
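The check described above amounts to a membership test on (new-word-id phone1 phone2 ...) keys. The sketch below illustrates it with a `std::set`; the names `EntrySetSketch`, `BuildValidEntries` and `IsValidEntrySketch` are hypothetical, and the container choice is an assumption rather than a description of the class's own data member.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical container: the set of (new-word phone1 phone2 ...) keys.
using EntrySetSketch = std::set<std::vector<int32_t>>;

// Assumed line layout: (orig-word new-word phone1 phone2 ...).
// Each key is the line with its first field dropped.
EntrySetSketch BuildValidEntries(
    const std::vector<std::vector<int32_t>> &lexicon) {
  EntrySetSketch valid;
  for (const auto &line : lexicon) {
    assert(line.size() >= 2);
    valid.insert(std::vector<int32_t>(line.begin() + 1, line.end()));
  }
  return valid;
}

// True if `entry` matches some lexicon line with its first field removed.
bool IsValidEntrySketch(const EntrySetSketch &valid,
                        const std::vector<int32_t> &entry) {
  return valid.count(entry) != 0;
}
```

A test harness could use such a check to confirm that each aligned output word, together with its phone sequence, corresponds to a real lexicon line.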
void | UpdateEquivalenceMap (const std::vector< std::vector< int32 > > &lexicon) | protected |
Definition at line 873 of file word-align-lattice-lexicon.cc.
References rnnlm::i, KALDI_ASSERT, kaldi::SortAndUniq(), and kaldi::swap().
void | UpdateLexiconMap (const std::vector< int32 > &lexicon_entry) | protected |
Update the map from a vector (orig-word-symbol phone1 phone2 ...) to the new word-symbol. The new word-symbol must always be nonzero; if it is zero, it is replaced with kTemporaryEpsilon = -2.
Definition at line 804 of file word-align-lattice-lexicon.cc.
References KALDI_ASSERT, KALDI_ERR, KALDI_WARN, and kaldi::kTemporaryEpsilon.
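The key construction and the zero-replacement rule can be sketched as below. This is a standalone illustration under stated assumptions, not the code at line 804: `LexiconMapSketch`, `AddLexiconEntry` and `kTemporaryEpsilonSketch` are hypothetical names, `std::map` is used instead of `unordered_map` to avoid needing a vector hasher, the entry layout (orig-word new-word phone1 phone2 ...) is assumed, and returning `false` on a repeated key is a simplification (the real function's handling of duplicates is not described here).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Value taken from the description above; the name mirrors
// kaldi::kTemporaryEpsilon but is a local stand-in.
const int32_t kTemporaryEpsilonSketch = -2;

// std::map replaces unordered_map here only to keep the sketch short.
using LexiconMapSketch = std::map<std::vector<int32_t>, int32_t>;

// Assumed entry layout: (orig-word new-word phone1 phone2 ...).
// Returns false if the key (orig-word phone1 phone2 ...) was already present.
bool AddLexiconEntry(const std::vector<int32_t> &entry,
                     LexiconMapSketch *map) {
  assert(entry.size() >= 2);
  // Key = the entry with its second field (the new word) removed.
  std::vector<int32_t> key;
  key.reserve(entry.size() - 1);
  key.push_back(entry[0]);
  key.insert(key.end(), entry.begin() + 2, entry.end());
  // A zero output symbol is stored as the temporary epsilon instead.
  int32_t new_word = entry[1];
  if (new_word == 0) new_word = kTemporaryEpsilonSketch;
  return map->emplace(key, new_word).second;
}
```

Storing a nonzero placeholder lets a lookup distinguish "this key maps to epsilon" from "no symbol", under the assumption that callers treat zero as the absence of a word.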
void | UpdateNumPhonesMap (const std::vector< int32 > &lexicon_entry) | protected |
Definition at line 836 of file word-align-lattice-lexicon.cc.
References KALDI_ERR.
void | UpdateViabilityMap (const std::vector< int32 > &lexicon_entry) | protected |
Definition at line 774 of file word-align-lattice-lexicon.cc.
References rnnlm::n.
class | LatticeLexiconWordAligner | friend |
Definition at line 69 of file word-align-lattice-lexicon.h.
EquivalenceMap | equivalence_map_ | protected |
Definition at line 116 of file word-align-lattice-lexicon.h.
LexiconMap | lexicon_map_ | protected |
Definition at line 105 of file word-align-lattice-lexicon.h.
Referenced by LatticeLexiconWordAligner::ProcessEpsilonTransitions(), and LatticeLexiconWordAligner::ProcessWordTransitions().
NumPhonesMap | num_phones_map_ | protected |
Definition at line 106 of file word-align-lattice-lexicon.h.
Referenced by LatticeLexiconWordAligner::ProcessEpsilonTransitions(), and LatticeLexiconWordAligner::ProcessWordTransitions().
LexiconMap | reverse_lexicon_map_ | protected |
Definition at line 111 of file word-align-lattice-lexicon.h.
ViabilityMap | viability_map_ | protected |
Definition at line 107 of file word-align-lattice-lexicon.h.
Referenced by LatticeLexiconWordAligner::PossiblyAdvanceArc().