This class does the word-level Minimum Bayes Risk computation, and gives you either the 1-best MBR output together with the expected Bayes Risk, or a sausage-like structure.
#include <sausages.h>
Classes | |
struct | Arc |
struct | GammaCompare |
Public Member Functions | |
MinimumBayesRisk (const CompactLattice &clat, MinimumBayesRiskOptions opts=MinimumBayesRiskOptions()) | |
Initialize with a compact lattice; any acoustic scaling etc. is assumed to have been done already. | |
MinimumBayesRisk (const CompactLattice &clat, const std::vector< int32 > &words, MinimumBayesRiskOptions opts=MinimumBayesRiskOptions()) | |
MinimumBayesRisk (const CompactLattice &clat, const std::vector< int32 > &words, const std::vector< std::pair< BaseFloat, BaseFloat > > &times, MinimumBayesRiskOptions opts=MinimumBayesRiskOptions()) | |
const std::vector< int32 > & | GetOneBest () const |
const std::vector< std::vector< std::pair< BaseFloat, BaseFloat > > > | GetTimes () const |
const std::vector< std::pair< BaseFloat, BaseFloat > > | GetSausageTimes () const |
const std::vector< std::pair< BaseFloat, BaseFloat > > & | GetOneBestTimes () const |
const std::vector< BaseFloat > & | GetOneBestConfidences () const |
Outputs the confidences for the one-best transcript. | |
BaseFloat | GetBayesRisk () const |
Returns the expected WER over this sentence (assuming model correctness). | |
const std::vector< std::vector< std::pair< int32, BaseFloat > > > & | GetSausageStats () const |
Private Member Functions | |
void | PrepareLatticeAndInitStats (CompactLattice *clat) |
void | MbrDecode () |
Minimum-Bayes-Risk Decode. Top-level algorithm. Figure 6 of the paper. | |
double | l (int32 a, int32 b, bool penalize=false) |
Without the 'penalize' argument this gives us the basic edit-distance function l(a,b), as in the paper. | |
int32 | r (int32 q) |
Returns r_q, in one-based indexing, as in the paper. | |
double | EditDistance (int32 N, int32 Q, Vector< double > &alpha, Matrix< double > &alpha_dash, Vector< double > &alpha_dash_arc) |
Figure 4 of the paper; called from AccStats (Fig. 5). | |
void | AccStats () |
Figure 5 of the paper. Outputs to gamma_ and L_. | |
Static Private Member Functions | |
static void | RemoveEps (std::vector< int32 > *vec) |
Removes epsilons (symbol 0) from a vector. | |
static void | NormalizeEps (std::vector< int32 > *vec) |
static BaseFloat | delta () |
static void | AddToMap (int32 i, double d, std::map< int32, double > *gamma) |
Function used to increment map. | |
Private Attributes | |
MinimumBayesRiskOptions | opts_ |
std::vector< Arc > | arcs_ |
Arcs in the topologically sorted acceptor form of the word-level lattice, with one final-state. | |
std::vector< std::vector< int32 > > | pre_ |
For each node in the lattice, a list of arcs entering that node. | |
std::vector< int32 > | state_times_ |
std::vector< int32 > | R_ |
double | L_ |
std::vector< std::vector< std::pair< int32, BaseFloat > > > | gamma_ |
std::vector< std::vector< std::pair< BaseFloat, BaseFloat > > > | times_ |
std::vector< std::pair< BaseFloat, BaseFloat > > | sausage_times_ |
std::vector< std::pair< BaseFloat, BaseFloat > > | one_best_times_ |
std::vector< BaseFloat > | one_best_confidences_ |
This class does the word-level Minimum Bayes Risk computation, and gives you either the 1-best MBR output together with the expected Bayes Risk, or a sausage-like structure.
Definition at line 77 of file sausages.h.
MinimumBayesRisk (const CompactLattice &clat, MinimumBayesRiskOptions opts=MinimumBayesRiskOptions())
Initialize with a compact lattice; any acoustic scaling etc. is assumed to have been done already.
This does the whole computation. You get the output with GetOneBest(), GetBayesRisk(), and GetSausageStats().
Definition at line 369 of file sausages.cc.
References fst::ConvertLattice(), fst::GetLinearSymbolSequence(), KALDI_ASSERT, MinimumBayesRisk::L_, MinimumBayesRisk::MbrDecode(), MinimumBayesRisk::PrepareLatticeAndInitStats(), MinimumBayesRisk::R_, fst::RemoveAlignmentsFromCompactLattice(), and words.
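As a hedged illustration of consuming the output, the sketch below walks a structure shaped like the return value of GetSausageStats() and keeps the highest-scoring entry of each bin. It does not call Kaldi. The `int32` and `BaseFloat` aliases, the helper name `TopWordPerBin`, and the reading of each pair as (word-id, score) with symbol 0 as epsilon are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Stand-ins for the Kaldi typedefs (assumed here: 32-bit int and float).
typedef std::int32_t int32;
typedef float BaseFloat;

// Same shape as the return type of GetSausageStats().
typedef std::vector<std::vector<std::pair<int32, BaseFloat> > > SausageStats;

// Hypothetical helper: for each bin, find the entry with the largest score
// and keep its word, skipping bins whose best entry is epsilon (symbol 0).
std::vector<int32> TopWordPerBin(const SausageStats &stats) {
  std::vector<int32> words;
  for (const auto &bin : stats) {
    if (bin.empty()) continue;
    std::pair<int32, BaseFloat> best = bin[0];
    for (const auto &entry : bin)
      if (entry.second > best.second) best = entry;
    if (best.first != 0) words.push_back(best.first);
  }
  return words;
}
```

With real data, the argument would be the object returned by GetSausageStats() after the constructor has run.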
MinimumBayesRisk (const CompactLattice &clat, const std::vector< int32 > &words, MinimumBayesRiskOptions opts=MinimumBayesRiskOptions())
Definition at line 403 of file sausages.cc.
References MinimumBayesRisk::L_, MinimumBayesRisk::MbrDecode(), MinimumBayesRisk::PrepareLatticeAndInitStats(), MinimumBayesRisk::R_, and words.
MinimumBayesRisk (const CompactLattice &clat, const std::vector< int32 > &words, const std::vector< std::pair< BaseFloat, BaseFloat > > &times, MinimumBayesRiskOptions opts=MinimumBayesRiskOptions())
Definition at line 416 of file sausages.cc.
References MinimumBayesRisk::L_, MinimumBayesRisk::MbrDecode(), MinimumBayesRisk::PrepareLatticeAndInitStats(), MinimumBayesRisk::R_, MinimumBayesRisk::sausage_times_, and words.
void AccStats () |
private |
Figure 5 of the paper. Outputs to gamma_ and L_.
Definition at line 170 of file sausages.cc.
References MinimumBayesRisk::AddToMap(), MinimumBayesRisk::arcs_, MinimumBayesRisk::EditDistance(), kaldi::Exp(), MinimumBayesRisk::gamma_, rnnlm::i, KALDI_ERR, KALDI_VLOG, KALDI_WARN, MinimumBayesRisk::l(), MinimumBayesRisk::L_, MinimumBayesRisk::Arc::loglike, rnnlm::n, MinimumBayesRisk::pre_, MinimumBayesRisk::r(), MinimumBayesRisk::R_, MinimumBayesRisk::sausage_times_, VectorBase< Real >::SetZero(), MinimumBayesRisk::Arc::start_node, MinimumBayesRisk::state_times_, MinimumBayesRisk::times_, and MinimumBayesRisk::Arc::word.
Referenced by MinimumBayesRisk::MbrDecode().
Function used to increment map.
Definition at line 192 of file sausages.h.
References rnnlm::d.
Referenced by MinimumBayesRisk::AccStats().
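A minimal sketch of what "incrementing a map" can look like for the signature above. The name `AddToMapSketch` is hypothetical, and skipping zero increments is an assumption made here to keep the map sparse; it is not confirmed by this page.

```cpp
#include <cstdint>
#include <map>

typedef std::int32_t int32;  // stand-in for the Kaldi typedef

// Sketch: add d to the entry for key i, creating the entry if it is missing.
// Assumption: a zero increment is skipped so that no empty entry is created.
void AddToMapSketch(int32 i, double d, std::map<int32, double> *gamma) {
  if (d == 0.0) return;
  (*gamma)[i] += d;  // operator[] value-initializes a missing entry to 0.0
}
```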
static BaseFloat delta () |
inline static private |
Definition at line 188 of file sausages.h.
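double EditDistance (int32 N, int32 Q, Vector< double > &alpha, Matrix< double > &alpha_dash, Vector< double > &alpha_dash_arc) |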
private |
Figure 4 of the paper; called from AccStats (Fig. 5)
Definition at line 130 of file sausages.cc.
References MinimumBayesRisk::arcs_, kaldi::Exp(), rnnlm::i, kaldi::kLogZeroDouble, MinimumBayesRisk::l(), kaldi::LogAdd(), MinimumBayesRisk::Arc::loglike, rnnlm::n, MinimumBayesRisk::pre_, MinimumBayesRisk::r(), MinimumBayesRisk::Arc::start_node, and MinimumBayesRisk::Arc::word.
Referenced by MinimumBayesRisk::AccStats().
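The lattice recursion itself is not reproduced on this page. As a simplified, hedged sketch, the code below computes the same kind of quantity for a single hypothesis word sequence instead of a lattice, which reduces to the ordinary Levenshtein distance. It is not Figure 4 of the paper; the names `Loss` and `LinearEditDistance` are hypothetical, and symbol 0 is assumed to be epsilon.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::int32_t int32;  // stand-in for the Kaldi typedef

// Unit-cost loss: 0 if the symbols match, 1 otherwise. Symbol 0 is epsilon,
// so Loss(w, 0) and Loss(0, w) are the costs of an unmatched word.
static double Loss(int32 a, int32 b) { return a == b ? 0.0 : 1.0; }

// Edit distance between a hypothesis and a reference word sequence.
// alpha[n][q] is the cheapest cost of aligning the first n hypothesis
// words with the first q reference words.
double LinearEditDistance(const std::vector<int32> &hyp,
                          const std::vector<int32> &ref) {
  const std::size_t N = hyp.size(), Q = ref.size();
  std::vector<std::vector<double> > alpha(N + 1,
                                          std::vector<double>(Q + 1, 0.0));
  for (std::size_t q = 1; q <= Q; q++)
    alpha[0][q] = alpha[0][q - 1] + Loss(0, ref[q - 1]);
  for (std::size_t n = 1; n <= N; n++) {
    alpha[n][0] = alpha[n - 1][0] + Loss(hyp[n - 1], 0);
    for (std::size_t q = 1; q <= Q; q++) {
      // Hypothesis word aligned with reference word (match or substitution).
      double sub = alpha[n - 1][q - 1] + Loss(hyp[n - 1], ref[q - 1]);
      // Hypothesis word aligned with epsilon (extra hypothesis word).
      double ins = alpha[n - 1][q] + Loss(hyp[n - 1], 0);
      // Epsilon aligned with reference word (missing hypothesis word).
      double del = alpha[n][q - 1] + Loss(0, ref[q - 1]);
      alpha[n][q] = std::min(sub, std::min(ins, del));
    }
  }
  return alpha[N][Q];
}
```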
BaseFloat GetBayesRisk () const |
inline |
Returns the expected WER over this sentence (assuming model correctness).
Definition at line 137 of file sausages.h.
Referenced by main().
|
inline |
const std::vector< BaseFloat > & GetOneBestConfidences () const |
inline |
Outputs the confidences for the one-best transcript.
Definition at line 132 of file sausages.h.
Referenced by main().
Definition at line 122 of file sausages.h.
Referenced by main().
Definition at line 139 of file sausages.h.
Referenced by main().
Definition at line 114 of file sausages.h.
Referenced by main().
Definition at line 108 of file sausages.h.
Without the 'penalize' argument this gives us the basic edit-distance function l(a,b), as in the paper.
With the 'penalize' argument it can be interpreted as the edit distance plus the 'delta' from the paper, except that we make a kind of conceptual bug-fix and apply the delta only if the edit distance was not already zero. This bug-fix was necessary so that all the stats that should show up actually do show up, and applying it makes the sausage stats significantly less sparse.
Definition at line 157 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), and MinimumBayesRisk::EditDistance().
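The behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not a copy of the header: the names `LossSketch` and `DeltaSketch` are hypothetical, and the numeric value of the penalty is assumed.

```cpp
#include <cstdint>

typedef std::int32_t int32;  // stand-in for the Kaldi typedef

// Small positive penalty. The value used here is an assumption.
static double DeltaSketch() { return 1.0e-05; }

// Edit-distance function l(a, b): 0 for identical symbols, 1 otherwise.
// With penalize == true, the delta is added only when the distance is
// already nonzero, matching the "conceptual bug-fix" described above.
double LossSketch(int32 a, int32 b, bool penalize = false) {
  if (a == b) return 0.0;
  return penalize ? 1.0 + DeltaSketch() : 1.0;
}
```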
void MbrDecode () |
private |
Minimum-Bayes-Risk Decode. Top-level algorithm. Figure 6 of the paper.
Definition at line 28 of file sausages.cc.
References MinimumBayesRisk::AccStats(), MinimumBayesRiskOptions::decode_mbr, MinimumBayesRisk::gamma_, rnnlm::i, rnnlm::j, KALDI_VLOG, KALDI_WARN, MinimumBayesRisk::NormalizeEps(), MinimumBayesRisk::one_best_confidences_, MinimumBayesRisk::one_best_times_, MinimumBayesRisk::opts_, MinimumBayesRiskOptions::print_silence, MinimumBayesRisk::R_, MinimumBayesRisk::RemoveEps(), and MinimumBayesRisk::times_.
Referenced by MinimumBayesRisk::MinimumBayesRisk().
static void NormalizeEps (std::vector< int32 > *vec) |
static private |
Definition at line 119 of file sausages.cc.
References rnnlm::i, and MinimumBayesRisk::RemoveEps().
Referenced by MinimumBayesRisk::MbrDecode().
void PrepareLatticeAndInitStats (CompactLattice *clat) |
private |
Definition at line 320 of file sausages.cc.
References MinimumBayesRisk::arcs_, kaldi::CompactLatticeStateTimes(), fst::CreateSuperFinal(), MinimumBayesRisk::Arc::end_node, rnnlm::i, KALDI_ASSERT, KALDI_ERR, MinimumBayesRisk::Arc::loglike, rnnlm::n, MinimumBayesRisk::pre_, MinimumBayesRisk::Arc::start_node, MinimumBayesRisk::state_times_, and MinimumBayesRisk::Arc::word.
Referenced by MinimumBayesRisk::MinimumBayesRisk().
Returns r_q, in one-based indexing, as in the paper.
Definition at line 163 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), and MinimumBayesRisk::EditDistance().
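One-based indexing over a zero-based container amounts to an offset of one. A minimal sketch, assuming the reference sequence is held in a std::vector (the helper name `RefWord` and the explicit vector parameter are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::int32_t int32;  // stand-in for the Kaldi typedef

// Sketch: return the q-th reference word, where q runs from 1 to R.size()
// as in the paper, while the vector itself is indexed from 0.
int32 RefWord(const std::vector<int32> &R, int32 q) {
  assert(q >= 1 && static_cast<std::size_t>(q) <= R.size());
  return R[q - 1];
}
```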
static void RemoveEps (std::vector< int32 > *vec) |
static private |
Removes epsilons (symbol 0) from a vector.
Definition at line 112 of file sausages.cc.
Referenced by MinimumBayesRisk::MbrDecode(), and MinimumBayesRisk::NormalizeEps().
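Removing every occurrence of symbol 0 from a vector can be done with the standard erase-remove idiom. The sketch below shows that idiom under a hypothetical name, `RemoveEpsSketch`; it is not taken from sausages.cc.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

typedef std::int32_t int32;  // stand-in for the Kaldi typedef

// Sketch: delete all epsilons (symbol 0) in place, keeping the order of
// the remaining words.
void RemoveEpsSketch(std::vector<int32> *vec) {
  vec->erase(std::remove(vec->begin(), vec->end(), 0), vec->end());
}
```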
std::vector< Arc > arcs_ |
private |
Arcs in the topologically sorted acceptor form of the word-level lattice, with one final-state.
Contains (word-symbol, log-likelihood on arc == negated cost). Indexed from zero.
Definition at line 213 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), MinimumBayesRisk::EditDistance(), and MinimumBayesRisk::PrepareLatticeAndInitStats().
Definition at line 229 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), and MinimumBayesRisk::MbrDecode().
|
private |
Definition at line 226 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), and MinimumBayesRisk::MinimumBayesRisk().
|
private |
Definition at line 250 of file sausages.h.
Referenced by MinimumBayesRisk::MbrDecode().
Definition at line 245 of file sausages.h.
Referenced by MinimumBayesRisk::MbrDecode().
|
private |
Definition at line 207 of file sausages.h.
Referenced by MinimumBayesRisk::MbrDecode().
std::vector< std::vector< int32 > > pre_ |
private |
For each node in the lattice, a list of arcs entering that node.
Indexed from 1 (first node == 1).
Definition at line 217 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), MinimumBayesRisk::EditDistance(), and MinimumBayesRisk::PrepareLatticeAndInitStats().
|
private |
Definition at line 222 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), MinimumBayesRisk::MbrDecode(), and MinimumBayesRisk::MinimumBayesRisk().
Definition at line 240 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), and MinimumBayesRisk::MinimumBayesRisk().
|
private |
Definition at line 219 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), and MinimumBayesRisk::PrepareLatticeAndInitStats().
Definition at line 235 of file sausages.h.
Referenced by MinimumBayesRisk::AccStats(), and MinimumBayesRisk::MbrDecode().