The implementation of the Minimum Bayes Risk (MBR) decoding method described in "Minimum Bayes Risk decoding and system combination based on a recursion for edit distance", Haihua Xu, Daniel Povey, Lidia Mangu and Jie Zhu, Computer Speech and Language, 2011. This is a slightly more principled way to do MBR decoding than the standard "Confusion Network" method.
#include <sausages.h>
Public Member Functions
MinimumBayesRiskOptions ()
void Register (OptionsItf *opts)
Public Attributes
bool decode_mbr
Boolean configuration parameter: if true, we actually update the hypothesis to do MBR decoding; if false, our output is the MAP decoded output, but we still output the stats (i.e. the confidences).
bool print_silence
Boolean configuration parameter: if true, the 1-best path will 'keep' the <eps> bins.
Note: MBR decoding aims to minimize the expected word error rate, assuming the lattice encodes the true uncertainty about what was spoken; standard Viterbi decoding gives the most likely utterance, which corresponds to minimizing the expected sentence error rate.
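To make the contrast in the note concrete, the two decision rules can be written side by side. The notation here is an assumption made for illustration and does not come from sausages.h: O is the acoustic observation, W and W' range over the word sequences encoded in the lattice, P(W | O) is the lattice posterior, and L(W, W') is the word-level edit (Levenshtein) distance.

```latex
\hat{W}_{\mathrm{MAP}}
  = \arg\max_{W} P(W \mid O)
  = \arg\min_{W} \sum_{W'} P(W' \mid O)\, \mathbb{1}[W \neq W']
\qquad
\hat{W}_{\mathrm{MBR}}
  = \arg\min_{W} \sum_{W'} P(W' \mid O)\, L(W, W')
```

The first sum equals 1 - P(W | O), the expected sentence error, so minimizing it is the same as picking the most likely utterance; the second sum is the expected number of word errors.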
In addition to giving the MBR output, we also provide a way to get a "Confusion Network" or informally "sausage"-like structure. This is a linear sequence of bins, and in each bin, there is a distribution over words (or epsilon, meaning no word). This is useful for estimating confidence. Note: due to the way these sausages are made, typically there will be, between each bin representing a high-confidence word, a bin in which epsilon (no word) is the most likely word. Inside these bins is where we put possible insertions.
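As a rough illustration of the sausage structure described above, the sketch below models it as a sequence of bins, each holding (word, posterior) entries, and reads a per-bin confidence off the top entry. All names (`WordId`, `Bin`, `Sausage`, `kEpsilon`, `BestEntry`, `BinConfidences`, `ExampleSausage`) are hypothetical, as is the choice of word id 0 for epsilon; this is not the declaration used in sausages.h.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not the declarations in sausages.h.
using WordId = int;
using Bin = std::vector<std::pair<WordId, double>>;  // (word, posterior) entries
using Sausage = std::vector<Bin>;                    // linear sequence of bins

constexpr WordId kEpsilon = 0;  // assumption: id 0 means "no word"

// Returns the most probable (word, posterior) entry of one bin.
std::pair<WordId, double> BestEntry(const Bin &bin) {
  assert(!bin.empty());
  std::pair<WordId, double> best = bin.front();
  for (const auto &entry : bin) {
    if (entry.second > best.second) best = entry;
  }
  return best;
}

// Returns, for each bin, the posterior of its top entry, used as a confidence.
std::vector<double> BinConfidences(const Sausage &sausage) {
  std::vector<double> confidences;
  confidences.reserve(sausage.size());
  for (const Bin &bin : sausage) confidences.push_back(BestEntry(bin).second);
  return confidences;
}

// Builds a toy sausage: epsilon-dominated bins alternate with word bins.
Sausage ExampleSausage() {
  return {
      {{kEpsilon, 0.9}, {7, 0.1}},    // possible insertion of word 7
      {{3, 0.8}, {4, 0.2}},           // high-confidence word 3
      {{kEpsilon, 0.97}, {8, 0.03}},  // possible insertion of word 8
      {{5, 0.6}, {kEpsilon, 0.4}},    // word 5, possibly a deletion
      {{kEpsilon, 0.95}, {9, 0.05}},  // possible insertion of word 9
  };
}
```

In the toy data, the bins at even positions are the epsilon-dominated "insertion" bins mentioned in the note, and the odd positions carry the words themselves.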
Definition at line 56 of file sausages.h.
MinimumBayesRiskOptions () [inline]
Definition at line 64 of file sausages.h.
void Register (OptionsItf *opts) [inline]
Definition at line 66 of file sausages.h.
References OptionsItf::Register().
Referenced by main().
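The registration pattern can be sketched with a toy stand-in, since the real OptionsItf is not reproduced on this page. Everything below is hypothetical: `ToyRegistry` is not Kaldi's OptionsItf, `ToyMbrOptions` only mirrors the two documented attributes, and the option names, help strings, and default values are assumptions chosen for illustration. The point is only that each field is registered by pointer, so a value parsed later is written straight into the struct.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for an options registry; NOT Kaldi's OptionsItf.
class ToyRegistry {
 public:
  // Remembers where the value of a boolean option should be written.
  void Register(const std::string &name, bool *ptr, const std::string &doc) {
    assert(ptr != nullptr);
    bools_[name] = ptr;
    docs_[name] = doc;
  }

  // Writes a parsed value through the registered pointer.
  // Returns false if no option with that name was registered.
  bool Set(const std::string &name, bool value) {
    auto it = bools_.find(name);
    if (it == bools_.end()) return false;
    *(it->second) = value;
    return true;
  }

 private:
  std::map<std::string, bool *> bools_;
  std::map<std::string, std::string> docs_;
};

// Hypothetical mirror of the options struct. The defaults and the option
// names are assumptions, not values confirmed by this page.
struct ToyMbrOptions {
  bool decode_mbr = true;
  bool print_silence = false;

  void Register(ToyRegistry *opts) {
    opts->Register("decode-mbr", &decode_mbr,
                   "Do MBR decoding (otherwise output the MAP result)");
    opts->Register("print-silence", &print_silence,
                   "Keep the <eps> bins in the 1-best output");
  }
};
```

A caller would construct the options struct, call `Register` once with the registry, and then let the registry set values by name, for example `registry.Set("print-silence", true)`.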
bool decode_mbr
Boolean configuration parameter: if true, we actually update the hypothesis to do MBR decoding; if false, our output is the MAP decoded output, but we still output the stats (i.e. the confidences).
Definition at line 60 of file sausages.h.
Referenced by MinimumBayesRisk::MbrDecode().
bool print_silence
Boolean configuration parameter: if true, the 1-best path will 'keep' the <eps> bins.
Definition at line 62 of file sausages.h.
Referenced by main(), and MinimumBayesRisk::MbrDecode().
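One plausible reading of this flag can be sketched as follows: when reading a 1-best word sequence off the bins, bins whose winner is epsilon are dropped unless the flag asks to keep them. The names `Bin`, `kEps`, and `OneBest`, and the use of word id 0 for `<eps>`, are hypothetical; the actual logic lives in MinimumBayesRisk::MbrDecode() and may differ in detail.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the types used in sausages.h.
using Bin = std::vector<std::pair<int, double>>;  // (word id, posterior)
constexpr int kEps = 0;                           // assumption: id 0 is <eps>

// Picks the top word of every bin. Bins won by epsilon are skipped unless
// keep_eps is true, in which case kEps appears in the output.
std::vector<int> OneBest(const std::vector<Bin> &bins, bool keep_eps) {
  std::vector<int> words;
  for (const Bin &bin : bins) {
    assert(!bin.empty());
    int best_word = bin.front().first;
    double best_prob = bin.front().second;
    for (const auto &entry : bin) {
      if (entry.second > best_prob) {
        best_word = entry.first;
        best_prob = entry.second;
      }
    }
    if (best_word != kEps || keep_eps) words.push_back(best_word);
  }
  return words;
}
```

With four bins whose winners are `<eps>`, 3, `<eps>`, 5, the call `OneBest(bins, false)` yields the words 3 and 5, while `OneBest(bins, true)` also keeps the two epsilon positions.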