Accumulate tree statistics for decision tree training.
The program reads a feature archive and the corresponding alignments, and generates the sufficient statistics for decision tree creation. The context width and central phone position are used to identify the contexts. The transition model is used as an input to identify the PDFs and the phones.
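To make the role of the context width and central position concrete, here is a minimal, self-contained sketch. `ExtractContextWindows` is a hypothetical helper, not a Kaldi function, and padding out-of-range positions with 0 is an assumption of this sketch. It works on a plain phone sequence and ignores the transition model and PDF classes entirely.

```cpp
#include <vector>

// Hypothetical helper, not part of Kaldi: returns one context window per
// phone in `phones`. Each window has `context_width` entries, and the phone
// itself sits at index `central_position`. Positions that fall outside the
// utterance are filled with 0 (an assumption of this sketch).
std::vector<std::vector<int>> ExtractContextWindows(
    const std::vector<int> &phones, int context_width, int central_position) {
  std::vector<std::vector<int>> windows;
  const int num_phones = static_cast<int>(phones.size());
  for (int i = 0; i < num_phones; ++i) {
    std::vector<int> window(context_width, 0);
    for (int j = 0; j < context_width; ++j) {
      int pos = i - central_position + j;  // position in the utterance
      if (pos >= 0 && pos < num_phones) window[j] = phones[pos];
    }
    windows.push_back(window);
  }
  return windows;
}
```

With phones {5, 7, 9}, a context width of 3, and a central position of 1, the windows are {0, 5, 7}, {5, 7, 9}, and {7, 9, 0}.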
using namespace kaldi;
// ...
"Accumulate statistics for phonetic-context tree building.\n"
"Usage: acc-tree-stats [options] <model-in> <features-rspecifier> <alignments-rspecifier> <tree-accs-out>\n"
" acc-tree-stats 1.mdl scp:train.scp ark:1.ali 1.tacc\n";
// ...
po.Register("binary", &binary, "Write output in binary mode");
// ...
if (po.NumArgs() != 4) {
  // ...
std::string model_filename = po.GetArg(1),
    feature_rspecifier = po.GetArg(2),
    alignment_rspecifier = po.GetArg(3),
    accs_out_wxfilename = po.GetOptArg(4);
// ...
Input ki(model_filename, &binary);
trans_model.Read(ki.Stream(), binary);
// ...
std::map<EventType, GaussClusterable*> tree_stats;
int num_done = 0, num_no_alignment = 0, num_other_error = 0;
for (; !feature_reader.Done(); feature_reader.Next()) {
  std::string key = feature_reader.Key();
  if (!alignment_reader.HasKey(key)) {
    // ...
    const std::vector<int32> &alignment = alignment_reader.Value(key);
    if (alignment.size() != mat.NumRows()) {
      KALDI_WARN << "Alignments has wrong size " << (alignment.size())
                 << " vs. " << (mat.NumRows());
      // ...
    if (num_done % 1000 == 0)
      KALDI_LOG << "Processed " << num_done << " utterances.";
// ...
for (std::map<EventType, GaussClusterable*>::const_iterator iter = tree_stats.begin();
     iter != tree_stats.end();
     // ...
  stats.push_back(std::make_pair(iter->first, iter->second));
// ...
Output ko(accs_out_wxfilename, binary);
// ...
KALDI_LOG << "Accumulated stats for " << num_done << " files, "
          << num_no_alignment << " failed due to no alignment, "
          << num_other_error << " failed for other reasons.";
KALDI_LOG << "Number of separate stats (context-dependent states) is "
// ...
if (num_done != 0) return 0;
} catch(const std::exception &e) {
  std::cerr << e.what();
  // ...
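The per-utterance bookkeeping in the listing can be sketched without Kaldi as follows. Everything here is a hypothetical stand-in: plain `std::map` containers replace the table readers, a frame count replaces the feature matrix, and the names `LoopCounts`, `CountUsableUtterances`, and `ExitStatus` do not exist in Kaldi. The listing shows only `return 0` for the success case, so the non-zero failure value below is an assumption.

```cpp
#include <map>
#include <string>
#include <vector>

// Counters with the same names as in the listing.
struct LoopCounts {
  int num_done = 0;
  int num_no_alignment = 0;
  int num_other_error = 0;
};

// `num_frames` maps an utterance key to its number of feature frames;
// `alignments` maps a key to one integer per frame. Both are stand-ins
// for the sequential feature reader and the random-access alignment reader.
LoopCounts CountUsableUtterances(
    const std::map<std::string, int> &num_frames,
    const std::map<std::string, std::vector<int>> &alignments) {
  LoopCounts counts;
  for (const auto &feat : num_frames) {
    auto ali = alignments.find(feat.first);
    if (ali == alignments.end()) {
      ++counts.num_no_alignment;  // no alignment for this utterance
      continue;
    }
    if (static_cast<int>(ali->second.size()) != feat.second) {
      ++counts.num_other_error;  // alignment length differs from frame count
      continue;
    }
    ++counts.num_done;  // statistics would be accumulated here
  }
  return counts;
}

// Success only if at least one utterance was processed. The value 1 for
// the failure case is an assumption of this sketch.
int ExitStatus(const LoopCounts &counts) {
  return counts.num_done != 0 ? 0 : 1;
}
```

Looking up each alignment by key means the alignment archive does not need to be in the same order as the features, and the sketch keeps that property.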
void Register(OptionsItf *opts)
void AccumulateTreeStats(const TransitionModel &trans_model, const AccumulateTreeStatsInfo &info, const std::vector< int32 > &alignment, const Matrix< BaseFloat > &features, std::map< EventType, GaussClusterable *> *stats)
Accumulates the stats needed for training context-dependency trees (in the "normal" way)...
void DeleteBuildTreeStats(BuildTreeStatsType *stats)
This frees the Clusterable* pointers in "stats", where non-NULL, and sets them to NULL...
Allows random access to a collection of objects in an archive or script file; see The Table concept...
The class ParseOptions is for parsing command-line options; see Parsing command-line options for more...
void Read(std::istream &is, bool binary)
A templated class for reading objects sequentially from an archive or script file; see The Table conc...
MatrixIndexT NumRows() const
Returns number of rows (or zero for empty matrix).
std::vector< std::pair< EventType, Clusterable * > > BuildTreeStatsType
void WriteBuildTreeStats(std::ostream &os, bool binary, const BuildTreeStatsType &stats)
Writes BuildTreeStats object. This works even if pointers are NULL.
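The map-to-vector step in the listing and the cleanup described for DeleteBuildTreeStats can be illustrated with a small standalone sketch. `DummyStats`, `FlattenStats`, and `DeleteStats` are hypothetical names, a string key stands in for `EventType`, and clearing the map after copying is a choice made in this sketch so that only the vector holds the pointers.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Clusterable statistics object.
struct DummyStats {
  double count;
};

using StatsMap = std::map<std::string, DummyStats *>;
using StatsVector = std::vector<std::pair<std::string, DummyStats *>>;

// Copies (key, pointer) pairs from the map into a vector, in key order,
// then clears the map so the pointers are reachable only through the vector.
StatsVector FlattenStats(StatsMap *stats_map) {
  StatsVector stats;
  stats.reserve(stats_map->size());
  for (const auto &entry : *stats_map)
    stats.push_back(std::make_pair(entry.first, entry.second));
  stats_map->clear();
  return stats;
}

// Frees every pointer and sets it to null. Deleting a null pointer is a
// no-op in C++, so null entries and repeated calls are both safe.
void DeleteStats(StatsVector *stats) {
  for (auto &entry : *stats) {
    delete entry.second;
    entry.second = nullptr;
  }
}
```

Resetting each pointer to null after deletion is what makes a second call harmless, which matches the behaviour described for DeleteBuildTreeStats.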