gmm-init-model.cc File Reference

Namespaces

 kaldi
 This code computes Goodness of Pronunciation (GOP) and extracts phone-level pronunciation features for mispronunciation detection tasks, the reference:
 

Functions

void InitAmGmm (const BuildTreeStatsType &stats, const EventMap &to_pdf_map, AmDiagGmm *am_gmm, const TransitionModel &trans_model, BaseFloat var_floor)
 InitAmGmm initializes the GMM with one Gaussian per state.
 
void GetOccs (const BuildTreeStatsType &stats, const EventMap &to_pdf_map, Vector< BaseFloat > *occs)
 Get state occupation counts.
 
void InitAmGmmFromOld (const BuildTreeStatsType &stats, const EventMap &to_pdf_map, int32 N, int32 P, const std::string &old_tree_rxfilename, const std::string &old_model_rxfilename, BaseFloat var_floor, AmDiagGmm *am_gmm)
 InitAmGmmFromOld initializes the GMM based on a previously trained model and tree, which must require no more phonetic context than the current tree.
 
int main (int argc, char *argv[])
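
The idea behind "one Gaussian per state" with a variance floor can be illustrated with a small standalone sketch. It is not Kaldi's implementation and uses no Kaldi types: DiagStats, DiagGaussian, and EstimateFloored are hypothetical names introduced here, and the sketch assumes the statistics for a state are an occupation count plus per-dimension sums of the features and of their squares.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one state's sufficient statistics (not a Kaldi type).
struct DiagStats {
  double count;                // total occupation count
  std::vector<double> x_sum;   // per-dimension sum of features
  std::vector<double> x2_sum;  // per-dimension sum of squared features
};

// Hypothetical diagonal-covariance Gaussian (not a Kaldi type).
struct DiagGaussian {
  std::vector<double> mean;
  std::vector<double> var;
};

// Estimate a single diagonal Gaussian from the statistics,
// clamping every variance so that it is at least var_floor.
DiagGaussian EstimateFloored(const DiagStats &s, double var_floor) {
  assert(s.count > 0.0 && s.x_sum.size() == s.x2_sum.size());
  DiagGaussian g;
  g.mean.resize(s.x_sum.size());
  g.var.resize(s.x_sum.size());
  for (std::size_t d = 0; d < s.x_sum.size(); ++d) {
    g.mean[d] = s.x_sum[d] / s.count;
    double v = s.x2_sum[d] / s.count - g.mean[d] * g.mean[d];
    g.var[d] = std::max(v, var_floor);
  }
  return g;
}
```

In this sketch the floor plays the role of the var_floor argument: a dimension whose estimated variance would be zero or very small is raised to the floor instead.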
 

Function Documentation

◆ main()

int main ( int  argc,
char *  argv[] 
)

Definition at line 220 of file gmm-init-model.cc.

References ContextDependency::CentralPosition(), ContextDependency::ContextWidth(), kaldi::DeleteBuildTreeStats(), ParseOptions::GetArg(), kaldi::GetOccs(), ParseOptions::GetOptArg(), kaldi::InitAmGmm(), kaldi::InitAmGmmFromOld(), KALDI_LOG, ParseOptions::NumArgs(), ParseOptions::PrintUsage(), ParseOptions::Read(), kaldi::ReadBuildTreeStats(), kaldi::ReadKaldiObject(), ParseOptions::Register(), Output::Stream(), Input::Stream(), ContextDependency::ToPdfMap(), AmDiagGmm::Write(), TransitionModel::Write(), and VectorBase< Real >::Write().

{
  using namespace kaldi;
  try {
    using namespace kaldi;
    typedef kaldi::int32 int32;

    const char *usage =
        "Initialize GMM from decision tree and tree stats\n"
        "Usage: gmm-init-model [options] <tree-in> <tree-stats-in> <topo-file> <model-out> [<old-tree> <old-model>]\n"
        "e.g.: \n"
        " gmm-init-model tree treeacc topo 1.mdl\n"
        "or (initializing GMMs with old model):\n"
        " gmm-init-model tree treeacc topo 1.mdl prev/tree prev/30.mdl\n";

    bool binary = true;
    double var_floor = 0.01;
    std::string occs_out_filename;


    ParseOptions po(usage);
    po.Register("binary", &binary, "Write output in binary mode");
    po.Register("write-occs", &occs_out_filename, "File to write state "
                "occupancies to.");
    po.Register("var-floor", &var_floor, "Variance floor used while "
                "initializing Gaussians");

    po.Read(argc, argv);

    if (po.NumArgs() != 4 && po.NumArgs() != 6) {
      po.PrintUsage();
      exit(1);
    }

    std::string
        tree_filename = po.GetArg(1),
        stats_filename = po.GetArg(2),
        topo_filename = po.GetArg(3),
        model_out_filename = po.GetArg(4),
        old_tree_filename = po.GetOptArg(5),
        old_model_filename = po.GetOptArg(6);

    ContextDependency ctx_dep;
    ReadKaldiObject(tree_filename, &ctx_dep);

    BuildTreeStatsType stats;
    {
      bool binary_in;
      GaussClusterable gc;  // dummy needed to provide type.
      Input ki(stats_filename, &binary_in);
      ReadBuildTreeStats(ki.Stream(), binary_in, gc, &stats);
    }
    KALDI_LOG << "Number of separate statistics is " << stats.size();

    HmmTopology topo;
    ReadKaldiObject(topo_filename, &topo);

    const EventMap &to_pdf = ctx_dep.ToPdfMap();  // not owned here.

    TransitionModel trans_model(ctx_dep, topo);

    // Now, the summed_stats will be used to initialize the GMM.
    AmDiagGmm am_gmm;
    if (old_tree_filename.empty())
      InitAmGmm(stats, to_pdf, &am_gmm, trans_model, var_floor);  // Normal case: initialize 1 Gauss/model from tree stats.
    else {
      InitAmGmmFromOld(stats, to_pdf,
                       ctx_dep.ContextWidth(),
                       ctx_dep.CentralPosition(),
                       old_tree_filename,
                       old_model_filename,
                       var_floor,
                       &am_gmm);
    }

    if (!occs_out_filename.empty()) {  // write state occs
      Vector<BaseFloat> occs;
      GetOccs(stats, to_pdf, &occs);
      Output ko(occs_out_filename, binary);
      occs.Write(ko.Stream(), binary);
    }

    {
      Output ko(model_out_filename, binary);
      trans_model.Write(ko.Stream(), binary);
      am_gmm.Write(ko.Stream(), binary);
    }
    KALDI_LOG << "Wrote model.";

    DeleteBuildTreeStats(&stats);
  } catch(const std::exception &e) {
    std::cerr << e.what();
    return -1;
  }
}
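
When --write-occs is given, main() writes one occupation count per pdf. As a rough illustration of how such counts could be accumulated, the standalone sketch below sums counts per pdf-id. It is an assumption-laden simplification, not the GetOccs implementation: the real function takes a BuildTreeStatsType and an EventMap, whereas PdfCount and AccumulateOccs are hypothetical names, and the statistics are assumed to have already been mapped to (pdf-id, count) pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for tree statistics: (pdf-id, occupation count) pairs.
using PdfCount = std::pair<int, double>;

// Sum the counts belonging to each pdf-id.
// Pdf-ids with no statistics keep a count of zero; out-of-range ids are skipped.
std::vector<double> AccumulateOccs(const std::vector<PdfCount> &stats,
                                   int num_pdfs) {
  assert(num_pdfs >= 0);
  std::vector<double> occs(static_cast<std::size_t>(num_pdfs), 0.0);
  for (const PdfCount &s : stats) {
    if (s.first >= 0 && s.first < num_pdfs)
      occs[static_cast<std::size_t>(s.first)] += s.second;
  }
  return occs;
}
```

The result has one entry per pdf, indexed by pdf-id, which matches the shape of the vector that main() writes out in the listing above.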