gmm-init-model-flat.cc File Reference

Namespaces

 kaldi
 This code computes Goodness of Pronunciation (GOP) and extracts phone-level pronunciation features for mispronunciation detection tasks; the reference:
 

Functions

void GetFeatureMeanAndVariance (const std::string &feat_rspecifier, Vector< BaseFloat > *inv_var_out, Vector< BaseFloat > *mean_out)
 
int main (int argc, char *argv[])
 

Function Documentation

main()

int main (int argc, char *argv[])

Definition at line 70 of file gmm-init-model-flat.cc.

References AmDiagGmm::AddPdf(), DiagGmm::ComputeGconsts(), VectorBase< Real >::Dim(), ParseOptions::GetArg(), kaldi::GetFeatureMeanAndVariance(), ParseOptions::GetOptArg(), KALDI_LOG, ParseOptions::NumArgs(), ContextDependency::NumPdfs(), ParseOptions::PrintUsage(), ParseOptions::Read(), kaldi::ReadKaldiObject(), ParseOptions::Register(), DiagGmm::Resize(), Vector< Real >::Resize(), MatrixBase< Real >::Row(), VectorBase< Real >::Set(), DiagGmm::SetInvVarsAndMeans(), DiagGmm::SetWeights(), Output::Stream(), AmDiagGmm::Write(), and TransitionModel::Write().

int main(int argc, char *argv[]) {
  using namespace kaldi;
  try {
    using namespace kaldi;
    typedef kaldi::int32 int32;

    const char *usage =
        "Initialize GMM, with Gaussians initialized to mean and variance\n"
        "of some provided example data (or to 0,1 if not provided: in that\n"
        "case, provide --dim option)\n"
        "Usage:  gmm-init-model-flat [options] <tree-in> <topo-file> <model-out> [<features-rspecifier>]\n"
        "e.g.: \n"
        "  gmm-init-model-flat tree topo 1.mdl ark:feats.scp\n";

    bool binary = true;
    int32 dim = 40;

    ParseOptions po(usage);
    po.Register("binary", &binary, "Write output in binary mode");
    po.Register("dim", &dim, "Dimension of model (this matters only if not providing features).");

    po.Read(argc, argv);

    if (po.NumArgs() < 3 || po.NumArgs() > 4) {
      po.PrintUsage();
      exit(1);
    }

    std::string
        tree_filename = po.GetArg(1),
        topo_filename = po.GetArg(2),
        model_out_filename = po.GetArg(3),
        feats_rspecifier = po.GetOptArg(4);

    ContextDependency ctx_dep;
    ReadKaldiObject(tree_filename, &ctx_dep);

    HmmTopology topo;
    ReadKaldiObject(topo_filename, &topo);

    Vector<BaseFloat> global_inverse_var, global_mean;
    if (po.NumArgs() == 4) {
      GetFeatureMeanAndVariance(feats_rspecifier,
                                &global_inverse_var,
                                &global_mean);
      dim = global_mean.Dim();
    } else {
      global_inverse_var.Resize(dim);
      global_inverse_var.Set(1.0);
      global_mean.Resize(dim);  // leave it at zero.
    }

    int32 num_pdfs = ctx_dep.NumPdfs();

    AmDiagGmm am_gmm;
    DiagGmm gmm;
    gmm.Resize(1, dim);
    {  // Initialize the gmm.
      Matrix<BaseFloat> inv_var(1, dim);
      inv_var.Row(0).CopyFromVec(global_inverse_var);
      Matrix<BaseFloat> mu(1, dim);
      mu.Row(0).CopyFromVec(global_mean);
      Vector<BaseFloat> weights(1);
      weights.Set(1.0);
      gmm.SetInvVarsAndMeans(inv_var, mu);
      gmm.SetWeights(weights);
      gmm.ComputeGconsts();
    }
    for (int i = 0; i < num_pdfs; i++)
      am_gmm.AddPdf(gmm);

    TransitionModel trans_model(ctx_dep, topo);

    {
      Output ko(model_out_filename, binary);
      trans_model.Write(ko.Stream(), binary);
      am_gmm.Write(ko.Stream(), binary);
    }
    KALDI_LOG << "Wrote model.";
  } catch(const std::exception &e) {
    std::cerr << e.what();
    return -1;
  }
}

Referenced symbols

void AmDiagGmm::AddPdf(const DiagGmm &gmm)
    Adds a GMM to the model, and increments the total number of PDFs.
    Definition: am-diag-gmm.cc:57

void DiagGmm::SetInvVarsAndMeans(const MatrixBase< Real > &invvars, const MatrixBase< Real > &means)
    Use SetInvVarsAndMeans if updating both means and (inverse) variances.
    Definition: diag-gmm-inl.h:63

class HmmTopology
    A class for storing topology information for phones.
    Definition: hmm-topology.h:93

void DiagGmm::Resize(int32 nMix, int32 dim)
    Resizes arrays to this dim. Does not initialize data.
    Definition: diag-gmm.cc:66

int32 DiagGmm::ComputeGconsts()
    Sets the gconsts.
    Definition: diag-gmm.cc:114

void Vector< Real >::Resize(MatrixIndexT length, MatrixResizeType resize_type=kSetZero)
    Set vector to a specified size (can be zero).

void kaldi::ReadKaldiObject(const std::string &filename, Matrix< float > *m)
    Definition: kaldi-io.cc:832

virtual int32 ContextDependency::NumPdfs() const
    NumPdfs() returns the number of acoustic pdfs (they are numbered 0.. NumPdfs()-1).
    Definition: context-dep.h:71

class ParseOptions
    The class ParseOptions is for parsing command-line options; see Parsing command-line options for more...
    Definition: parse-options.h:36

MatrixIndexT VectorBase< Real >::Dim() const
    Returns the dimension of the vector.
    Definition: kaldi-vector.h:64

class Vector< Real >
    A class representing a vector.
    Definition: kaldi-vector.h:406

void VectorBase< Real >::Set(Real f)
    Set all members of a vector to a specified value.

void AmDiagGmm::Write(std::ostream &out_stream, bool binary) const
    Definition: am-diag-gmm.cc:163

class DiagGmm
    Definition for Gaussian Mixture Model with diagonal covariances.
    Definition: diag-gmm.h:42

void DiagGmm::SetWeights(const VectorBase< Real > &w)
    Mutators for both float or double.
    Definition: diag-gmm-inl.h:28

#define KALDI_LOG
    Definition: kaldi-error.h:153

void kaldi::GetFeatureMeanAndVariance(const std::string &feat_rspecifier, Vector< BaseFloat > *inv_var_out, Vector< BaseFloat > *mean_out)