logprob-to-post.cc File Reference
#include "base/kaldi-common.h"
#include "util/common-utils.h"
#include "gmm/am-diag-gmm.h"
#include "hmm/transition-model.h"
#include "hmm/hmm-utils.h"
#include "hmm/posterior.h"

Functions

int main (int argc, char *argv[])
 

Function Documentation

int main (int argc, char *argv[])

Definition at line 39 of file logprob-to-post.cc.

References VectorBase< Real >::Dim(), SequentialTableReader< Holder >::Done(), kaldi::Exp(), ParseOptions::GetArg(), KALDI_LOG, SequentialTableReader< Holder >::Key(), SequentialTableReader< Holder >::Next(), ParseOptions::NumArgs(), MatrixBase< Real >::NumRows(), ParseOptions::PrintUsage(), kaldi::RandUniform(), ParseOptions::Read(), ParseOptions::Register(), SequentialTableReader< Holder >::Value(), and TableWriter< Holder >::Write().

{
  using namespace kaldi;
  typedef kaldi::int32 int32;
  try {
    const char *usage =
        "Convert a matrix of log-probabilities (e.g. from nnet-logprob) to posteriors\n"
        "Usage: logprob-to-post [options] <logprob-matrix-rspecifier> <posteriors-wspecifier>\n"
        "e.g.:\n"
        " nnet-logprob [args] | logprob-to-post ark:- ark:1.post\n"
        "Caution: in this particular example, the output would be posteriors of pdf-ids,\n"
        "rather than transition-ids (c.f. post-to-pdf-post)\n";

    ParseOptions po(usage);

    BaseFloat min_post = 0.01;
    bool random_prune = true;  // preserve expectations.

    po.Register("min-post", &min_post, "Minimum posterior we will output (smaller "
                "ones are pruned). Also see --random-prune");
    po.Register("random-prune", &random_prune, "If true, prune posteriors with a "
                "randomized method that preserves expectations.");

    po.Read(argc, argv);

    if (po.NumArgs() != 2) {
      po.PrintUsage();
      exit(1);
    }

    std::string logprob_rspecifier = po.GetArg(1);
    std::string posteriors_wspecifier = po.GetArg(2);

    int32 num_done = 0;
    SequentialBaseFloatMatrixReader logprob_reader(logprob_rspecifier);
    PosteriorWriter posterior_writer(posteriors_wspecifier);

    for (; !logprob_reader.Done(); logprob_reader.Next()) {
      num_done++;
      const Matrix<BaseFloat> &logprobs = logprob_reader.Value();
      // Posterior is vector<vector<pair<int32, BaseFloat> > >
      Posterior post(logprobs.NumRows());
      for (int32 i = 0; i < logprobs.NumRows(); i++) {
        SubVector<BaseFloat> row(logprobs, i);
        for (int32 j = 0; j < row.Dim(); j++) {
          BaseFloat p = Exp(row(j));
          if (p >= min_post) {
            post[i].push_back(std::make_pair(j, p));
          } else if (random_prune && (p / min_post) >= RandUniform()) {
            post[i].push_back(std::make_pair(j, min_post));
          }
        }
      }
      posterior_writer.Write(logprob_reader.Key(), post);
    }
    KALDI_LOG << "Converted " << num_done << " log-prob matrices to posteriors.";
    return (num_done != 0 ? 0 : 1);
  } catch(const std::exception &e) {
    std::cerr << e.what();
    return -1;
  }
}
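
The inner loop above keeps every posterior p >= min_post unchanged. With --random-prune=true, a smaller p is kept with probability p / min_post and written as min_post, so its expected output is (p / min_post) * min_post = p; otherwise it is dropped. The following is a minimal standalone sketch of that per-row logic, not the Kaldi implementation: BaseFloat and PostRow are local stand-ins for the Kaldi typedefs, LogprobRowToPost is a hypothetical helper name, and a std::mt19937 generator is swapped in for kaldi::RandUniform().

```cpp
#include <cmath>
#include <cstdint>
#include <random>
#include <utility>
#include <vector>

// Local stand-ins for the Kaldi typedefs (assumptions for this sketch).
typedef float BaseFloat;
typedef std::vector<std::pair<int32_t, BaseFloat> > PostRow;

// Hypothetical helper: converts one row of log-probabilities into sparse
// (index, posterior) pairs, mirroring the inner loop of main().
// `rng` replaces kaldi::RandUniform(). Note that uniform_real_distribution
// samples from [0, 1), whereas RandUniform() is documented as strictly
// between 0 and 1.
PostRow LogprobRowToPost(const std::vector<BaseFloat> &logprobs,
                         BaseFloat min_post, bool random_prune,
                         std::mt19937 &rng) {
  std::uniform_real_distribution<BaseFloat> uniform(0.0f, 1.0f);
  PostRow out;
  for (int32_t j = 0; j < static_cast<int32_t>(logprobs.size()); j++) {
    BaseFloat p = std::exp(logprobs[j]);
    if (p >= min_post) {
      // Large enough: keep the posterior as it is.
      out.push_back(std::make_pair(j, p));
    } else if (random_prune && (p / min_post) >= uniform(rng)) {
      // Kept with probability p / min_post and rounded up to min_post,
      // so the expected value of the output equals p.
      out.push_back(std::make_pair(j, min_post));
    }
    // Otherwise the entry is dropped.
  }
  return out;
}
```

With random_prune set to false, entries below min_post are simply discarded, which biases the total mass downward; the randomized variant trades that bias for variance.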
double Exp(double x)
Definition: kaldi-math.h:83
float RandUniform(struct RandomState *state=NULL)
Returns a random number strictly between 0 and 1.
Definition: kaldi-math.h:151
A templated class for writing objects to an archive or script file; see The Table concept...
Definition: kaldi-table.h:368
float BaseFloat
Definition: kaldi-types.h:29
std::vector< std::vector< std::pair< int32, BaseFloat > > > Posterior
Posterior is a typedef for storing acoustic-state (actually, transition-id) posteriors over an uttera...
Definition: posterior.h:43
The class ParseOptions is for parsing command-line options; see Parsing command-line options for more...
Definition: parse-options.h:36
A templated class for reading objects sequentially from an archive or script file; see The Table conc...
Definition: kaldi-table.h:287
MatrixIndexT NumRows() const
Returns number of rows (or zero for empty matrix).
Definition: kaldi-matrix.h:61
#define KALDI_LOG
Definition: kaldi-error.h:133
Represents a non-allocating general vector which can be defined as a sub-vector of higher-level vecto...
Definition: kaldi-vector.h:482