DiscriminativeNnetExample Struct Reference

This struct is used to store the information we need for discriminative training (MMI or MPE).

#include <nnet-example.h>

Public Member Functions

void Check () const
 
void Write (std::ostream &os, bool binary) const
 
void Read (std::istream &is, bool binary)
 

Public Attributes

BaseFloat weight
 The weight we assign to this example; this will typically be one, but we include it for the sake of generality.
 
std::vector< int32 > num_ali
 The numerator alignment.
 
CompactLattice den_lat
 The denominator lattice.
 
Matrix< BaseFloat > input_frames
 The input data, typically with a number of frames [NumRows()] larger than num_ali.size(), because it includes features to the left and right as needed for the temporal context of the network.
 
int32 left_context
 The number of frames of left context in the features (we can work out the #frames of right context from input_frames.NumRows(), num_ali.size(), and this).
 
Vector< BaseFloat > spk_info
 spk_info contains any component of the features that varies slowly or not at all with time (and hence, we would lose little by averaging it over time and storing the average).
 

Detailed Description

This struct is used to store the information we need for discriminative training (MMI or MPE).

Each example corresponds to one chunk of a file (for better randomization and to prevent instability, we may split files in the middle). The example contains the numerator alignment, the denominator lattice, and the input features (extended at the edges according to the left-context and right-context the network needs). It may also contain a speaker-vector (note: this is not part of any standard recipe right now but is included in case it's useful in the future).

Definition at line 136 of file nnet-example.h.

Member Function Documentation

◆ Check()

void Check ( ) const

Definition at line 295 of file nnet-example.cc.

References kaldi::CompactLatticeStateTimes(), NnetExample::input_frames, KALDI_ASSERT, NnetExample::left_context, and CompressedMatrix::NumRows().

Referenced by DiscriminativeExampleSplitter::DoExcise(), DiscriminativeExampleSplitter::Excise(), kaldi::nnet2::LatticeToDiscriminativeExample(), DiscriminativeExampleSplitter::OutputOneSplit(), and DiscriminativeExampleSplitter::Split().

{
  KALDI_ASSERT(weight > 0.0);
  KALDI_ASSERT(!num_ali.empty());
  int32 num_frames = static_cast<int32>(num_ali.size());

  std::vector<int32> times;
  int32 num_frames_den = CompactLatticeStateTimes(den_lat, &times);
  KALDI_ASSERT(num_frames == num_frames_den);
  KALDI_ASSERT(input_frames.NumRows() >= left_context + num_frames);
}

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)

Definition at line 269 of file nnet-example.cc.

References kaldi::ExpectToken(), NnetExample::input_frames, KALDI_ERR, NnetExample::left_context, CompressedMatrix::Read(), kaldi::ReadBasicType(), kaldi::ReadCompactLattice(), kaldi::ReadIntegerVector(), and NnetExample::spk_info.

{
  // Note: weight, num_ali, den_lat, input_frames, left_context and spk_info are
  // members. This is a struct.
  ExpectToken(is, binary, "<DiscriminativeNnetExample>");
  ExpectToken(is, binary, "<Weight>");
  ReadBasicType(is, binary, &weight);
  ExpectToken(is, binary, "<NumAli>");
  ReadIntegerVector(is, binary, &num_ali);
  CompactLattice *den_lat_tmp = NULL;
  if (!ReadCompactLattice(is, binary, &den_lat_tmp) || den_lat_tmp == NULL) {
    // We can't return error status from this function so we
    // throw an exception.
    KALDI_ERR << "Error reading CompactLattice from stream";
  }
  den_lat = *den_lat_tmp;
  delete den_lat_tmp;
  ExpectToken(is, binary, "<InputFrames>");
  input_frames.Read(is, binary);
  ExpectToken(is, binary, "<LeftContext>");
  ReadBasicType(is, binary, &left_context);
  ExpectToken(is, binary, "<SpkInfo>");
  spk_info.Read(is, binary);
  ExpectToken(is, binary, "</DiscriminativeNnetExample>");
}

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const

Definition at line 242 of file nnet-example.cc.

References NnetExample::input_frames, KALDI_ERR, NnetExample::left_context, NnetExample::spk_info, CompressedMatrix::Write(), kaldi::WriteBasicType(), kaldi::WriteCompactLattice(), kaldi::WriteIntegerVector(), and kaldi::WriteToken().

{
  // Note: weight, num_ali, den_lat, input_frames, left_context and spk_info are
  // members. This is a struct.
  WriteToken(os, binary, "<DiscriminativeNnetExample>");
  WriteToken(os, binary, "<Weight>");
  WriteBasicType(os, binary, weight);
  WriteToken(os, binary, "<NumAli>");
  WriteIntegerVector(os, binary, num_ali);
  if (!WriteCompactLattice(os, binary, den_lat)) {
    // We can't return error status from this function so we
    // throw an exception.
    KALDI_ERR << "Error writing CompactLattice to stream";
  }
  WriteToken(os, binary, "<InputFrames>");
  {
    CompressedMatrix cm(input_frames);  // Note: this can be read as a regular
                                        // matrix.
    cm.Write(os, binary);
  }
  WriteToken(os, binary, "<LeftContext>");
  WriteBasicType(os, binary, left_context);
  WriteToken(os, binary, "<SpkInfo>");
  spk_info.Write(os, binary);
  WriteToken(os, binary, "</DiscriminativeNnetExample>");
}
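
Read() and Write() rely on a fixed sequence of tokens: each field is preceded by a tag, and the reader throws if the next tag is not the one it expects. The sketch below illustrates that pattern with standard streams only. It is a simplified, text-mode stand-in and not Kaldi's I/O code: MiniExample, ExpectTokenText, WriteMini and ReadMini are hypothetical names, only three of the six fields are modeled, and the way the vector length is stored is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for the struct (three of its six fields).
struct MiniExample {
  float weight = 1.0f;
  std::vector<int32_t> num_ali;
  int32_t left_context = 0;
};

// Hypothetical text-mode analogue of ExpectToken: reads one
// whitespace-delimited token and throws if it is not the expected one.
inline void ExpectTokenText(std::istream &is, const std::string &token) {
  std::string got;
  if (!(is >> got) || got != token)
    throw std::runtime_error("Expected token " + token + ", got " + got);
}

// Writes each field preceded by its tag, inside an opening and closing tag.
inline void WriteMini(std::ostream &os, const MiniExample &eg) {
  os << "<MiniExample> <Weight> " << eg.weight;
  os << " <NumAli> " << eg.num_ali.size();  // length prefix: an assumption
  for (int32_t a : eg.num_ali) os << ' ' << a;
  os << " <LeftContext> " << eg.left_context << " </MiniExample>\n";
}

// Reads the fields back in exactly the order WriteMini wrote them.
inline MiniExample ReadMini(std::istream &is) {
  MiniExample eg;
  ExpectTokenText(is, "<MiniExample>");
  ExpectTokenText(is, "<Weight>");
  is >> eg.weight;
  ExpectTokenText(is, "<NumAli>");
  std::size_t n = 0;
  is >> n;
  eg.num_ali.resize(n);
  for (int32_t &a : eg.num_ali) is >> a;
  ExpectTokenText(is, "<LeftContext>");
  is >> eg.left_context;
  ExpectTokenText(is, "</MiniExample>");
  return eg;
}
```

Because the reader checks every tag, a stream whose fields are missing or out of order fails at the first mismatch instead of silently filling the wrong member.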

Member Data Documentation

◆ den_lat

CompactLattice den_lat

The denominator lattice.

Definition at line 148 of file nnet-example.h.

◆ input_frames

Matrix<BaseFloat> input_frames

The input data, typically with a number of frames [NumRows()] larger than num_ali.size(), because it includes features to the left and right as needed for the temporal context of the network (see also the left_context variable).

Caution: when we write this to disk, we do so as a CompressedMatrix. Because we do various manipulations on these things in memory, such as splitting, we don't want it to be a CompressedMatrix in memory, as this would be wasteful in time and would also lead to further loss of accuracy.

Definition at line 159 of file nnet-example.h.

Referenced by kaldi::nnet2::AppendDiscriminativeExamples(), kaldi::nnet2::AverageConstPart(), DiscriminativeExampleSplitter::DoExcise(), NnetDiscriminativeUpdater::GetInputFeatures(), kaldi::nnet2::LatticeToDiscriminativeExample(), main(), NnetDiscriminativeUpdater::NnetDiscriminativeUpdater(), DiscriminativeExampleSplitter::OutputOneSplit(), DiscriminativeExampleSplitter::RightContext(), and kaldi::nnet2::UpdateHash().
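
To make the accuracy caveat concrete, the sketch below round-trips a few values through a plain 8-bit linear quantizer and measures the error. This is an illustration only: it is not Kaldi's CompressedMatrix format, and Quantized, Compress, Decompress and MaxAbsError are hypothetical names. It shows only that a single compress/decompress cycle is lossy, which is why the in-memory copy is kept as an uncompressed Matrix.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative 8-bit linear quantizer; NOT Kaldi's CompressedMatrix format.
struct Quantized {
  float min_value = 0.0f;
  float step = 1.0f;            // width of one quantization level
  std::vector<uint8_t> codes;   // one byte per value
};

// Maps each value to the nearest of 256 evenly spaced levels.
inline Quantized Compress(const std::vector<float> &v) {
  assert(!v.empty());
  auto mm = std::minmax_element(v.begin(), v.end());
  Quantized q;
  q.min_value = *mm.first;
  float range = *mm.second - *mm.first;
  q.step = range > 0.0f ? range / 255.0f : 1.0f;
  q.codes.reserve(v.size());
  for (float x : v) {
    long code = std::lround((x - q.min_value) / q.step);
    code = std::max(0L, std::min(255L, code));  // guard against rounding
    q.codes.push_back(static_cast<uint8_t>(code));
  }
  return q;
}

// Reconstructs approximate values from the stored codes.
inline std::vector<float> Decompress(const Quantized &q) {
  std::vector<float> v;
  v.reserve(q.codes.size());
  for (uint8_t c : q.codes) v.push_back(q.min_value + q.step * c);
  return v;
}

// Largest absolute difference between two equally sized vectors.
inline float MaxAbsError(const std::vector<float> &a,
                         const std::vector<float> &b) {
  assert(a.size() == b.size());
  float err = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i)
    err = std::max(err, std::fabs(a[i] - b[i]));
  return err;
}
```

In this scheme the error of one round trip is bounded by half a quantization step; values that do not fall exactly on a level come back slightly changed.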

◆ left_context

int32 left_context

The number of frames of left context in the features (we can work out the #frames of right context from input_frames.NumRows(), num_ali.size(), and this).

Definition at line 164 of file nnet-example.h.

Referenced by kaldi::nnet2::AppendDiscriminativeExamples(), DiscriminativeExampleSplitter::DoExcise(), NnetDiscriminativeUpdater::GetInputFeatures(), kaldi::nnet2::LatticeToDiscriminativeExample(), DiscriminativeExampleSplitter::OutputOneSplit(), DiscriminativeExampleSplitter::RightContext(), and kaldi::nnet2::UpdateHash().
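
The right context is not stored, but it follows from the three quantities named above. A minimal sketch of that arithmetic, using a hypothetical free function RightContextFrames (it is not part of Kaldi and does not claim to reproduce DiscriminativeExampleSplitter::RightContext()):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: derives the right context from the stored quantities.
//   num_rows     corresponds to input_frames.NumRows()
//   num_ali_size corresponds to num_ali.size()
//   left_context corresponds to the left_context member
inline int32_t RightContextFrames(int32_t num_rows, int32_t num_ali_size,
                                  int32_t left_context) {
  int32_t right_context = num_rows - num_ali_size - left_context;
  // Equivalent to the last assertion in Check():
  // input_frames.NumRows() >= left_context + num_frames.
  assert(right_context >= 0);
  return right_context;
}
```

For instance, 120 feature rows, a 100-frame alignment and a left context of 12 leave 120 - 100 - 12 = 8 frames of right context.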

◆ num_ali

std::vector<int32> num_ali

The numerator alignment.

Definition at line 143 of file nnet-example.h.

◆ spk_info

Vector<BaseFloat> spk_info

spk_info contains any component of the features that varies slowly or not at all with time (and hence, we would lose little by averaging it over time and storing the average).

We'll append this to each of the input features, if used.

Definition at line 171 of file nnet-example.h.

Referenced by kaldi::nnet2::AppendDiscriminativeExamples(), kaldi::nnet2::AverageConstPart(), DiscriminativeExampleSplitter::DoExcise(), DiscriminativeExampleSplitter::OutputOneSplit(), and NnetDiscriminativeUpdater::Propagate().
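
Appending spk_info to each input feature amounts to widening every row by the same vector. The sketch below shows the idea with plain std::vector rows; Frame and AppendSpeakerInfo are hypothetical names, and this is not how Kaldi's Matrix and Vector classes or the nnet2 code perform the operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Frame = std::vector<float>;  // one row of features (hypothetical alias)

// Hypothetical helper: returns a copy of `frames` with `spk_info` appended
// to every row. With an empty spk_info the rows come back unchanged.
inline std::vector<Frame> AppendSpeakerInfo(const std::vector<Frame> &frames,
                                            const Frame &spk_info) {
  std::vector<Frame> out;
  out.reserve(frames.size());
  for (const Frame &f : frames) {
    // All rows are expected to share one dimension, as in a matrix.
    assert(f.size() == frames.front().size());
    Frame row(f);
    row.insert(row.end(), spk_info.begin(), spk_info.end());
    out.push_back(row);
  }
  return out;
}
```

Storing the vector once and appending it on demand avoids repeating the same values in every row of input_frames.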

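◆ weight

BaseFloat weight

The weight we assign to this example; this will typically be one, but we include it for the sake of generality.

Definition at line 140 of file nnet-example.h.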


The documentation for this struct was generated from the following files:

nnet-example.h
nnet-example.cc