NnetExample is the input data and corresponding label (or labels) for one or more frames of input, used for standard cross-entropy training of neural nets (and possibly for other objective functions).

#include <nnet-example.h>

Public Member Functions
void	Write (std::ostream &os, bool binary) const

void	Read (std::istream &is, bool binary)

	NnetExample ()

	NnetExample (const NnetExample &input, int32 start_frame, int32 num_frames, int32 left_context, int32 right_context)
	This constructor can be used to extract one or more frames from an example that has multiple frames, and possibly truncate the context.

void	SetLabelSingle (int32 frame, int32 pdf_id, BaseFloat weight=1.0)
	Set the label of this frame of this example to the specified pdf_id with the specified weight.

int32	GetLabelSingle (int32 frame, BaseFloat *weight=NULL)
	Get the maximum weight label (pdf_id and weight) of this frame of this example.

Public Attributes
std::vector< std::vector< std::pair< int32, BaseFloat > > >	labels
	The label(s) for each frame in a sequence of frames; in the normal case, this will be just [ [ (pdf-id, 1.0) ] ], i.e. one frame with one label.

CompressedMatrix	input_frames
	The input data, with NumRows() >= labels.size() + left_context; it includes features to the left and right as needed for the temporal context of the network.

int32	left_context
	The number of frames of left context (we can work out the #frames of right context from input_frames.NumRows(), labels.size(), and this).

Vector< BaseFloat >	spk_info
	The speaker-specific input, if any, or an empty vector if we're not using these features.

Detailed Description

NnetExample is the input data and corresponding label (or labels) for one or more frames of input, used for standard cross-entropy training of neural nets (and possibly for other objective functions).

In the normal case there will be just one frame, and one label, with a weight of 1.0.

Definition at line 36 of file nnet-example.h.

Constructor & Destructor Documentation

◆ NnetExample() [1/2]

NnetExample ( )

inline

Definition at line 63 of file nnet-example.h.

References NnetExample::GetLabelSingle(), and NnetExample::SetLabelSingle().

{ }

◆ NnetExample() [2/2]

NnetExample	(	const NnetExample &	input,
		int32	start_frame,
		int32	num_frames,
		int32	left_context,
		int32	right_context
	)

This constructor can be used to extract one or more frames from an example that has multiple frames, and possibly truncate the context.

Most of its behavior is obvious from the variable names, but note the following. If left_context is -1, we use the left context of the input; the same goes for right_context. If start_frame < 0, we start the labels from frame 0 of the labeled frames of input; if num_frames == -1, we go to the end of the labeled input from start_frame. If start_frame + num_frames is greater than the number of frames of labels of input, we output as much as we can instead of crashing. The same applies to left_context and right_context: if we can't provide the requested context we won't crash but will provide as much as we can, although in this case we'll print a warning (once).

Definition at line 159 of file nnet-example.cc.

References NnetExample::input_frames, KALDI_ASSERT, KALDI_WARN, NnetExample::labels, NnetExample::left_context, kaldi::nnet2::nnet_example_warned_right, CompressedMatrix::NumCols(), CompressedMatrix::NumRows(), and CompressedMatrix::Swap().

                                                  : spk_info(input.spk_info) {
   int32 num_label_frames = input.labels.size();
   if (start_frame < 0) start_frame = 0;  // start_frame is offset in the labeled
                                          // frames.
   KALDI_ASSERT(start_frame < num_label_frames);
   if (start_frame + new_num_frames > num_label_frames || new_num_frames == -1)
     new_num_frames = num_label_frames - start_frame;
   // compute right-context of input.
   int32 input_right_context =
       input.input_frames.NumRows() - input.left_context - num_label_frames;
   if (new_left_context == -1) new_left_context = input.left_context;
   if (new_right_context == -1) new_right_context = input_right_context;
   if (new_left_context > input.left_context) {
     if (!nnet_example_warned_left) {
       nnet_example_warned_left = true;
       KALDI_WARN << "Requested left-context " << new_left_context
                  << " exceeds input left-context " << input.left_context
                  << ", will not warn again.";
     }
     new_left_context = input.left_context;
   }
   if (new_right_context > input_right_context) {
     if (!nnet_example_warned_right) {
       nnet_example_warned_right = true;
       KALDI_WARN << "Requested right-context " << new_right_context
                  << " exceeds input right-context " << input_right_context
                  << ", will not warn again.";
     }
     new_right_context = input_right_context;
   }
 
   int32 new_tot_frames = new_left_context + new_num_frames + new_right_context,
       left_frames_lost = (input.left_context - new_left_context) + start_frame;
   
   CompressedMatrix new_input_frames(input.input_frames,
                                     left_frames_lost,
                                     new_tot_frames,
                                     0, input.input_frames.NumCols());
   new_input_frames.Swap(&input_frames);  // swap with class-member.
   left_context = new_left_context;  // set class-member.
   labels.clear();
   labels.insert(labels.end(),
                 input.labels.begin() + start_frame,
                 input.labels.begin() + start_frame + new_num_frames);
 }
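
The index arithmetic in the listing above can be tried out without Kaldi. The following is a minimal sketch, assuming plain integers in place of the real members: `ExampleDims`, `SubRange` and `ComputeSubRange` are hypothetical names introduced only for illustration, and the one-time warnings are omitted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the sizes of an NnetExample; not a Kaldi type.
struct ExampleDims {
  int32_t num_rows;      // plays the role of input_frames.NumRows()
  int32_t left_context;  // plays the role of the left_context member
  int32_t num_labels;    // plays the role of labels.size()
};

// Hypothetical result type: which rows and labels the new example keeps.
struct SubRange {
  int32_t row_offset;    // first row of the input that is kept
  int32_t num_rows;      // rows kept: left context + labeled + right context
  int32_t left_context;  // left context of the new example
  int32_t num_labels;    // number of labeled frames in the new example
};

// Mirrors the defaulting and clamping steps of the constructor listing.
inline SubRange ComputeSubRange(const ExampleDims &in, int32_t start_frame,
                                int32_t num_frames, int32_t left_context,
                                int32_t right_context) {
  if (start_frame < 0) start_frame = 0;
  assert(start_frame < in.num_labels);
  // -1 means "to the end"; an over-long request is truncated.
  if (num_frames == -1 || start_frame + num_frames > in.num_labels)
    num_frames = in.num_labels - start_frame;
  // Right context of the input, derived from the other three sizes.
  int32_t in_right = in.num_rows - in.left_context - in.num_labels;
  // -1 means "same as the input"; larger requests are clamped.
  if (left_context == -1 || left_context > in.left_context)
    left_context = in.left_context;
  if (right_context == -1 || right_context > in_right)
    right_context = in_right;
  SubRange out;
  out.row_offset = (in.left_context - left_context) + start_frame;
  out.num_rows = left_context + num_frames + right_context;
  out.left_context = left_context;
  out.num_labels = num_frames;
  return out;
}
```

For an input with 20 rows, left context 4 and 8 labeled frames, requesting start_frame 2, num_frames 3, left_context 2 and right_context 1 keeps 6 rows starting at row 4 under this sketch.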

Member Function Documentation

◆ GetLabelSingle()

int32 GetLabelSingle	(	int32	frame,
		BaseFloat *	weight = NULL
	)

Get the maximum weight label (pdf_id and weight) of this frame of this example.

Definition at line 140 of file nnet-example.cc.

References rnnlm::i, KALDI_ASSERT, and NnetExample::labels.

Referenced by NnetExample::NnetExample().

                                                                 {
   BaseFloat max = -1.0;
   int32 pdf_id = -1;
   KALDI_ASSERT(static_cast<size_t>(frame) < labels.size());
   for (int32 i = 0; i < labels[frame].size(); i++) {
     if (labels[frame][i].second > max) {
       pdf_id = labels[frame][i].first;
       max = labels[frame][i].second;
     }
   }
   if (weight != NULL) *weight = max;
   return pdf_id;
 }
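
To make the return values concrete, here is a standalone sketch of the same selection over a plain label container. It assumes `float` in place of BaseFloat and `int32_t` in place of Kaldi's int32; `Labels` and `MaxWeightLabel` are hypothetical names, not part of Kaldi.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Same shape as NnetExample::labels, with float assumed for BaseFloat.
typedef std::vector<std::vector<std::pair<int32_t, float> > > Labels;

// Returns the pdf-id with the largest weight on the given frame, or -1 if
// the frame has no labels; optionally reports that weight via *weight.
inline int32_t MaxWeightLabel(const Labels &labels, int32_t frame,
                              float *weight = nullptr) {
  assert(static_cast<std::size_t>(frame) < labels.size());
  float max = -1.0f;
  int32_t pdf_id = -1;
  for (std::size_t i = 0; i < labels[frame].size(); i++) {
    if (labels[frame][i].second > max) {
      pdf_id = labels[frame][i].first;
      max = labels[frame][i].second;
    }
  }
  if (weight != nullptr) *weight = max;
  return pdf_id;
}
```

Because the running maximum starts at -1.0, a frame with an empty label list yields a pdf-id of -1 and a weight of -1.0, so callers may want to check for that case.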

◆ Read()

void Read	(	std::istream &	is,
		bool	binary
	)

Definition at line 80 of file nnet-example.cc.

References kaldi::ExpectToken(), rnnlm::i, NnetExample::input_frames, KALDI_ASSERT, KALDI_ERR, NnetExample::labels, NnetExample::left_context, CompressedMatrix::Read(), kaldi::ReadBasicType(), kaldi::ReadIntegerVector(), kaldi::ReadToken(), and NnetExample::spk_info.

                                                   {
   // Note: weight, label, input_frames, left_context and spk_info are members.
   // This is a struct.
   ExpectToken(is, binary, "<NnetExample>");
 
   std::string token;
   ReadToken(is, binary, &token);
   if (!strcmp(token.c_str(), "<Lab1>")) {  // simple label format
     std::vector<int32> simple_labels;
     ReadIntegerVector(is, binary, &simple_labels);
     labels.resize(simple_labels.size());
     for (size_t i = 0; i < simple_labels.size(); i++) {
       labels[i].resize(1);
       labels[i][0].first = simple_labels[i];
       labels[i][0].second = 1.0;
     }
   } else if (!strcmp(token.c_str(), "<Lab2>")) {  // generic label format
     int32 num_frames;
     ReadBasicType(is, binary, &num_frames);
     KALDI_ASSERT(num_frames > 0);
     labels.resize(num_frames);
     for (int32 t = 0; t < num_frames; t++) {
       int32 size;
       ReadBasicType(is, binary, &size);
       KALDI_ASSERT(size >= 0);
       labels[t].resize(size);
       for (int32 i = 0; i < size; i++) {
         ReadBasicType(is, binary, &(labels[t][i].first));
         ReadBasicType(is, binary, &(labels[t][i].second));
       }
     }
   } else if (!strcmp(token.c_str(), "<Labels>")) {  // back-compatibility
     labels.resize(1);  // old format had 1 frame of labels.
     int32 size;
     ReadBasicType(is, binary, &size);
     labels[0].resize(size);
     for (int32 i = 0; i < size; i++) {
       ReadBasicType(is, binary, &(labels[0][i].first));
       ReadBasicType(is, binary, &(labels[0][i].second));
     }
   } else {
     KALDI_ERR << "Expected token <Lab1>, <Lab2> or <Labels>, got " << token;
   }
   ExpectToken(is, binary, "<InputFrames>");
   input_frames.Read(is, binary);
   ExpectToken(is, binary, "<LeftContext>"); // Note: this member is
   // recently added, but I don't think we'll get too much back-compatibility
   // problems from not handling the old format.
   ReadBasicType(is, binary, &left_context);
   ExpectToken(is, binary, "<SpkInfo>");
   spk_info.Read(is, binary);
   ExpectToken(is, binary, "</NnetExample>");
 }

◆ SetLabelSingle()

void SetLabelSingle	(	int32	frame,
		int32	pdf_id,
		BaseFloat	weight = 1.0
	)

Set the label of this frame of this example to the specified pdf_id with the specified weight.

Definition at line 134 of file nnet-example.cc.

References KALDI_ASSERT, and NnetExample::labels.

Referenced by NnetExample::NnetExample().

                                                                             {
   KALDI_ASSERT(static_cast<size_t>(frame) < labels.size());
   labels[frame].clear();
   labels[frame].push_back(std::make_pair(pdf_id, weight));
 }

◆ Write()

void Write	(	std::ostream &	os,
		bool	binary
	)		const

Definition at line 46 of file nnet-example.cc.

References kaldi::nnet2::HasSimpleLabels(), rnnlm::i, NnetExample::input_frames, NnetExample::labels, NnetExample::left_context, NnetExample::spk_info, CompressedMatrix::Write(), kaldi::WriteBasicType(), kaldi::WriteIntegerVector(), and kaldi::WriteToken().

                                                          {
   // Note: weight, label, input_frames and spk_info are members.  This is a
   // struct.
   WriteToken(os, binary, "<NnetExample>");
 
   // At this point, we write <Lab1> if we have "simple" labels, or
   // <Lab2> in general.  Previous code (when we had only one frame of
   // labels) just wrote <Labels>.
   std::vector<int32> simple_labels;
   if (HasSimpleLabels(*this, &simple_labels)) {
     WriteToken(os, binary, "<Lab1>");
     WriteIntegerVector(os, binary, simple_labels);
   } else {
     WriteToken(os, binary, "<Lab2>");
     int32 num_frames = labels.size();
     WriteBasicType(os, binary, num_frames);
     for (int32 t = 0; t < num_frames; t++) {
       int32 size = labels[t].size();
       WriteBasicType(os, binary, size);
       for (int32 i = 0; i < size; i++) {
         WriteBasicType(os, binary, labels[t][i].first);
         WriteBasicType(os, binary, labels[t][i].second);
       }
     }
   }
   WriteToken(os, binary, "<InputFrames>");
   input_frames.Write(os, binary);
   WriteToken(os, binary, "<LeftContext>");
   WriteBasicType(os, binary, left_context);
   WriteToken(os, binary, "<SpkInfo>");
   spk_info.Write(os, binary);
   WriteToken(os, binary, "</NnetExample>");
 }
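
The body of HasSimpleLabels() is not shown on this page. Judging from how Read() expands the <Lab1> format (one label per frame, each with weight 1.0), a plausible sketch of the check is given below. `LabelsAreSimple` and `Labels` are hypothetical names, `float` is assumed for BaseFloat, and the real function may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Same shape as NnetExample::labels, with float assumed for BaseFloat.
typedef std::vector<std::vector<std::pair<int32_t, float> > > Labels;

// Returns true, and fills *simple_labels with one pdf-id per frame, if every
// frame carries exactly one label whose weight is exactly 1.0. Otherwise
// returns false and leaves *simple_labels untouched.
inline bool LabelsAreSimple(const Labels &labels,
                            std::vector<int32_t> *simple_labels) {
  assert(simple_labels != nullptr);
  for (std::size_t t = 0; t < labels.size(); t++) {
    if (labels[t].size() != 1 || labels[t][0].second != 1.0f)
      return false;
  }
  simple_labels->resize(labels.size());
  for (std::size_t t = 0; t < labels.size(); t++)
    (*simple_labels)[t] = labels[t][0].first;
  return true;
}
```

Under this reading, <Lab1> stores only an integer vector, while <Lab2> spells out the frame count and every (pdf-id, weight) pair, which is why Write() prefers the former whenever it applies.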

Member Data Documentation

◆ input_frames

CompressedMatrix input_frames

The input data, with NumRows() >= labels.size() + left_context; it includes features to the left and right as needed for the temporal context of the network.

The features corresponding to labels[0] would be in the row with index equal to left_context.
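
The row layout described here reduces to two small formulas, sketched below with plain integers. `RowForLabel` and `RightContext` are hypothetical helper names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Row of input_frames holding the features for labels[t].
inline int32_t RowForLabel(int32_t left_context, int32_t t) {
  assert(left_context >= 0 && t >= 0);
  return left_context + t;
}

// Frames of right context implied by the other three sizes.
inline int32_t RightContext(int32_t num_rows, int32_t left_context,
                            int32_t num_labels) {
  // Documented invariant: NumRows() >= labels.size() + left_context.
  assert(num_rows >= left_context + num_labels);
  return num_rows - left_context - num_labels;
}
```

For instance, with 20 rows, a left context of 4 and 8 labeled frames, labels[0] maps to row 4 and the remaining 8 rows after the labeled frames are right context.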

Definition at line 49 of file nnet-example.h.

Referenced by DiscriminativeNnetExample::Check(), main(), NnetExample::NnetExample(), kaldi::nnet2::ProcessFile(), NnetExample::Read(), DiscriminativeNnetExample::Read(), NnetExample::Write(), and DiscriminativeNnetExample::Write().

◆ labels

std::vector<std::vector<std::pair<int32, BaseFloat> > > labels

The label(s) for each frame in a sequence of frames; in the normal case, this will be just [ [ (pdf-id, 1.0) ] ], i.e. one frame with one label. Top-level index is the frame index; then for each frame, a list of pdf-ids each with its weight. In some contexts, we will require that labels.size() == 1.

Definition at line 43 of file nnet-example.h.

Referenced by NnetExample::GetLabelSingle(), kaldi::nnet2::HasSimpleLabels(), main(), NnetExample::NnetExample(), kaldi::nnet2::ProcessFile(), NnetExample::Read(), NnetExample::SetLabelSingle(), and NnetExample::Write().