DiscriminativeExampleMerger Class Reference

This class is responsible for arranging examples in groups that have the same structure (i.e. the same input and output indexes).

#include <nnet-discriminative-example.h>


Public Member Functions

 DiscriminativeExampleMerger (const ExampleMergingConfig &config, NnetDiscriminativeExampleWriter *writer)
 
void AcceptExample (NnetDiscriminativeExample *a)
 
void Finish ()
 
int32 ExitStatus ()
 
 ~DiscriminativeExampleMerger ()
 

Private Types

typedef unordered_map< NnetDiscriminativeExample *, std::vector< NnetDiscriminativeExample * >, NnetDiscriminativeExampleStructureHasher, NnetDiscriminativeExampleStructureCompare > MapType
 

Private Member Functions

void WriteMinibatch (std::vector< NnetDiscriminativeExample > *egs)
 

Private Attributes

bool finished_
 
int32 num_egs_written_
 
const ExampleMergingConfig & config_
 
NnetDiscriminativeExampleWriter * writer_
 
ExampleMergingStats stats_
 
MapType eg_to_egs_
 

Detailed Description

This class is responsible for arranging examples in groups that have the same structure (i.e. the same input and output indexes), and outputting them in suitable minibatches as defined by ExampleMergingConfig.

Definition at line 228 of file nnet-discriminative-example.h.
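
The grouping idea can be illustrated with a small self-contained sketch. It is not Kaldi code: ToyExample, ToyStructureHasher, ToyStructureCompare, ToyMap and GroupByStructure are hypothetical stand-ins, and a plain vector of integers plays the role of the input and output indexes. The point it shows is that the hasher and the comparator look only at the structure, so examples with different data but the same structure land in the same map entry.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for NnetDiscriminativeExample: 'indexes' plays the
// role of the structure, 'payload' the role of the actual data.
struct ToyExample {
  std::vector<int> indexes;
  double payload;
};

// Hashes only the structure, never the payload.
struct ToyStructureHasher {
  std::size_t operator()(const ToyExample *eg) const {
    std::size_t h = 0;
    for (int idx : eg->indexes)
      h = h * 31 + std::hash<int>()(idx);
    return h;
  }
};

// Two examples compare equal when their structure matches.
struct ToyStructureCompare {
  bool operator()(const ToyExample *a, const ToyExample *b) const {
    return a->indexes == b->indexes;
  }
};

typedef std::unordered_map<const ToyExample*, std::vector<const ToyExample*>,
                           ToyStructureHasher, ToyStructureCompare> ToyMap;

// Groups examples by structure. The first example seen with a given
// structure becomes the key of its group and the first element of the vector.
ToyMap GroupByStructure(const std::vector<ToyExample> &egs) {
  ToyMap groups;
  for (const ToyExample &eg : egs)
    groups[&eg].push_back(&eg);
  return groups;
}
```

Keying the map on a pointer to the first example of each group avoids copying a separate structure object, at the cost of having to keep that example alive for as long as it serves as the key.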

Member Typedef Documentation

Constructor & Destructor Documentation

Member Function Documentation

void AcceptExample ( NnetDiscriminativeExample * a)

Definition at line 457 of file nnet-discriminative-example.cc.

References DiscriminativeExampleMerger::config_, DiscriminativeExampleMerger::eg_to_egs_, DiscriminativeExampleMerger::finished_, kaldi::nnet3::GetNnetDiscriminativeExampleSize(), KALDI_ASSERT, ExampleMergingConfig::MinibatchSize(), and DiscriminativeExampleMerger::WriteMinibatch().

Referenced by main().

void DiscriminativeExampleMerger::AcceptExample(NnetDiscriminativeExample *eg) {
  KALDI_ASSERT(!finished_);
  // If an eg with the same structure as 'eg' is already a key in the
  // map, it won't be replaced, but if it's new it will be made
  // the key. Also we remove the key before making the vector empty.
  // This way we ensure that the eg in the key is always the first
  // element of the vector.
  std::vector<NnetDiscriminativeExample*> &vec = eg_to_egs_[eg];
  vec.push_back(eg);
  int32 eg_size = GetNnetDiscriminativeExampleSize(*eg),
      num_available = vec.size();
  bool input_ended = false;
  int32 minibatch_size = config_.MinibatchSize(eg_size, num_available,
                                               input_ended);
  if (minibatch_size != 0) {  // we need to write out a merged eg.
    KALDI_ASSERT(minibatch_size == num_available);

    std::vector<NnetDiscriminativeExample*> vec_copy(vec);
    eg_to_egs_.erase(eg);

    // MergeDiscriminativeExamples() expects a vector of
    // NnetDiscriminativeExample, not of pointers, so use swap to create that
    // without doing any real work.
    std::vector<NnetDiscriminativeExample> egs_to_merge(minibatch_size);
    for (int32 i = 0; i < minibatch_size; i++) {
      egs_to_merge[i].Swap(vec_copy[i]);
      delete vec_copy[i];  // we owned those pointers.
    }
    WriteMinibatch(&egs_to_merge);
  }
}
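
The accept-and-flush flow above can be sketched in isolation. This is a simplified illustration rather than the Kaldi implementation: ToyEg and ToyMerger are hypothetical, an integer stands in for the example structure, a fixed minibatch size replaces ExampleMergingConfig::MinibatchSize(), and std::unique_ptr replaces the raw owning pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy example: 'structure' selects the group, 'value' is data.
struct ToyEg {
  int structure;
  int value;
};

class ToyMerger {
 public:
  // Assumption: one fixed minibatch size instead of a configurable rule.
  explicit ToyMerger(std::size_t minibatch_size)
      : minibatch_size_(minibatch_size) {}

  // Takes ownership of 'eg' and flushes its group once the group is full.
  void Accept(std::unique_ptr<ToyEg> eg) {
    const int key = eg->structure;
    std::vector<std::unique_ptr<ToyEg> > &vec = pending_[key];
    vec.push_back(std::move(eg));
    if (vec.size() < minibatch_size_) return;  // not enough examples yet

    // Copy the examples out by value, then drop the group from the map.
    std::vector<ToyEg> batch;
    batch.reserve(vec.size());
    for (const std::unique_ptr<ToyEg> &p : vec) batch.push_back(*p);
    pending_.erase(key);  // 'vec' must not be used after this line
    written_.push_back(batch);
  }

  const std::vector<std::vector<ToyEg> > &written() const { return written_; }
  std::size_t NumPendingGroups() const { return pending_.size(); }

 private:
  std::size_t minibatch_size_;
  std::map<int, std::vector<std::unique_ptr<ToyEg> > > pending_;
  std::vector<std::vector<ToyEg> > written_;
};
```

As in the listing, the group is removed from the map before the minibatch is handed on, so the next example with the same structure starts a fresh group.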
void Finish ( )

Definition at line 503 of file nnet-discriminative-example.cc.

References DiscriminativeExampleMerger::config_, ExampleMergingStats::DiscardedExamples(), DiscriminativeExampleMerger::eg_to_egs_, DiscriminativeExampleMerger::finished_, kaldi::nnet3::GetNnetDiscriminativeExampleSize(), KALDI_ASSERT, ExampleMergingConfig::MinibatchSize(), ExampleMergingStats::PrintStats(), DiscriminativeExampleMerger::stats_, and DiscriminativeExampleMerger::WriteMinibatch().

Referenced by DiscriminativeExampleMerger::ExitStatus(), main(), and DiscriminativeExampleMerger::~DiscriminativeExampleMerger().

void DiscriminativeExampleMerger::Finish() {
  if (finished_) return;  // already finished.
  finished_ = true;

  // we'll convert the map eg_to_egs_ to a vector of vectors to avoid
  // iterator invalidation problems.
  std::vector<std::vector<NnetDiscriminativeExample*> > all_egs;
  all_egs.reserve(eg_to_egs_.size());

  MapType::iterator iter = eg_to_egs_.begin(), end = eg_to_egs_.end();
  for (; iter != end; ++iter)
    all_egs.push_back(iter->second);
  eg_to_egs_.clear();

  for (size_t i = 0; i < all_egs.size(); i++) {
    int32 minibatch_size;
    std::vector<NnetDiscriminativeExample*> &vec = all_egs[i];
    KALDI_ASSERT(!vec.empty());
    int32 eg_size = GetNnetDiscriminativeExampleSize(*(vec[0]));
    bool input_ended = true;
    while (!vec.empty() &&
           (minibatch_size = config_.MinibatchSize(eg_size, vec.size(),
                                                   input_ended)) != 0) {
      // MergeDiscriminativeExamples() expects a vector of
      // NnetDiscriminativeExample, not of pointers, so use swap to create that
      // without doing any real work.
      std::vector<NnetDiscriminativeExample> egs_to_merge(minibatch_size);
      for (int32 i = 0; i < minibatch_size; i++) {
        egs_to_merge[i].Swap(vec[i]);
        delete vec[i];  // we owned those pointers.
      }
      vec.erase(vec.begin(), vec.begin() + minibatch_size);
      WriteMinibatch(&egs_to_merge);
    }
    if (!vec.empty()) {
      int32 eg_size = GetNnetDiscriminativeExampleSize(*(vec[0]));
      NnetDiscriminativeExampleStructureHasher eg_hasher;
      size_t structure_hash = eg_hasher(*(vec[0]));
      int32 num_discarded = vec.size();
      stats_.DiscardedExamples(eg_size, structure_hash, num_discarded);
      for (int32 i = 0; i < num_discarded; i++)
        delete vec[i];
      vec.clear();
    }
  }
  stats_.PrintStats();
}
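
The drain loop in Finish() can be sketched with plain counters. The size rule below is an assumption made only for illustration (pick the largest allowed size that still fits); it is not the rule implemented by ExampleMergingConfig::MinibatchSize(). ToyMinibatchSize, DrainResult and DrainGroup are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical size rule: returns the largest allowed size that fits in
// 'num_available', or 0 if none fits. This is NOT Kaldi's actual rule.
std::size_t ToyMinibatchSize(const std::vector<std::size_t> &allowed_sizes,
                             std::size_t num_available) {
  std::size_t best = 0;
  for (std::size_t s : allowed_sizes)
    if (s <= num_available && s > best) best = s;
  return best;
}

struct DrainResult {
  std::vector<std::size_t> minibatch_sizes;  // sizes written, in order
  std::size_t num_discarded;                 // leftovers that fit no size
};

// Repeatedly takes minibatches from one group until the rule returns 0,
// then reports whatever is left as discarded, mirroring the loop in Finish().
DrainResult DrainGroup(const std::vector<std::size_t> &allowed_sizes,
                       std::size_t num_available) {
  DrainResult result;
  std::size_t size;
  while (num_available > 0 &&
         (size = ToyMinibatchSize(allowed_sizes, num_available)) != 0) {
    result.minibatch_sizes.push_back(size);
    num_available -= size;
  }
  result.num_discarded = num_available;
  return result;
}
```

With allowed sizes 8 and 4 and 13 leftover examples, this sketch writes minibatches of 8 and 4 and discards the final example.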
void WriteMinibatch ( std::vector< NnetDiscriminativeExample > *  egs)
private

Definition at line 488 of file nnet-discriminative-example.cc.

References ExampleMergingConfig::compress, DiscriminativeExampleMerger::config_, kaldi::nnet3::GetNnetDiscriminativeExampleSize(), KALDI_ASSERT, kaldi::nnet3::MergeDiscriminativeExamples(), DiscriminativeExampleMerger::num_egs_written_, DiscriminativeExampleMerger::stats_, TableWriter< Holder >::Write(), DiscriminativeExampleMerger::writer_, and ExampleMergingStats::WroteExample().

Referenced by DiscriminativeExampleMerger::AcceptExample(), and DiscriminativeExampleMerger::Finish().

void DiscriminativeExampleMerger::WriteMinibatch(
    std::vector<NnetDiscriminativeExample> *egs) {
  KALDI_ASSERT(!egs->empty());
  int32 eg_size = GetNnetDiscriminativeExampleSize((*egs)[0]);
  NnetDiscriminativeExampleStructureHasher eg_hasher;
  size_t structure_hash = eg_hasher((*egs)[0]);
  int32 minibatch_size = egs->size();
  stats_.WroteExample(eg_size, structure_hash, minibatch_size);
  NnetDiscriminativeExample merged_eg;
  MergeDiscriminativeExamples(config_.compress, egs, &merged_eg);
  std::ostringstream key;
  key << "merged-" << (num_egs_written_++) << "-" << minibatch_size;
  writer_->Write(key.str(), merged_eg);
}
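
The key written for each merged example has the form "merged-<count>-<minibatch size>", where the count is a running index starting at zero. A minimal sketch of that naming scheme, using a hypothetical ToyKeyMaker helper that is not part of Kaldi:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper that reproduces the key format used in WriteMinibatch().
class ToyKeyMaker {
 public:
  ToyKeyMaker() : num_written_(0) {}

  // Returns the key for the next merged example and advances the counter.
  std::string NextKey(int minibatch_size) {
    std::ostringstream key;
    key << "merged-" << (num_written_++) << "-" << minibatch_size;
    return key.str();
  }

 private:
  int num_written_;
};
```

Because the counter is post-incremented, the first key produced carries index 0, and the trailing number records how many examples were merged into that entry.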

Member Data Documentation

bool finished_
private
int32 num_egs_written_
private

The documentation for this class was generated from the following files: