CachingOptimizingCompiler Class Reference

This class enables you to do the compilation and optimization in one call, and also ensures that if the ComputationRequest is identical to the previous one, the compilation process is not repeated. More...

#include <nnet-optimize.h>


Public Member Functions

 CachingOptimizingCompiler (const Nnet &nnet, const CachingOptimizingCompilerOptions config=CachingOptimizingCompilerOptions())
 
 CachingOptimizingCompiler (const Nnet &nnet, const NnetOptimizeOptions &opt_config, const CachingOptimizingCompilerOptions config=CachingOptimizingCompilerOptions())
 Note: nnet is retained as a const reference but opt_config is copied. More...
 
 ~CachingOptimizingCompiler ()
 
const NnetComputation * Compile (const ComputationRequest &request)
 Does the compilation and returns a const pointer to the result, which is owned by this class, not the caller. More...
 
void ReadCache (std::istream &is, bool binary)
 
void WriteCache (std::ostream &os, bool binary) const
 

Private Types

typedef std::list< const ComputationRequest * > AqType
 
typedef unordered_map< const ComputationRequest *, std::pair< const NnetComputation *, AqType::iterator >, ComputationRequestHasher, ComputationRequestPtrEqual > CacheType
 

Private Member Functions

const NnetComputation * CompileInternal (const ComputationRequest &request)
 
const NnetComputation * CompileAndCache (const ComputationRequest &request)
 
const NnetComputation * CompileViaShortcut (const ComputationRequest &request)
 
const NnetComputation * CompileNoShortcut (const ComputationRequest &request)
 
void UpdateCache (const ComputationRequest *request, const NnetComputation *computation)
 
void UpdateAccessQueue (CacheType::iterator &cit)
 

Private Attributes

const Nnet & nnet_
 
CachingOptimizingCompilerOptions config_
 
NnetOptimizeOptions opt_config_
 
AqType access_queue_
 
CacheType computation_cache_
 
double seconds_taken_total_
 
double seconds_taken_compile_
 
double seconds_taken_optimize_
 
double seconds_taken_expand_
 
double seconds_taken_check_
 
double seconds_taken_indexes_
 

Detailed Description

This class enables you to do the compilation and optimization in one call, and also ensures that if the ComputationRequest is identical to the previous one, the compilation process is not repeated.

Definition at line 210 of file nnet-optimize.h.
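
For orientation, the snippet below sketches the basic call pattern. It is illustrative only (not taken from the Kaldi sources): it assumes an already-initialized Nnet and a fully populated ComputationRequest, and simply shows construction, compilation, and the ownership rule for the returned pointer.

// Illustrative sketch, not code from nnet-optimize.cc.
#include <nnet-optimize.h>

using namespace kaldi::nnet3;

void CompileExample(const Nnet &nnet, const ComputationRequest &request) {
  // The Nnet is held by const reference, so it must outlive the compiler.
  NnetOptimizeOptions optimize_config;
  CachingOptimizingCompilerOptions compiler_config;
  CachingOptimizingCompiler compiler(nnet, optimize_config, compiler_config);

  // Identical requests hit the cache, so repeated calls do not recompile.
  const NnetComputation *computation = compiler.Compile(request);

  // 'computation' is owned by 'compiler': do not delete it, and do not use
  // it after the compiler is destroyed.  ComputeCudaIndexes() has already
  // been called on it, so it is ready to be executed.
  (void) computation;
}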

Member Typedef Documentation

typedef std::list<const ComputationRequest*> AqType
private

Definition at line 273 of file nnet-optimize.h.

typedef unordered_map<const ComputationRequest*, std::pair<const NnetComputation*, AqType::iterator>, ComputationRequestHasher, ComputationRequestPtrEqual> CacheType
private

Definition at line 282 of file nnet-optimize.h.
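
The map's value pairs the cached NnetComputation with an iterator into the access queue (AqType). The generic sketch below (plain C++, not Kaldi code) shows why storing that iterator matters: a cache hit can be moved to the back of the queue in O(1) with std::list::splice, which is exactly what UpdateAccessQueue() does for the real cache.

// Generic list + hash-map LRU layout; illustrative only, not Kaldi code.
#include <list>
#include <string>
#include <unordered_map>

struct LruIndex {
  std::list<std::string> access_queue;  // least recently used at the front
  std::unordered_map<std::string, std::list<std::string>::iterator> cache;

  void Touch(const std::string &key) {
    auto it = cache.find(key);
    if (it == cache.end()) {
      cache[key] = access_queue.insert(access_queue.end(), key);
    } else {
      // The stored iterator lets a hit move its queue node to the back in
      // O(1), without invalidating any other iterators.
      access_queue.splice(access_queue.end(), access_queue, it->second);
    }
  }
};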

Constructor & Destructor Documentation

CachingOptimizingCompiler (const Nnet &nnet, const NnetOptimizeOptions &opt_config, const CachingOptimizingCompilerOptions config=CachingOptimizingCompilerOptions())

Note: nnet is retained as a const reference but opt_config is copied.

Definition at line 570 of file nnet-optimize.cc.

~CachingOptimizingCompiler ()

Definition at line 650 of file nnet-optimize.cc.

References CachingOptimizingCompiler::computation_cache_, KALDI_LOG, CachingOptimizingCompiler::seconds_taken_check_, CachingOptimizingCompiler::seconds_taken_compile_, CachingOptimizingCompiler::seconds_taken_expand_, CachingOptimizingCompiler::seconds_taken_indexes_, CachingOptimizingCompiler::seconds_taken_optimize_, and CachingOptimizingCompiler::seconds_taken_total_.

650  {
651  CacheType::const_iterator itr = computation_cache_.begin(),
652  end = computation_cache_.end();
653  for (; itr !=end; ++itr) {
654  delete itr->first;
655  delete itr->second.first;
656  }
657  if (seconds_taken_total_ > 0.0) {
658  std::ostringstream os;
659  double seconds_taken_misc = seconds_taken_total_ - seconds_taken_compile_
660  - seconds_taken_optimize_ - seconds_taken_expand_
661  - seconds_taken_check_ - seconds_taken_indexes_;
662  os << std::setprecision(3) << seconds_taken_total_
663  << " seconds taken in nnet3 compilation total (breakdown: "
664  << seconds_taken_compile_ << " compilation, "
665  << seconds_taken_optimize_ << " optimization, "
666  << seconds_taken_expand_ << " shortcut expansion, "
667  << seconds_taken_check_ << " checking, "
668  << seconds_taken_indexes_ << " computing indexes, "
669  << seconds_taken_misc << " misc.)";
670  KALDI_LOG << os.str();
671  // note: the leftover amount is misc things like hashing and == comparisons on
672  // computation-requests, and calling RequestIsDecomposable().
673  }
674 }

Member Function Documentation

const NnetComputation * Compile ( const ComputationRequest & request )

Does the compilation and returns a const pointer to the result, which is owned by this class, not the caller.

It calls ComputeCudaIndexes() for you, because you wouldn't be able to do this on a const object.

Definition at line 676 of file nnet-optimize.cc.

References CachingOptimizingCompiler::CompileInternal(), Timer::Elapsed(), and CachingOptimizingCompiler::seconds_taken_total_.

Referenced by NnetLdaStatsAccumulator::AccStats(), NnetComputerFromEg::Compute(), NnetDiscriminativeComputeObjf::Compute(), NnetChainComputeProb::Compute(), NnetComputeProb::Compute(), DecodableNnetSimple::DoNnetComputation(), NnetChainTrainer::Train(), NnetDiscriminativeTrainer::Train(), NnetTrainer::Train(), kaldi::nnet3::UnitTestNnetModelDerivatives(), and kaldi::nnet3::UnitTestNnetOptimizeWithOptions().

677  {
678  Timer timer;
679  const NnetComputation *ans = CompileInternal(in_request);
680  seconds_taken_total_ += timer.Elapsed();
681  return ans;
682 }
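
Once compiled, the computation is normally handed to an NnetComputer to be executed. The sketch below is a hedged illustration of that hand-off, not code from this class: the node names "input" and "output", the feature matrix 'feats', and the surrounding objects are assumptions made for the example.

// Assumes 'compiler', 'nnet', 'request' and a feature matrix 'feats' exist,
// and that the request declares nodes named "input" and "output".
const NnetComputation *computation = compiler.Compile(request);
NnetComputeOptions compute_opts;
NnetComputer computer(compute_opts, *computation, nnet, NULL /* no update */);
CuMatrix<BaseFloat> input(feats);          // AcceptInput() consumes its input
computer.AcceptInput("input", &input);
computer.Run();
const CuMatrixBase<BaseFloat> &output = computer.GetOutput("output");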
const NnetComputation * CompileAndCache ( const ComputationRequest & request )
private

Definition at line 700 of file nnet-optimize.cc.

References CachingOptimizingCompiler::CompileNoShortcut(), CachingOptimizingCompiler::CompileViaShortcut(), and CachingOptimizingCompiler::UpdateCache().

Referenced by CachingOptimizingCompiler::CompileInternal().

701  {
702  // we need to make a copy of ComputationRequest, because it's stored
703  // as the key in the cache, and we need to own the pointer.
704  ComputationRequest *request = new ComputationRequest(in_request);
705 
706  const NnetComputation *computation = CompileViaShortcut(*request);
707  if (computation == NULL)
708  computation = CompileNoShortcut(*request);
709  UpdateCache(request, computation);
710  return computation;
711 }
const NnetComputation * CompileInternal ( const ComputationRequest & request )
private

Definition at line 684 of file nnet-optimize.cc.

References CachingOptimizingCompiler::CompileAndCache(), CachingOptimizingCompiler::computation_cache_, and CachingOptimizingCompiler::UpdateAccessQueue().

Referenced by CachingOptimizingCompiler::Compile(), and CachingOptimizingCompiler::CompileViaShortcut().

685  {
686  const NnetComputation *ans;
687  // find computation in the cache
688  CacheType::iterator cit = computation_cache_.find(&in_request);
689  if (cit == computation_cache_.end()) {
690  ans = CompileAndCache(in_request);
691  } else {
692  // if found, update access queue
693  const NnetComputation *computation = cit->second.first;
694  UpdateAccessQueue(cit);
695  ans = computation;
696  }
697  return ans;
698 }
const NnetComputation * CompileNoShortcut ( const ComputationRequest & request )
private

Definition at line 714 of file nnet-optimize.cc.

References ComputationChecker::Check(), CheckComputationOptions::check_rewrite, NnetComputation::ComputeCudaIndexes(), Compiler::CreateComputation(), Timer::Elapsed(), kaldi::GetVerboseLevel(), KALDI_LOG, kaldi::nnet3::MaxOutputTimeInRequest(), CachingOptimizingCompiler::nnet_, kaldi::nnet2::NnetComputation(), CachingOptimizingCompiler::opt_config_, kaldi::nnet3::Optimize(), ComputationRequest::Print(), NnetComputation::Print(), CachingOptimizingCompiler::seconds_taken_check_, CachingOptimizingCompiler::seconds_taken_compile_, CachingOptimizingCompiler::seconds_taken_indexes_, and CachingOptimizingCompiler::seconds_taken_optimize_.

Referenced by CachingOptimizingCompiler::CompileAndCache().

715  {
716 
717  Compiler compiler(request, nnet_);
718  // note: 'opts' only contains 'output_debug_info', which is true by default.
719  // There may be situations where we'd prefer not to keep it, for speed.
720  CompilerOptions opts;
721  NnetComputation *computation = new NnetComputation;
722 
723  {
724  Timer timer;
725  compiler.CreateComputation(opts, computation);
726  seconds_taken_compile_ += timer.Elapsed();
727  }
728 
729  int32 verbose_cutoff = 4;
730  if (GetVerboseLevel() >= verbose_cutoff) {
731  std::ostringstream os1;
732  request.Print(os1);
733  KALDI_LOG << "Computation request is " << os1.str();
734  std::ostringstream os2;
735  computation->Print(os2, nnet_);
736  KALDI_LOG << "Generated computation is: " << os2.str();
737  }
738  { // some checking. Note: there may come a time when we might
739  // prefer to disable this checking.
740  Timer timer;
741  CheckComputationOptions check_config;
742  // we can do the rewrite check since it's before optimization.
743  check_config.check_rewrite = true;
744  ComputationChecker checker(check_config, nnet_, *computation);
745  checker.Check();
746  seconds_taken_check_ += timer.Elapsed();
747  }
748 
749  {
750  Timer timer;
751  Optimize(opt_config_, nnet_,
752  MaxOutputTimeInRequest(request),
753  computation);
754  seconds_taken_optimize_ += timer.Elapsed();
755  }
756 
757 
758  if (GetVerboseLevel() >= verbose_cutoff) {
759  std::ostringstream os;
760  computation->Print(os, nnet_);
761  KALDI_LOG << "Optimized computation is: " << os.str();
762  }
763  { // check the computation again.
764  Timer timer;
765  CheckComputationOptions check_config;
766  ComputationChecker checker(check_config, nnet_, *computation);
767  checker.Check();
768  seconds_taken_check_ += timer.Elapsed();
769  }
770  {
771  Timer timer;
772  computation->ComputeCudaIndexes();
773  seconds_taken_indexes_ += timer.Elapsed();
774  }
775  return computation;
776 }
const NnetComputation * CompileViaShortcut ( const ComputationRequest & request )
private

Definition at line 779 of file nnet-optimize.cc.

References CachingOptimizingCompiler::CompileInternal(), NnetComputation::ComputeCudaIndexes(), CachingOptimizingCompiler::config_, Timer::Elapsed(), kaldi::nnet3::ExpandComputation(), ComputationRequest::misc_info, CachingOptimizingCompiler::nnet_, kaldi::nnet2::NnetComputation(), kaldi::nnet3::RequestIsDecomposable(), CachingOptimizingCompiler::seconds_taken_expand_, CachingOptimizingCompiler::seconds_taken_indexes_, and CachingOptimizingCompilerOptions::use_shortcut.

Referenced by CachingOptimizingCompiler::CompileAndCache().

780  {
781  if (!config_.use_shortcut)
782  return NULL;
783 
784  int32 num_n_values;
785  ComputationRequest mini_request;
786  if (!RequestIsDecomposable(request, &mini_request, &num_n_values))
787  return NULL;
788 
789  // By invoking CompileInternal() on the mini request, we go through the same
790  // caching process as for any externally requested computation. [the only
791  // difference from Compile() is that it doesn't call the timer code; this
792  // avoids double-counting the time taken.] This pointer will not have to be
793  // deleted by this function; it's owned by the class, in the cache.
794  const NnetComputation *mini_computation = CompileInternal(mini_request);
795 
796  // note: by default we always create debug_info, even in regular compilation.
797  // (e.g. it defaults to true in CompilerOptions). If it really seems to be a
798  // significant overhead, we can revisit this at some point in future.
799  bool need_debug_info = true;
800 
801 
802  NnetComputation *ans = new NnetComputation();
803 
804  {
805  Timer timer;
806  ExpandComputation(nnet_, request.misc_info, *mini_computation,
807  need_debug_info, num_n_values, ans);
808  seconds_taken_expand_ += timer.Elapsed();
809  }
810  {
811  Timer timer;
812  ans->ComputeCudaIndexes();
813  seconds_taken_indexes_ += timer.Elapsed();
814  }
815  return ans;
816 }
void ReadCache ( std::istream &  is,
bool  binary 
)

Definition at line 599 of file nnet-optimize.cc.

References CachingOptimizingCompiler::access_queue_, ComputationChecker::Check(), CachingOptimizingCompiler::computation_cache_, Timer::Elapsed(), kaldi::nnet3::ExpectToken(), kaldi::GetVerboseLevel(), KALDI_ASSERT, CachingOptimizingCompiler::nnet_, kaldi::nnet2::NnetComputation(), CachingOptimizingCompiler::opt_config_, NnetOptimizeOptions::Read(), ComputationRequest::Read(), NnetComputation::Read(), kaldi::ReadBasicType(), CachingOptimizingCompiler::seconds_taken_check_, and CachingOptimizingCompiler::UpdateCache().

Referenced by NnetChainTrainer::NnetChainTrainer(), NnetDiscriminativeTrainer::NnetDiscriminativeTrainer(), and NnetTrainer::NnetTrainer().

599  {
600  NnetOptimizeOptions opt_config_cached;
601  opt_config_cached.Read(is, binary);
602  // we won't read cached computations if any optimize option has been changed.
603  bool read_cache = (opt_config_ == opt_config_cached);
604 
605  if (read_cache) {
606  int32 computation_cache_size;
607  ExpectToken(is, binary, "<ComputationCacheSize>");
608  ReadBasicType(is, binary, &computation_cache_size);
609  KALDI_ASSERT(computation_cache_size >= 0);
610  computation_cache_.clear();
611  access_queue_.clear();
612  ExpectToken(is, binary, "<ComputationCache>");
613  for (size_t c = 0; c < computation_cache_size; c++) {
614  ComputationRequest *request = new ComputationRequest();
615  request->Read(is, binary);
616  NnetComputation *computation = new NnetComputation();
617  computation->Read(is, binary);
618  if (GetVerboseLevel() >= 3) {
619  Timer timer;
620  CheckComputationOptions check_config;
621  ComputationChecker checker(check_config, nnet_, *computation);
622  checker.Check();
623  seconds_taken_check_ += timer.Elapsed();
624  }
625  UpdateCache(request, computation);
626  }
627  }
628 }
void UpdateAccessQueue ( CacheType::iterator &  cit)
private

Definition at line 642 of file nnet-optimize.cc.

References CachingOptimizingCompiler::access_queue_, CachingOptimizingCompiler::computation_cache_, and KALDI_ASSERT.

Referenced by CachingOptimizingCompiler::CompileInternal().

642  {
643  // exist, update access record by moving the accessed
644  // request to the end of the access queue
645  KALDI_ASSERT(cit != computation_cache_.end());
646  access_queue_.splice(access_queue_.end(), access_queue_,
647  cit->second.second);
648 }
void UpdateCache ( const ComputationRequest *  request,
const NnetComputation *  computation 
)
private

Definition at line 579 of file nnet-optimize.cc.

References CachingOptimizingCompiler::access_queue_, CachingOptimizingCompilerOptions::cache_capacity, CachingOptimizingCompiler::computation_cache_, CachingOptimizingCompiler::config_, and KALDI_ASSERT.

Referenced by CachingOptimizingCompiler::CompileAndCache(), and CachingOptimizingCompiler::ReadCache().

580  {
581  if (computation_cache_.size() == config_.cache_capacity) {
582  // full, locate the least-recently-accessed request
583  const CacheType::iterator it =
584  computation_cache_.find(access_queue_.front());
585  KALDI_ASSERT(it != computation_cache_.end());
586  // purge the least-recently-accessed request
587  const ComputationRequest *r = it->first;
588  const NnetComputation *c = it->second.first;
589  computation_cache_.erase(it);
590  delete r;
591  delete c;
592  access_queue_.pop_front();
593  }
594  AqType::iterator ait = access_queue_.insert(access_queue_.end(), request);
595  computation_cache_.insert(std::make_pair(request,
596  std::make_pair(computation, ait)));
597 }
void WriteCache ( std::ostream &  os,
bool  binary 
) const

Definition at line 630 of file nnet-optimize.cc.

References CachingOptimizingCompiler::computation_cache_, CachingOptimizingCompiler::opt_config_, NnetOptimizeOptions::Write(), kaldi::WriteBasicType(), and kaldi::WriteToken().

Referenced by NnetChainTrainer::~NnetChainTrainer(), NnetDiscriminativeTrainer::~NnetDiscriminativeTrainer(), and NnetTrainer::~NnetTrainer().

630  {
631  opt_config_.Write(os, binary);
632  WriteToken(os, binary, "<ComputationCacheSize>");
633  WriteBasicType(os, binary, static_cast<int32>(computation_cache_.size()));
634  WriteToken(os, binary, "<ComputationCache>");
635  for (CacheType::const_iterator iter = computation_cache_.begin();
636  iter != computation_cache_.end(); ++iter) {
637  iter->first->Write(os, binary);
638  iter->second.first->Write(os, binary);
639  }
640 }
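
As the Referenced by lists show, the nnet3 trainers call WriteCache() in their destructors and ReadCache() in their constructors to carry the compilation cache across program runs. A hedged sketch of that pattern follows; the filenames are hypothetical and the Input/Output wrappers are Kaldi's usual stream helpers.

// Illustrative only; 'compiler' is a CachingOptimizingCompiler and the
// cache filenames are hypothetical.
{
  bool binary_in;
  kaldi::Input ki(cache_rxfilename, &binary_in);
  compiler.ReadCache(ki.Stream(), binary_in);      // restore at startup
}
// ... compile and run computations as usual ...
{
  bool binary_write = true;
  kaldi::Output ko(cache_wxfilename, binary_write);
  compiler.WriteCache(ko.Stream(), binary_write);  // persist at shutdown
}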

Member Data Documentation

double seconds_taken_compile_
private
double seconds_taken_expand_
private
double seconds_taken_optimize_
private
double seconds_taken_total_
private

The documentation for this class was generated from the following files:

nnet-optimize.h
nnet-optimize.cc