CachingOptimizingCompiler Class Reference

This class enables you to do the compilation and optimization in one call, and also ensures that if the ComputationRequest is identical to the previous one, the compilation process is not repeated.

#include <nnet-optimize.h>

Public Member Functions

 CachingOptimizingCompiler (const Nnet &nnet, const CachingOptimizingCompilerOptions config=CachingOptimizingCompilerOptions())
 
 CachingOptimizingCompiler (const Nnet &nnet, const NnetOptimizeOptions &opt_config, const CachingOptimizingCompilerOptions config=CachingOptimizingCompilerOptions())
 Note: nnet is retained as a const reference but opt_config is copied.
 
 ~CachingOptimizingCompiler ()
 
const NnetComputation * Compile (const ComputationRequest &request)
 Does the compilation and returns a const pointer to the result, which is owned by this class, not the caller.
 
void ReadCache (std::istream &is, bool binary)
 
void WriteCache (std::ostream &os, bool binary) const
 

Private Types

typedef std::list< const ComputationRequest * > AqType
 
typedef unordered_map< const ComputationRequest *, std::pair< const NnetComputation *, AqType::iterator >, ComputationRequestHasher, ComputationRequestPtrEqual > CacheType
 

Private Member Functions

const NnetComputation * CompileInternal (const ComputationRequest &request)
 
const NnetComputation * CompileAndCache (const ComputationRequest &request)
 
const NnetComputation * CompileViaShortcut (const ComputationRequest &request)
 
const NnetComputation * CompileNoShortcut (const ComputationRequest &request)
 
void UpdateCache (const ComputationRequest *request, const NnetComputation *computation)
 
void UpdateAccessQueue (CacheType::iterator &cit)
 

Private Attributes

const Nnet & nnet_
 
CachingOptimizingCompilerOptions config_
 
NnetOptimizeOptions opt_config_
 
AqType access_queue_
 
CacheType computation_cache_
 
double seconds_taken_total_
 
double seconds_taken_compile_
 
double seconds_taken_optimize_
 
double seconds_taken_expand_
 
double seconds_taken_check_
 
double seconds_taken_indexes_
 

Detailed Description

This class enables you to do the compilation and optimization in one call, and also ensures that if the ComputationRequest is identical to the previous one, the compilation process is not repeated.

Definition at line 210 of file nnet-optimize.h.

Member Typedef Documentation

typedef std::list<const ComputationRequest*> AqType
private

Definition at line 273 of file nnet-optimize.h.

typedef unordered_map<const ComputationRequest*, std::pair<const NnetComputation*, AqType::iterator>, ComputationRequestHasher, ComputationRequestPtrEqual> CacheType
private

Definition at line 282 of file nnet-optimize.h.
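
The two typedefs work together: the map is keyed by pointer, but the hasher and equality functors look at the pointed-to request, so a lookup with the address of any equal request finds the stored entry. The following is a minimal, self-contained sketch of that pattern. ToyRequest, ToyRequestHasher, ToyRequestPtrEqual, ToyAqType, ToyCacheType and Contains() are hypothetical stand-ins invented for illustration, not Kaldi types, and the cached value is a plain string rather than a computation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for ComputationRequest: a value type with operator==.
struct ToyRequest {
  int num_frames;
  bool need_deriv;
  bool operator==(const ToyRequest &other) const {
    return num_frames == other.num_frames && need_deriv == other.need_deriv;
  }
};

// Hashes the pointed-to value, not the pointer address.
struct ToyRequestHasher {
  std::size_t operator()(const ToyRequest *r) const {
    return std::hash<int>()(r->num_frames) * 2 + (r->need_deriv ? 1 : 0);
  }
};

// Compares the pointed-to values, so two distinct objects with equal
// contents count as the same key.
struct ToyRequestPtrEqual {
  bool operator()(const ToyRequest *a, const ToyRequest *b) const {
    return *a == *b;
  }
};

// Access queue of keys, plus a map from key to (value, position in queue).
typedef std::list<const ToyRequest*> ToyAqType;
typedef std::unordered_map<const ToyRequest*,
                           std::pair<std::string, ToyAqType::iterator>,
                           ToyRequestHasher, ToyRequestPtrEqual> ToyCacheType;

// Returns true if a request equal in value to 'probe' is stored in 'cache'.
// The probe may live on the stack; only its contents matter for the lookup.
bool Contains(const ToyCacheType &cache, const ToyRequest &probe) {
  return cache.find(&probe) != cache.end();
}
```

Storing the queue iterator next to the value is what lets an access-queue update run without searching the list.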

Constructor & Destructor Documentation

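CachingOptimizingCompiler ( const Nnet & nnet, const NnetOptimizeOptions & opt_config, const CachingOptimizingCompilerOptions config = CachingOptimizingCompilerOptions() )

Note: nnet is retained as a const reference but opt_config is copied.

Definition at line 614 of file nnet-optimize.cc.

~CachingOptimizingCompiler ( )

Definition at line 694 of file nnet-optimize.cc.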

References CachingOptimizingCompiler::computation_cache_, KALDI_LOG, CachingOptimizingCompiler::seconds_taken_check_, CachingOptimizingCompiler::seconds_taken_compile_, CachingOptimizingCompiler::seconds_taken_expand_, CachingOptimizingCompiler::seconds_taken_indexes_, CachingOptimizingCompiler::seconds_taken_optimize_, and CachingOptimizingCompiler::seconds_taken_total_.

CachingOptimizingCompiler::~CachingOptimizingCompiler() {
  CacheType::const_iterator itr = computation_cache_.begin(),
      end = computation_cache_.end();
  for (; itr != end; ++itr) {
    delete itr->first;
    delete itr->second.first;
  }
  if (seconds_taken_total_ > 0.0) {
    std::ostringstream os;
    double seconds_taken_misc = seconds_taken_total_ - seconds_taken_compile_
        - seconds_taken_optimize_ - seconds_taken_expand_
        - seconds_taken_check_ - seconds_taken_indexes_;
    os << std::setprecision(3) << seconds_taken_total_
       << " seconds taken in nnet3 compilation total (breakdown: "
       << seconds_taken_compile_ << " compilation, "
       << seconds_taken_optimize_ << " optimization, "
       << seconds_taken_expand_ << " shortcut expansion, "
       << seconds_taken_check_ << " checking, "
       << seconds_taken_indexes_ << " computing indexes, "
       << seconds_taken_misc << " misc.)";
    KALDI_LOG << os.str();
    // note: the leftover amount is misc things like hashing and == comparisons on
    // computation-requests, and calling RequestIsDecomposable().
  }
}

Member Function Documentation

const NnetComputation * Compile ( const ComputationRequest & request )

Does the compilation and returns a const pointer to the result, which is owned by this class, not the caller.

It calls ComputeCudaIndexes() for you, because you wouldn't be able to do this on a const object.

Definition at line 720 of file nnet-optimize.cc.

References CachingOptimizingCompiler::CompileInternal(), Timer::Elapsed(), and CachingOptimizingCompiler::seconds_taken_total_.

Referenced by NnetLdaStatsAccumulator::AccStats(), NnetComputerFromEg::Compute(), NnetDiscriminativeComputeObjf::Compute(), NnetChainComputeProb::Compute(), NnetComputeProb::Compute(), DecodableNnetSimple::DoNnetComputation(), kaldi::nnet3::RunNnetComputation(), NnetChainTrainer::Train(), NnetDiscriminativeTrainer::Train(), NnetTrainer::Train(), kaldi::nnet3::UnitTestNnetModelDerivatives(), and kaldi::nnet3::UnitTestNnetOptimizeWithOptions().

const NnetComputation* CachingOptimizingCompiler::Compile(
    const ComputationRequest &in_request) {
  Timer timer;
  const NnetComputation *ans = CompileInternal(in_request);
  seconds_taken_total_ += timer.Elapsed();
  return ans;
}
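
Compile() is a thin timed wrapper around CompileInternal(). Because the internal function may be re-entered for a smaller request, keeping the clock in the public entry point only means the total is not counted twice. The sketch below illustrates that split with std::chrono instead of Kaldi's Timer. ToyTimedCompiler and all of its members are hypothetical names, and the "compilation" is a placeholder integer computation.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical toy class: only the public entry point touches the clock.
class ToyTimedCompiler {
 public:
  ToyTimedCompiler()
      : seconds_total_(0.0), timed_calls_(0), internal_calls_(0) {}

  // Public entry point: starts the clock, delegates, accumulates the time.
  int Compile(int size) {
    std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    int ans = CompileInternal(size);
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    seconds_total_ += elapsed.count();
    ++timed_calls_;
    return ans;
  }

  double seconds_total() const { return seconds_total_; }
  int timed_calls() const { return timed_calls_; }
  int internal_calls() const { return internal_calls_; }

 private:
  // Untimed worker. For size > 2 it first solves a smaller "mini" problem
  // by calling itself, then scales the result up (placeholder arithmetic).
  int CompileInternal(int size) {
    ++internal_calls_;
    if (size > 2)
      return CompileInternal(2) * (size / 2);
    return size;
  }

  double seconds_total_;
  int timed_calls_;
  int internal_calls_;
};
```

A single call to Compile(8) in this sketch enters CompileInternal() twice but adds to the total only once.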
const NnetComputation * CompileAndCache ( const ComputationRequest & request )
private

Definition at line 744 of file nnet-optimize.cc.

References CachingOptimizingCompiler::CompileNoShortcut(), CachingOptimizingCompiler::CompileViaShortcut(), and CachingOptimizingCompiler::UpdateCache().

Referenced by CachingOptimizingCompiler::CompileInternal().

const NnetComputation* CachingOptimizingCompiler::CompileAndCache(
    const ComputationRequest &in_request) {
  // we need to make a copy of ComputationRequest, because it's stored
  // as the key in the cache, and we need to own the pointer.
  ComputationRequest *request = new ComputationRequest(in_request);

  const NnetComputation *computation = CompileViaShortcut(*request);
  if (computation == NULL)
    computation = CompileNoShortcut(*request);
  UpdateCache(request, computation);
  return computation;
}
const NnetComputation * CompileInternal ( const ComputationRequest & request )
private

Definition at line 728 of file nnet-optimize.cc.

References CachingOptimizingCompiler::CompileAndCache(), CachingOptimizingCompiler::computation_cache_, and CachingOptimizingCompiler::UpdateAccessQueue().

Referenced by CachingOptimizingCompiler::Compile(), and CachingOptimizingCompiler::CompileViaShortcut().

const NnetComputation* CachingOptimizingCompiler::CompileInternal(
    const ComputationRequest &in_request) {
  const NnetComputation *ans;
  // find computation in the cache
  CacheType::iterator cit = computation_cache_.find(&in_request);
  if (cit == computation_cache_.end()) {
    ans = CompileAndCache(in_request);
  } else {
    // if found, update access queue
    const NnetComputation *computation = cit->second.first;
    UpdateAccessQueue(cit);
    ans = computation;
  }
  return ans;
}
const NnetComputation * CompileNoShortcut ( const ComputationRequest & request )
private

Definition at line 758 of file nnet-optimize.cc.

References ComputationChecker::Check(), CheckComputationOptions::check_rewrite, NnetComputation::ComputeCudaIndexes(), Compiler::CreateComputation(), Timer::Elapsed(), kaldi::GetVerboseLevel(), KALDI_LOG, kaldi::nnet3::MaxOutputTimeInRequest(), CachingOptimizingCompiler::nnet_, kaldi::nnet2::NnetComputation(), CachingOptimizingCompiler::opt_config_, kaldi::nnet3::Optimize(), ComputationRequest::Print(), NnetComputation::Print(), CachingOptimizingCompiler::seconds_taken_check_, CachingOptimizingCompiler::seconds_taken_compile_, CachingOptimizingCompiler::seconds_taken_indexes_, and CachingOptimizingCompiler::seconds_taken_optimize_.

Referenced by CachingOptimizingCompiler::CompileAndCache().

const NnetComputation* CachingOptimizingCompiler::CompileNoShortcut(
    const ComputationRequest &request) {

  Compiler compiler(request, nnet_);
  // note: 'opts' only contains 'output_debug_info', which is true by default.
  // There may be situations where we'd prefer not to keep it, for speed.
  CompilerOptions opts;
  NnetComputation *computation = new NnetComputation;

  {
    Timer timer;
    compiler.CreateComputation(opts, computation);
    seconds_taken_compile_ += timer.Elapsed();
  }

  int32 verbose_cutoff = 4;
  if (GetVerboseLevel() >= verbose_cutoff) {
    std::ostringstream os1;
    request.Print(os1);
    KALDI_LOG << "Computation request is " << os1.str();
    std::ostringstream os2;
    computation->Print(os2, nnet_);
    KALDI_LOG << "Generated computation is: " << os2.str();
  }

  { // some checking.  Note: there may come a time when we might
    // prefer to disable this checking.
    Timer timer;
    CheckComputationOptions check_config;
    // we can do the rewrite check since it's before optimization.
    check_config.check_rewrite = true;
    ComputationChecker checker(check_config, nnet_, *computation);
    checker.Check();
    seconds_taken_check_ += timer.Elapsed();
  }

  {
    Timer timer;
    Optimize(opt_config_, nnet_,
             MaxOutputTimeInRequest(request),
             computation);
    seconds_taken_optimize_ += timer.Elapsed();
  }


  if (GetVerboseLevel() >= verbose_cutoff) {
    std::ostringstream os;
    computation->Print(os, nnet_);
    KALDI_LOG << "Optimized computation is: " << os.str();
  }
  { // check the computation again.
    Timer timer;
    CheckComputationOptions check_config;
    ComputationChecker checker(check_config, nnet_, *computation);
    checker.Check();
    seconds_taken_check_ += timer.Elapsed();
  }
  {
    Timer timer;
    computation->ComputeCudaIndexes();
    seconds_taken_indexes_ += timer.Elapsed();
  }
  return computation;
}
const NnetComputation * CompileViaShortcut ( const ComputationRequest & request )
private

Definition at line 824 of file nnet-optimize.cc.

References kaldi::nnet3::CheckComputation(), CachingOptimizingCompiler::CompileInternal(), NnetComputation::ComputeCudaIndexes(), CachingOptimizingCompiler::config_, Timer::Elapsed(), kaldi::nnet3::ExpandComputation(), kaldi::GetVerboseLevel(), ComputationRequest::misc_info, CachingOptimizingCompiler::nnet_, kaldi::nnet2::NnetComputation(), kaldi::nnet3::RequestIsDecomposable(), CachingOptimizingCompiler::seconds_taken_expand_, CachingOptimizingCompiler::seconds_taken_indexes_, and CachingOptimizingCompilerOptions::use_shortcut.

Referenced by CachingOptimizingCompiler::CompileAndCache().

const NnetComputation* CachingOptimizingCompiler::CompileViaShortcut(
    const ComputationRequest &request) {
  if (!config_.use_shortcut)
    return NULL;

  int32 num_n_values;
  ComputationRequest mini_request;
  if (!RequestIsDecomposable(request, &mini_request, &num_n_values))
    return NULL;

  // By invoking CompileInternal() on the mini request, we go through the same
  // caching process as for any externally requested computation.  [the only
  // difference from Compile() is that it doesn't call the timer code; this
  // avoids double-counting the time taken.]  This pointer will not have to be
  // deleted by this function; it's owned by the class, in the cache.
  const NnetComputation *mini_computation = CompileInternal(mini_request);

  // note: by default we always create debug_info, even in regular compilation.
  // (e.g. it defaults to true in CompilerOptions).  If it really seems to be a
  // significant overhead, we can revisit this at some point in future.
  bool need_debug_info = true;


  NnetComputation *ans = new NnetComputation();

  {
    Timer timer;
    ExpandComputation(nnet_, request.misc_info, *mini_computation,
                      need_debug_info, num_n_values, ans);
    seconds_taken_expand_ += timer.Elapsed();
  }
  if (GetVerboseLevel() >= 3) {
    CheckComputation(nnet_, *ans, false);
  }

  {
    Timer timer;
    ans->ComputeCudaIndexes();
    seconds_taken_indexes_ += timer.Elapsed();
  }
  return ans;
}
void ReadCache ( std::istream & is, bool binary )

Definition at line 643 of file nnet-optimize.cc.

References CachingOptimizingCompiler::access_queue_, ComputationChecker::Check(), CachingOptimizingCompiler::computation_cache_, Timer::Elapsed(), kaldi::nnet3::ExpectToken(), kaldi::GetVerboseLevel(), KALDI_ASSERT, CachingOptimizingCompiler::nnet_, kaldi::nnet2::NnetComputation(), CachingOptimizingCompiler::opt_config_, NnetOptimizeOptions::Read(), ComputationRequest::Read(), NnetComputation::Read(), kaldi::ReadBasicType(), CachingOptimizingCompiler::seconds_taken_check_, and CachingOptimizingCompiler::UpdateCache().

Referenced by NnetChainTrainer::NnetChainTrainer(), NnetDiscriminativeTrainer::NnetDiscriminativeTrainer(), and NnetTrainer::NnetTrainer().

void CachingOptimizingCompiler::ReadCache(std::istream &is, bool binary) {
  NnetOptimizeOptions opt_config_cached;
  opt_config_cached.Read(is, binary);
  // we won't read cached computations if any optimize option has been changed.
  bool read_cache = (opt_config_ == opt_config_cached);

  if (read_cache) {
    int32 computation_cache_size;
    ExpectToken(is, binary, "<ComputationCacheSize>");
    ReadBasicType(is, binary, &computation_cache_size);
    KALDI_ASSERT(computation_cache_size >= 0);
    computation_cache_.clear();
    access_queue_.clear();
    ExpectToken(is, binary, "<ComputationCache>");
    for (size_t c = 0; c < computation_cache_size; c++) {
      ComputationRequest *request = new ComputationRequest();
      request->Read(is, binary);
      NnetComputation *computation = new NnetComputation();
      computation->Read(is, binary);
      if (GetVerboseLevel() >= 3) {
        Timer timer;
        CheckComputationOptions check_config;
        ComputationChecker checker(check_config, nnet_, *computation);
        checker.Check();
        seconds_taken_check_ += timer.Elapsed();
      }
      UpdateCache(request, computation);
    }
  }
}
void UpdateAccessQueue ( CacheType::iterator &  cit)
private

Definition at line 686 of file nnet-optimize.cc.

References CachingOptimizingCompiler::access_queue_, CachingOptimizingCompiler::computation_cache_, and KALDI_ASSERT.

Referenced by CachingOptimizingCompiler::CompileInternal().

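void CachingOptimizingCompiler::UpdateAccessQueue(CacheType::iterator &cit) {
  // exist, update access record by moving the accessed
  // request to the end of the access queue
  KALDI_ASSERT(cit != computation_cache_.end());
  access_queue_.splice(access_queue_.end(), access_queue_,
                       cit->second.second);
}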
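
Moving an element to the end of a std::list can be done with std::list::splice, which relinks the node in constant time and leaves iterators to it valid. That property is what makes it safe to keep list iterators inside the cache entries. A minimal sketch on a list of integers follows; MoveToBack() and Snapshot() are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Moves the element at 'pos' to the back of 'queue' in O(1).
// The node is relinked rather than copied, so 'pos' remains valid
// and still refers to the same element afterwards.
void MoveToBack(std::list<int> *queue, std::list<int>::iterator pos) {
  queue->splice(queue->end(), *queue, pos);
}

// Copies the queue into a vector for easy inspection.
std::vector<int> Snapshot(const std::list<int> &queue) {
  return std::vector<int>(queue.begin(), queue.end());
}
```

With the queue ordered from least to most recently accessed, the front element is always the next eviction candidate.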
void UpdateCache ( const ComputationRequest * request, const NnetComputation * computation )
private

Definition at line 623 of file nnet-optimize.cc.

References CachingOptimizingCompiler::access_queue_, CachingOptimizingCompilerOptions::cache_capacity, CachingOptimizingCompiler::computation_cache_, CachingOptimizingCompiler::config_, and KALDI_ASSERT.

Referenced by CachingOptimizingCompiler::CompileAndCache(), and CachingOptimizingCompiler::ReadCache().

void CachingOptimizingCompiler::UpdateCache(const ComputationRequest *request,
                                            const NnetComputation *computation) {
  if (computation_cache_.size() == config_.cache_capacity) {
    // full, locate the least-recently-accessed request
    const CacheType::iterator it =
        computation_cache_.find(access_queue_.front());
    KALDI_ASSERT(it != computation_cache_.end());
    // purge the least-recently-accessed request
    const ComputationRequest *r = it->first;
    const NnetComputation *c = it->second.first;
    computation_cache_.erase(it);
    delete r;
    delete c;
    access_queue_.pop_front();
  }
  AqType::iterator ait = access_queue_.insert(access_queue_.end(), request);
  computation_cache_.insert(std::make_pair(request,
                                           std::make_pair(computation, ait)));
}
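
The eviction policy here is least-recently-used: when the cache is at capacity, the entry at the front of the access queue is purged before the new one is appended. The sketch below reproduces that policy in a self-contained form with int keys and string values, so no manual deletion is needed. ToyLruCache and its methods are hypothetical names, and the sketch assumes a capacity of at least 1 and that Insert() is never called with a key already present.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical toy LRU cache: a queue of keys ordered from least to most
// recently used, plus a map from key to (value, position in the queue).
class ToyLruCache {
 private:
  typedef std::list<int> Queue;
  typedef std::unordered_map<int, std::pair<std::string, Queue::iterator> > Map;

 public:
  explicit ToyLruCache(std::size_t capacity) : capacity_(capacity) {
    assert(capacity_ >= 1);  // eviction below needs a non-empty queue
  }

  // Returns a pointer to the cached value, or nullptr on a miss.
  // A hit moves the key to the back of the access queue.
  const std::string *Find(int key) {
    Map::iterator it = map_.find(key);
    if (it == map_.end()) return nullptr;
    queue_.splice(queue_.end(), queue_, it->second.second);
    return &it->second.first;
  }

  // Inserts a new entry, first evicting the least-recently-used one if the
  // cache is full. Assumes 'key' is not already present.
  void Insert(int key, const std::string &value) {
    if (map_.size() == capacity_) {
      map_.erase(queue_.front());
      queue_.pop_front();
    }
    Queue::iterator qit = queue_.insert(queue_.end(), key);
    map_.insert(std::make_pair(key, std::make_pair(value, qit)));
  }

  std::size_t Size() const { return map_.size(); }

 private:
  std::size_t capacity_;
  Queue queue_;
  Map map_;
};
```

In the class documented here the keys and values are owned raw pointers, which is why the purge step there also deletes both objects.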
void WriteCache ( std::ostream & os, bool binary ) const

Definition at line 674 of file nnet-optimize.cc.

References CachingOptimizingCompiler::computation_cache_, CachingOptimizingCompiler::opt_config_, NnetOptimizeOptions::Write(), kaldi::WriteBasicType(), and kaldi::WriteToken().

Referenced by NnetChainTrainer::~NnetChainTrainer(), NnetDiscriminativeTrainer::~NnetDiscriminativeTrainer(), and NnetTrainer::~NnetTrainer().

void CachingOptimizingCompiler::WriteCache(std::ostream &os, bool binary) const {
  opt_config_.Write(os, binary);
  WriteToken(os, binary, "<ComputationCacheSize>");
  WriteBasicType(os, binary, static_cast<int32>(computation_cache_.size()));
  WriteToken(os, binary, "<ComputationCache>");
  for (CacheType::const_iterator iter = computation_cache_.begin();
       iter != computation_cache_.end(); ++iter) {
    iter->first->Write(os, binary);
    iter->second.first->Write(os, binary);
  }
}
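
WriteCache() and ReadCache() share a simple layout: the optimization options first, then a size token and count, then a token followed by the request/computation pairs. On reading, the cached entries are used only if the stored options equal the current ones. The following sketch mimics that layout in text mode with standard streams. WriteToyCache(), ReadToyCache() and the token names are hypothetical, the "options" are reduced to a single int, and values are assumed to contain no whitespace.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Writes: config, "<ToyCacheSize>", entry count, "<ToyCache>", then the
// key/value pairs, all separated by spaces.
void WriteToyCache(std::ostream &os, int config,
                   const std::map<int, std::string> &cache) {
  os << config << " <ToyCacheSize> " << cache.size() << " <ToyCache> ";
  for (std::map<int, std::string>::const_iterator it = cache.begin();
       it != cache.end(); ++it)
    os << it->first << ' ' << it->second << ' ';
}

// Reads a cache written by WriteToyCache(). Returns false and leaves
// '*cache' untouched if the stored config differs from 'current_config'
// or if the stream does not match the expected layout.
bool ReadToyCache(std::istream &is, int current_config,
                  std::map<int, std::string> *cache) {
  int stored_config;
  if (!(is >> stored_config) || stored_config != current_config)
    return false;  // options changed: ignore the cached entries
  std::string token;
  std::size_t size;
  if (!(is >> token) || token != "<ToyCacheSize>") return false;
  if (!(is >> size)) return false;
  if (!(is >> token) || token != "<ToyCache>") return false;
  std::map<int, std::string> loaded;
  for (std::size_t i = 0; i < size; i++) {
    int key;
    std::string value;
    if (!(is >> key >> value)) return false;
    loaded[key] = value;
  }
  cache->swap(loaded);
  return true;
}
```

Writing the options ahead of the entries lets the reader decide whether the cache is usable before parsing any of it.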

Member Data Documentation

double seconds_taken_compile_
private
double seconds_taken_expand_
private
double seconds_taken_optimize_
private
double seconds_taken_total_
private

The documentation for this class was generated from the following files:

nnet-optimize.h
nnet-optimize.cc