LstmNonlinearityComponent Class Reference

#include <nnet-combined-component.h>


Public Member Functions

virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
 LstmNonlinearityComponent ()
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update_in, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual void Scale (BaseFloat scale)
 When called on an UpdatableComponent, this virtual function scales the parameters by "scale". More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 When called on an UpdatableComponent, this virtual function adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void PerturbParams (BaseFloat stddev)
 This function is to be used in testing. More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Computes dot-product between parameters of two instances of a Component. More...
 
virtual int32 NumParameters () const
 Returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual void FreezeNaturalGradient (bool freeze)
 Freezes or unfreezes the natural-gradient update by passing the flag to the preconditioner. More...
 
 LstmNonlinearityComponent (const LstmNonlinearityComponent &other)
 
void Init (int32 cell_dim, bool use_dropout, BaseFloat param_stddev, BaseFloat tanh_self_repair_threshold, BaseFloat sigmoid_self_repair_threshold, BaseFloat self_repair_scale)
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation. More...
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
virtual void SetUnderlyingLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent; it gets multiplied by learning_rate_factor_. More...
 
virtual void SetActualLearningRate (BaseFloat lrate)
 Sets the learning rate directly, bypassing learning_rate_factor_. More...
 
virtual void SetAsGradient ()
 Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...
 
virtual BaseFloat LearningRateFactor ()
 
virtual void SetLearningRateFactor (BaseFloat lrate_factor)
 
void SetUpdatableConfigs (const UpdatableComponent &other)
 
BaseFloat LearningRate () const
 Gets the learning rate to be used in gradient descent. More...
 
BaseFloat MaxChange () const
 Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More...
 
void SetMaxChange (BaseFloat max_change)
 
BaseFloat L2Regularization () const
 Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More...
 
void SetL2Regularization (BaseFloat a)
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

void InitNaturalGradient ()
 
const LstmNonlinearityComponent & operator= (const LstmNonlinearityComponent &other)
 

Private Attributes

CuMatrix< BaseFloat > params_
 
bool use_dropout_
 
CuMatrix< double > value_sum_
 
CuMatrix< double > deriv_sum_
 
CuVector< BaseFloat > self_repair_config_
 
CuVector< double > self_repair_total_
 
double count_
 
OnlineNaturalGradient preconditioner_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from UpdatableComponent
void InitLearningRatesFromConfig (ConfigLine *cfl)
 
std::string ReadUpdatableCommon (std::istream &is, bool binary)
 
void WriteUpdatableCommon (std::ostream &is, bool binary) const
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (typically 0.0..0.01) More...
 
BaseFloat learning_rate_factor_
 learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More...
 
BaseFloat l2_regularize_
 L2 regularization constant. More...
 
bool is_gradient_
 True if this component is to be treated as a gradient rather than as parameters. More...
 
BaseFloat max_change_
 configuration value for imposing max-change More...
 

Detailed Description

Definition at line 335 of file nnet-combined-component.h.

Constructor & Destructor Documentation

◆ LstmNonlinearityComponent() [1/2]

◆ LstmNonlinearityComponent() [2/2]

Definition at line 1230 of file nnet-combined-component.cc.

1230  LstmNonlinearityComponent::LstmNonlinearityComponent(
1231  const LstmNonlinearityComponent &other):
1232  UpdatableComponent(other),
1233  params_(other.params_),
1234  use_dropout_(other.use_dropout_),
1235  value_sum_(other.value_sum_),
1236  deriv_sum_(other.deriv_sum_),
1237  self_repair_config_(other.self_repair_config_),
1238  self_repair_total_(other.self_repair_total_),
1239  count_(other.count_),
1240  preconditioner_(other.preconditioner_) { }

Member Function Documentation

◆ Add()

void Add ( BaseFloat  alpha,
const Component &  other 
)
virtual

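When called on an UpdatableComponent, this virtual function adds the parameters of another updatable component, times some constant, to the current parameters.

When called on a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it relates to adding stats. Otherwise it will normally do nothing. As the listing shows, this implementation adds both the parameters and the accumulated statistics (value_sum_, deriv_sum_, self_repair_total_ and count_) of the other component, each scaled by alpha.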

Reimplemented from Component.

Definition at line 1128 of file nnet-combined-component.cc.

References LstmNonlinearityComponent::count_, LstmNonlinearityComponent::deriv_sum_, KALDI_ASSERT, LstmNonlinearityComponent::params_, LstmNonlinearityComponent::self_repair_total_, and LstmNonlinearityComponent::value_sum_.

1128  void LstmNonlinearityComponent::Add(BaseFloat alpha,
1129  const Component &other_in) {
1130  const LstmNonlinearityComponent *other =
1131  dynamic_cast<const LstmNonlinearityComponent*>(&other_in);
1132  KALDI_ASSERT(other != NULL);
1133  params_.AddMat(alpha, other->params_);
1134  value_sum_.AddMat(alpha, other->value_sum_);
1135  deriv_sum_.AddMat(alpha, other->deriv_sum_);
1136  self_repair_total_.AddVec(alpha, other->self_repair_total_);
1137  count_ += alpha * other->count_;
1138 }
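
The accumulation above can be sketched without any Kaldi types. The following is a minimal illustration, assuming a hypothetical `LstmStatsSketch` struct with `std::vector` storage in place of the real `CuMatrix` and `CuVector` members; only the "self += alpha * other" pattern is taken from the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the component's state; the real class stores
// CuMatrix / CuVector objects rather than std::vector.
struct LstmStatsSketch {
  std::vector<float> params;      // flattened 3 x cell_dim parameters
  std::vector<double> value_sum;  // flattened 5 x cell_dim activation sums
  double count = 0.0;             // number of frames accumulated
};

// self += alpha * other, applied to parameters and statistics alike.
inline void AddScaled(float alpha, const LstmStatsSketch &other,
                      LstmStatsSketch *self) {
  assert(self != nullptr);
  assert(self->params.size() == other.params.size());
  assert(self->value_sum.size() == other.value_sum.size());
  for (std::size_t i = 0; i < self->params.size(); ++i)
    self->params[i] += alpha * other.params[i];
  for (std::size_t i = 0; i < self->value_sum.size(); ++i)
    self->value_sum[i] += alpha * other.value_sum[i];
  self->count += alpha * other.count;
}
```

Scaling the count by the same alpha keeps averages such as value_sum / count meaningful after several components have been combined.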

◆ Backprop()

void Backprop ( const std::string &  debug_info,
const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
void *  memo,
Component *  to_update,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in]  debug_info  The component name, to be printed out in any warning messages.
[in]  indexes  A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in]  in_value  The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in]  out_value  The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in]  out_deriv  The derivative at the output of this component.
[in]  memo  This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out]  to_update  If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out]  in_deriv  The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 1180 of file nnet-combined-component.cc.

References CuVectorBase< Real >::AddColSumMat(), CuVectorBase< Real >::AddVec(), kaldi::cu::BackpropLstmNonlinearity(), LstmNonlinearityComponent::count_, LstmNonlinearityComponent::deriv_sum_, UpdatableComponent::is_gradient_, KALDI_ASSERT, kaldi::kUndefined, UpdatableComponent::learning_rate_, CuMatrixBase< Real >::NumRows(), NVTX_RANGE, LstmNonlinearityComponent::params_, OnlineNaturalGradient::PreconditionDirections(), LstmNonlinearityComponent::preconditioner_, LstmNonlinearityComponent::self_repair_total_, and LstmNonlinearityComponent::value_sum_.

1188  {
1189  NVTX_RANGE("LstmNonlinearityComponent::Backprop");
1190 
1191  if (to_update_in == NULL) {
1192  cu::BackpropLstmNonlinearity(in_value, params_, out_deriv,
1193  deriv_sum_, self_repair_config_,
1194  count_, in_deriv,
1195  (CuMatrixBase<BaseFloat>*) NULL,
1196  (CuMatrixBase<double>*) NULL,
1197  (CuMatrixBase<double>*) NULL,
1198  (CuMatrixBase<BaseFloat>*) NULL);
1199  } else {
1200  LstmNonlinearityComponent *to_update =
1201  dynamic_cast<LstmNonlinearityComponent*>(to_update_in);
1202  KALDI_ASSERT(to_update != NULL);
1203 
1204  int32 cell_dim = params_.NumCols();
1205  CuMatrix<BaseFloat> params_deriv(3, cell_dim, kUndefined);
1206  CuMatrix<BaseFloat> self_repair_total(5, cell_dim, kUndefined);
1207 
1208  cu::BackpropLstmNonlinearity(in_value, params_, out_deriv,
1209  deriv_sum_, self_repair_config_,
1210  count_, in_deriv, &params_deriv,
1211  &(to_update->value_sum_),
1212  &(to_update->deriv_sum_),
1213  &self_repair_total);
1214 
1215  CuVector<BaseFloat> self_repair_total_sum(5);
1216  self_repair_total_sum.AddColSumMat(1.0, self_repair_total, 0.0);
1217  to_update->self_repair_total_.AddVec(1.0, self_repair_total_sum);
1218  to_update->count_ += static_cast<double>(in_value.NumRows());
1219 
1220  BaseFloat scale = 1.0;
1221  if (!to_update->is_gradient_) {
1222  to_update->preconditioner_.PreconditionDirections(
1223  &params_deriv, &scale);
1224  }
1225  to_update->params_.AddMat(to_update->learning_rate_ * scale,
1226  params_deriv);
1227  }
1228 }
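
Two steps in the update branch above are easy to misread: AddColSumMat collapses the 5 x cell_dim self-repair matrix into one total per row, and the final AddMat is a plain scaled gradient step. The sketch below illustrates both with nested `std::vector`s; `MatrixSketch`, `SumEachRow` and `ApplyUpdate` are hypothetical names, and the natural-gradient preconditioning is deliberately reduced to a scalar `scale` argument.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix; a hypothetical stand-in for CuMatrix<BaseFloat>.
using MatrixSketch = std::vector<std::vector<float>>;

// Returns one total per row of `m`, which is what summing the columns of a
// 5 x cell_dim matrix into a 5-dimensional vector amounts to.
inline std::vector<float> SumEachRow(const MatrixSketch &m) {
  std::vector<float> sums(m.size(), 0.0f);
  for (std::size_t r = 0; r < m.size(); ++r)
    for (float v : m[r]) sums[r] += v;
  return sums;
}

// params += learning_rate * scale * deriv, element by element.
inline void ApplyUpdate(float learning_rate, float scale,
                        const MatrixSketch &deriv, MatrixSketch *params) {
  assert(params != nullptr && params->size() == deriv.size());
  for (std::size_t r = 0; r < deriv.size(); ++r) {
    assert((*params)[r].size() == deriv[r].size());
    for (std::size_t c = 0; c < deriv[r].size(); ++c)
      (*params)[r][c] += learning_rate * scale * deriv[r][c];
  }
}
```

In the real component the derivative matrix is first passed through the preconditioner (unless is_gradient_ is set), which may also change `scale`; the sketch leaves that part out.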

◆ ConsolidateMemory()

void ConsolidateMemory ( )
virtual

This virtual function relates to memory management, and avoiding fragmentation.

It is called only once per model, after we do the first minibatch of training. The default implementation does nothing, but child classes can override it to re-initialize quantities that may have been allocated during the forward pass (e.g. certain statistics; OnlineNaturalGradient objects).

We use our own CPU-based allocator (see cu-allocator.h), and because it cannot do paging (we are not in control of the GPU page table), fragmentation can be a problem. The allocator always tries to put things in 'low-address memory' (i.e. at smaller memory addresses) near the beginning of the block it allocated, to avoid fragmentation; but if permanent things (belonging to the model) are allocated in the forward pass, they can permanently stay in high memory. This function helps to prevent that by re-allocating those things into low-address memory. It is important that it is called after all the temporary buffers for the forward-backward have been freed, so that low-address memory is available.

Reimplemented from Component.

Definition at line 1331 of file nnet-combined-component.cc.

References LstmNonlinearityComponent::preconditioner_, and OnlineNaturalGradient::Swap().

1331  {
1332  OnlineNaturalGradient preconditioner_temp(preconditioner_);
1333  preconditioner_.Swap(&preconditioner_temp);
1334 }
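
The copy-then-swap idiom used here forces a fresh allocation for a long-lived object. The sketch below shows the same idiom on a `std::vector`; it is only an analogy, since the real code swaps OnlineNaturalGradient objects backed by Kaldi's own allocator, and `ReallocateByCopyAndSwap` is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Copies *v into a newly allocated buffer, then swaps the two, so that *v
// ends up owning the new buffer and the old one is released on return.
template <typename T>
void ReallocateByCopyAndSwap(std::vector<T> *v) {
  assert(v != nullptr);
  std::vector<T> fresh(*v);  // new allocation holding the same contents
  v->swap(fresh);            // *v now owns the new buffer
}                            // `fresh` (the old buffer) is freed here
```

Whether the new buffer actually lands at a lower address depends entirely on the allocator; the idiom only gives the allocator the opportunity to place it again.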

◆ Copy()

Component * Copy ( ) const
virtual

Copies component (deep copy).

Implements Component.

Definition at line 1101 of file nnet-combined-component.cc.

1101  {
1102  return new LstmNonlinearityComponent(*this);
1103 }

◆ DotProduct()

BaseFloat DotProduct ( const UpdatableComponent &  other) const
virtual

Computes dot-product between parameters of two instances of a Component.

Can be used for computing parameter-norm of an UpdatableComponent.

Implements UpdatableComponent.

Definition at line 1146 of file nnet-combined-component.cc.

References KALDI_ASSERT, kaldi::kTrans, LstmNonlinearityComponent::params_, and kaldi::TraceMatMat().

1146  BaseFloat LstmNonlinearityComponent::DotProduct(
1147  const UpdatableComponent &other_in) const {
1148  const LstmNonlinearityComponent *other =
1149  dynamic_cast<const LstmNonlinearityComponent*>(&other_in);
1150  KALDI_ASSERT(other != NULL);
1151  return TraceMatMat(params_, other->params_, kTrans);
1152 }
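
TraceMatMat(A, B, kTrans) evaluates trace(A Bᵀ), which equals the sum of element-wise products of the two matrices. A minimal sketch of that identity, assuming a hypothetical `Mat2D` alias over nested `std::vector`s rather than Kaldi's matrix classes:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix; a hypothetical stand-in for CuMatrix<BaseFloat>.
using Mat2D = std::vector<std::vector<float>>;

// Computes trace(A * B^T), i.e. the sum over all (r, c) of A[r][c] * B[r][c].
inline float ParamDotProduct(const Mat2D &a, const Mat2D &b) {
  assert(a.size() == b.size());
  float total = 0.0f;
  for (std::size_t r = 0; r < a.size(); ++r) {
    assert(a[r].size() == b[r].size());
    for (std::size_t c = 0; c < a[r].size(); ++c)
      total += a[r][c] * b[r][c];
  }
  return total;
}
```

Calling it with the same matrix twice gives the squared parameter norm, which is the use case mentioned in the description above.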

◆ FreezeNaturalGradient()

void FreezeNaturalGradient ( bool  freeze)
virtual

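Freezes or unfreezes the natural-gradient update by passing the flag to the preconditioner's Freeze() method.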

Reimplemented from UpdatableComponent.

Definition at line 1286 of file nnet-combined-component.cc.

References OnlineNaturalGradient::Freeze(), and LstmNonlinearityComponent::preconditioner_.

1286  {
1287  preconditioner_.Freeze(freeze);
1288 }

◆ Info()

std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from UpdatableComponent.

Definition at line 1061 of file nnet-combined-component.cc.

References rnnlm::i, UpdatableComponent::Info(), kaldi::nnet3::PrintParameterStats(), VectorBase< Real >::Scale(), and kaldi::nnet3::SummarizeVector().

1061  {
1062  std::ostringstream stream;
1063  int32 cell_dim = params_.NumCols();
1064  stream << UpdatableComponent::Info() << ", cell-dim=" << cell_dim
1065  << ", use-dropout=" << (use_dropout_ ? "true" : "false");
1066  PrintParameterStats(stream, "w_ic", params_.Row(0));
1067  PrintParameterStats(stream, "w_fc", params_.Row(1));
1068  PrintParameterStats(stream, "w_oc", params_.Row(2));
1069 
1070  // Note: some of the following code mirrors the code in
1071  // UpdatableComponent::Info(), in nnet-component-itf.cc.
1072  if (count_ > 0) {
1073  stream << ", count=" << std::setprecision(3) << count_
1074  << std::setprecision(6);
1075  }
1076  static const char *nonlin_names[] = { "i_t_sigmoid", "f_t_sigmoid", "c_t_tanh",
1077  "o_t_sigmoid", "m_t_tanh" };
1078  for (int32 i = 0; i < 5; i++) {
1079  stream << ", " << nonlin_names[i] << "={";
1080  stream << " self-repair-lower-threshold=" << self_repair_config_(i)
1081  << ", self-repair-scale=" << self_repair_config_(i + 5);
1082 
1083  if (count_ != 0) {
1084  BaseFloat self_repaired_proportion =
1085  self_repair_total_(i) / (count_ * cell_dim);
1086  stream << ", self-repaired-proportion=" << self_repaired_proportion;
1087  Vector<double> value_sum(value_sum_.Row(i)),
1088  deriv_sum(deriv_sum_.Row(i));
1089  Vector<BaseFloat> value_avg(value_sum), deriv_avg(deriv_sum);
1090  value_avg.Scale(1.0 / count_);
1091  deriv_avg.Scale(1.0 / count_);
1092  stream << ", value-avg=" << SummarizeVector(value_avg)
1093  << ", deriv-avg=" << SummarizeVector(deriv_avg);
1094  }
1095  stream << " }";
1096  }
1097  return stream.str();
1098 }

◆ Init()

void Init ( int32  cell_dim,
bool  use_dropout,
BaseFloat  param_stddev,
BaseFloat  tanh_self_repair_threshold,
BaseFloat  sigmoid_self_repair_threshold,
BaseFloat  self_repair_scale 
)

Definition at line 1242 of file nnet-combined-component.cc.

References LstmNonlinearityComponent::count_, LstmNonlinearityComponent::deriv_sum_, LstmNonlinearityComponent::InitNaturalGradient(), KALDI_ASSERT, LstmNonlinearityComponent::params_, CuVector< Real >::Resize(), CuMatrix< Real >::Resize(), LstmNonlinearityComponent::self_repair_config_, LstmNonlinearityComponent::self_repair_total_, LstmNonlinearityComponent::use_dropout_, and LstmNonlinearityComponent::value_sum_.

Referenced by LstmNonlinearityComponent::InitFromConfig().

1247  {
1248  KALDI_ASSERT(cell_dim > 0 && param_stddev >= 0.0 &&
1249  tanh_self_repair_threshold >= 0.0 &&
1250  tanh_self_repair_threshold <= 1.0 &&
1251  sigmoid_self_repair_threshold >= 0.0 &&
1252  sigmoid_self_repair_threshold <= 0.25 &&
1253  self_repair_scale >= 0.0 && self_repair_scale <= 0.1);
1254  use_dropout_ = use_dropout;
1255  params_.Resize(3, cell_dim);
1256  params_.SetRandn();
1257  params_.Scale(param_stddev);
1258  value_sum_.Resize(5, cell_dim);
1259  deriv_sum_.Resize(5, cell_dim);
1260  self_repair_config_.Resize(10);
1261  self_repair_config_.Range(0, 5).Set(sigmoid_self_repair_threshold);
1262  self_repair_config_(2) = tanh_self_repair_threshold;
1263  self_repair_config_(4) = tanh_self_repair_threshold;
1264  self_repair_config_.Range(5, 5).Set(self_repair_scale);
1265  self_repair_total_.Resize(5);
1266  count_ = 0.0;
1267  InitNaturalGradient();
1268 
1269 }
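
The layout of self_repair_config_ is implicit in the listing: entries 0 to 4 hold the lower thresholds for the five nonlinearities (sigmoid for indexes 0, 1 and 3, tanh for indexes 2 and 4), and entries 5 to 9 hold the self-repair scale. A minimal sketch of that layout, using a plain `std::vector<float>` and a hypothetical function name, with the same range checks as the assertion above:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds the 10-element self-repair configuration:
//   [0..4] lower thresholds for i_t, f_t, c_t(tanh), o_t, m_t(tanh)
//   [5..9] self-repair scale, identical for all five nonlinearities
inline std::vector<float> MakeSelfRepairConfig(float tanh_threshold,
                                               float sigmoid_threshold,
                                               float scale) {
  assert(tanh_threshold >= 0.0f && tanh_threshold <= 1.0f);
  assert(sigmoid_threshold >= 0.0f && sigmoid_threshold <= 0.25f);
  assert(scale >= 0.0f && scale <= 0.1f);
  std::vector<float> config(10, sigmoid_threshold);
  config[2] = tanh_threshold;  // c_t uses tanh
  config[4] = tanh_threshold;  // m_t uses tanh
  for (std::size_t i = 5; i < 10; ++i) config[i] = scale;
  return config;
}
```

This matches how Info() reads the vector back: index i for the threshold and index i + 5 for the scale.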

◆ InitFromConfig()

void InitFromConfig ( ConfigLine *  cfl)
virtual

Initialize, from a ConfigLine object.

Parameters
[in]  cfl  A ConfigLine containing any parameters that are needed for initialization. This component requires cell-dim; for example: "cell-dim=100 param-stddev=0.1"

Implements Component.

Definition at line 1290 of file nnet-combined-component.cc.

References ConfigLine::GetValue(), ConfigLine::HasUnusedValues(), LstmNonlinearityComponent::Init(), UpdatableComponent::InitLearningRatesFromConfig(), KALDI_ERR, LstmNonlinearityComponent::Type(), ConfigLine::UnusedValues(), and ConfigLine::WholeLine().

1290  {
1291  InitLearningRatesFromConfig(cfl);
1292  bool ok = true;
1293  bool use_dropout = false;
1294  int32 cell_dim;
1295  // these self-repair thresholds are the normal defaults for tanh and sigmoid
1296  // respectively. If, later on, we decide that we want to support different
1297  // self-repair config values for the individual sigmoid and tanh
1298  // nonlinearities, we can modify this code then.
1299  BaseFloat tanh_self_repair_threshold = 0.2,
1300  sigmoid_self_repair_threshold = 0.05,
1301  self_repair_scale = 1.0e-05;
1302  // param_stddev is the stddev of the parameters. it may be better to
1303  // use a smaller value but this was the default in the python scripts
1304  // for a while.
1305  BaseFloat param_stddev = 1.0;
1306  ok = ok && cfl->GetValue("cell-dim", &cell_dim);
1307  cfl->GetValue("param-stddev", &param_stddev);
1308  cfl->GetValue("tanh-self-repair-threshold",
1309  &tanh_self_repair_threshold);
1310  cfl->GetValue("sigmoid-self-repair-threshold",
1311  &sigmoid_self_repair_threshold);
1312  cfl->GetValue("self-repair-scale", &self_repair_scale);
1313  cfl->GetValue("use-dropout", &use_dropout);
1314 
1315  // We may later on want to make it possible to initialize the different
1316  // parameters w_ic, w_fc and w_oc with different biases. We'll implement
1317  // that when and if it's needed.
1318 
1319  if (cfl->HasUnusedValues())
1320  KALDI_ERR << "Could not process these elements in initializer: "
1321  << cfl->UnusedValues();
1322  if (ok) {
1323  Init(cell_dim, use_dropout, param_stddev, tanh_self_repair_threshold,
1324  sigmoid_self_repair_threshold, self_repair_scale);
1325  } else {
1326  KALDI_ERR << "Invalid initializer for layer of type "
1327  << Type() << ": \"" << cfl->WholeLine() << "\"";
1328  }
1329 }
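
The control flow above (one required key, several optional keys with defaults, and an error on anything left over) can be sketched with the standard library alone. The parser below is a hypothetical stand-in for ConfigLine, and `LstmConfigSketch` and `ParseLstmConfig` are names invented for this illustration; only the key names and default values are taken from the listing. It does not handle the learning-rate options consumed by InitLearningRatesFromConfig().

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Defaults copied from the listing above; cell_dim has no default.
struct LstmConfigSketch {
  int cell_dim = -1;
  bool use_dropout = false;
  float param_stddev = 1.0f;
  float tanh_self_repair_threshold = 0.2f;
  float sigmoid_self_repair_threshold = 0.05f;
  float self_repair_scale = 1.0e-05f;
};

// Parses "key=value key=value ...". Returns false if cell-dim is missing,
// a token has no '=', or an unknown key appears (the real code raises
// KALDI_ERR in the last case).
inline bool ParseLstmConfig(const std::string &line, LstmConfigSketch *cfg) {
  assert(cfg != nullptr);
  std::map<std::string, std::string> kv;
  std::istringstream tokens(line);
  std::string token;
  while (tokens >> token) {
    const std::string::size_type eq = token.find('=');
    if (eq == std::string::npos) return false;
    kv[token.substr(0, eq)] = token.substr(eq + 1);
  }
  bool have_cell_dim = false;
  for (const auto &entry : kv) {
    const std::string &key = entry.first;
    const std::string &value = entry.second;
    if (key == "cell-dim") {
      cfg->cell_dim = std::stoi(value);
      have_cell_dim = true;
    } else if (key == "use-dropout") {
      cfg->use_dropout = (value == "true");
    } else if (key == "param-stddev") {
      cfg->param_stddev = std::stof(value);
    } else if (key == "tanh-self-repair-threshold") {
      cfg->tanh_self_repair_threshold = std::stof(value);
    } else if (key == "sigmoid-self-repair-threshold") {
      cfg->sigmoid_self_repair_threshold = std::stof(value);
    } else if (key == "self-repair-scale") {
      cfg->self_repair_scale = std::stof(value);
    } else {
      return false;  // unused value
    }
  }
  return have_cell_dim;
}
```

For instance, "cell-dim=512 use-dropout=true" is accepted and leaves param_stddev at 1.0, while a line containing only "param-stddev=0.5" is rejected because cell-dim is absent.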

◆ InitNaturalGradient()

void InitNaturalGradient ( )
private

Definition at line 1271 of file nnet-combined-component.cc.

References LstmNonlinearityComponent::preconditioner_, OnlineNaturalGradient::SetNumSamplesHistory(), OnlineNaturalGradient::SetRank(), and OnlineNaturalGradient::SetUpdatePeriod().

Referenced by LstmNonlinearityComponent::Init().

1271  {
1272  // As regards the configuration for the natural-gradient preconditioner, we
1273  // don't make it configurable from the command line-- it's unlikely that any
1274  // differences from changing this would be substantial enough to effectively
1275  // tune the configuration. Because the preconditioning code doesn't 'see' the
1276  // derivatives from individual frames, but only averages over the minibatch,
1277  // there is a fairly small amount of data available to estimate the Fisher
1278  // information matrix, so we set the rank, update period and
1279  // num-samples-history to smaller values than normal.
1283 }

◆ InputDim()

int32 InputDim ( ) const
virtual

Returns input-dimension of this component.

Implements Component.

Definition at line 969 of file nnet-combined-component.cc.

Referenced by LstmNonlinearityComponent::ConsolidateMemory().

969  {
970  int32 cell_dim = value_sum_.NumCols();
971  return cell_dim * 5 + (use_dropout_ ? 3 : 0);
972 }

◆ NumParameters()

int32 NumParameters ( ) const
virtual

Returns the total dimension of the parameters in this class.

Reimplemented from UpdatableComponent.

Definition at line 1154 of file nnet-combined-component.cc.

Referenced by LstmNonlinearityComponent::ConsolidateMemory().

1154  {
1155  return params_.NumRows() * params_.NumCols();
1156 }

◆ operator=()

const LstmNonlinearityComponent& operator= ( const LstmNonlinearityComponent &  other)
private

◆ OutputDim()

int32 OutputDim ( ) const
virtual

Returns output-dimension of this component.

Implements Component.

Definition at line 974 of file nnet-combined-component.cc.

Referenced by LstmNonlinearityComponent::ConsolidateMemory().

974  {
975  int32 cell_dim = value_sum_.NumCols();
976  return cell_dim * 2;
977 }
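
Taken together, InputDim() and OutputDim() fix the row width on each side of the component as a function of the cell dimension. A small sketch of the arithmetic, with hypothetical free-function names in place of the member functions:

```cpp
#include <cassert>

// Input width: five blocks of cell_dim columns, plus three extra columns
// when dropout is enabled (as in InputDim()).
inline int LstmInputDim(int cell_dim, bool use_dropout) {
  assert(cell_dim > 0);
  return cell_dim * 5 + (use_dropout ? 3 : 0);
}

// Output width: two blocks of cell_dim columns (as in OutputDim()).
inline int LstmOutputDim(int cell_dim) {
  assert(cell_dim > 0);
  return cell_dim * 2;
}
```

With cell_dim = 1024 this gives an input width of 5120 (5123 with dropout) and an output width of 2048.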

◆ PerturbParams()

void PerturbParams ( BaseFloat  stddev)
virtual

This function is to be used in testing.

It adds unit noise times "stddev" to the parameters of the component.

Implements UpdatableComponent.

Definition at line 1140 of file nnet-combined-component.cc.

References CuMatrixBase< Real >::SetRandn().

1140  {
1141  CuMatrix<BaseFloat> temp_params(params_.NumRows(), params_.NumCols());
1142  temp_params.SetRandn();
1143  params_.AddMat(stddev, temp_params);
1144 }
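
The perturbation is "parameters plus stddev times unit Gaussian noise". A minimal sketch using the standard `<random>` facilities instead of Kaldi's SetRandn(); the function name and the explicit generator argument are assumptions made so the example is reproducible.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Adds stddev * N(0, 1) noise to every parameter, drawing from `rng`.
inline void PerturbSketch(float stddev, std::mt19937 *rng,
                          std::vector<float> *params) {
  assert(rng != nullptr && params != nullptr);
  std::normal_distribution<float> unit_noise(0.0f, 1.0f);
  for (float &p : *params) p += stddev * unit_noise(*rng);
}
```

Passing stddev = 0 leaves the parameters unchanged, which is a convenient sanity check in tests.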

◆ Propagate()

void * Propagate ( const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in]  indexes  A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in]  in  The input to this component. Num-columns == InputDim().
[out]  out  The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 1171 of file nnet-combined-component.cc.

References kaldi::cu::ComputeLstmNonlinearity().

1174  {
1175  cu::ComputeLstmNonlinearity(in, params_, out);
1176  return NULL;
1177 }
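
This page does not spell out what ComputeLstmNonlinearity computes. The sketch below follows the formulation commonly given for this Kaldi function, and should be read as an assumption rather than a transcription: the input row is taken to be five blocks (i_part, f_part, c_part, o_part, c_prev), the output row two blocks (c_t, m_t), and the three parameter rows the diagonal peephole weights w_ic, w_fc and w_oc named in Info(). Dropout is not modelled, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

inline float SigmoidSketch(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// One row of the forward pass, without dropout.
//   in:  5 * C values laid out as [i_part | f_part | c_part | o_part | c_prev]
//   out: 2 * C values laid out as [c_t | m_t]
inline std::vector<float> LstmNonlinearityRow(const std::vector<float> &in,
                                              const std::vector<float> &w_ic,
                                              const std::vector<float> &w_fc,
                                              const std::vector<float> &w_oc) {
  const std::size_t C = w_ic.size();
  assert(C > 0 && in.size() == 5 * C);
  assert(w_fc.size() == C && w_oc.size() == C);
  std::vector<float> out(2 * C);
  for (std::size_t k = 0; k < C; ++k) {
    const float c_prev = in[4 * C + k];
    const float i_t = SigmoidSketch(in[k] + w_ic[k] * c_prev);      // input gate
    const float f_t = SigmoidSketch(in[C + k] + w_fc[k] * c_prev);  // forget gate
    const float c_t = f_t * c_prev + i_t * std::tanh(in[2 * C + k]);
    const float o_t = SigmoidSketch(in[3 * C + k] + w_oc[k] * c_t);  // output gate
    out[k] = c_t;
    out[C + k] = o_t * std::tanh(c_t);  // m_t
  }
  return out;
}
```

The five intermediate quantities (i_t, f_t, tanh of c_part, o_t, tanh of c_t) line up with the five nonlinearity names reported by Info(), which is why value_sum_ and deriv_sum_ have five rows.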

◆ Properties()

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Implements Component.

Definition at line 980 of file nnet-combined-component.cc.

References kaldi::nnet3::ExpectToken(), KALDI_ASSERT, kaldi::ReadBasicType(), and kaldi::ReadToken().

void LstmNonlinearityComponent::Read(std::istream &is, bool binary) {
  ReadUpdatableCommon(is, binary);  // Read opening tag and learning rate.
  ExpectToken(is, binary, "<Params>");
  params_.Read(is, binary);
  ExpectToken(is, binary, "<ValueAvg>");
  value_sum_.Read(is, binary);
  ExpectToken(is, binary, "<DerivAvg>");
  deriv_sum_.Read(is, binary);
  ExpectToken(is, binary, "<SelfRepairConfig>");
  self_repair_config_.Read(is, binary);
  ExpectToken(is, binary, "<SelfRepairProb>");
  self_repair_total_.Read(is, binary);

  // <UseDropout> is optional, for compatibility with older models.
  std::string tok;
  ReadToken(is, binary, &tok);
  if (tok == "<UseDropout>") {
    ReadBasicType(is, binary, &use_dropout_);
    ReadToken(is, binary, &tok);
  } else {
    use_dropout_ = false;
  }
  KALDI_ASSERT(tok == "<Count>");
  ReadBasicType(is, binary, &count_);

  // For the on-disk format, we normalize value_sum_, deriv_sum_ and
  // self_repair_total_ by dividing by the count, but in memory they are scaled
  // by the count. [For self_repair_total_, the scaling factor is count_ *
  // cell_dim.]
  value_sum_.Scale(count_);
  deriv_sum_.Scale(count_);
  int32 cell_dim = params_.NumCols();
  self_repair_total_.Scale(count_ * cell_dim);

  InitNaturalGradient();

  ExpectToken(is, binary, "</LstmNonlinearityComponent>");
}
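
The optional-token pattern used for <UseDropout> can be sketched on a plain text stream. This is a simplified illustration, not Kaldi's I/O layer: TailSketch and ReadTailSketch are hypothetical names, only whitespace-separated text is handled, and the "T"/"F" boolean encoding is an assumption of the sketch.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

struct TailSketch {
  bool use_dropout;
  double count;
};

// Parses "[<UseDropout> T|F] <Count> value" from a text stream.
// Older files simply omit the <UseDropout> field, so it defaults to false.
TailSketch ReadTailSketch(std::istream &is) {
  TailSketch tail{false, 0.0};
  std::string tok;
  if (!(is >> tok)) throw std::runtime_error("unexpected end of input");
  if (tok == "<UseDropout>") {
    std::string flag;
    // Read the flag, then fetch the next token so both paths end up at <Count>.
    if (!(is >> flag >> tok))
      throw std::runtime_error("truncated <UseDropout> field");
    if (flag != "T" && flag != "F")
      throw std::runtime_error("bad boolean: " + flag);
    tail.use_dropout = (flag == "T");
  }
  assert(tok == "<Count>");  // plays the role of KALDI_ASSERT in Read()
  if (!(is >> tail.count)) throw std::runtime_error("missing count value");
  return tail;
}
```

Reading one token ahead and then branching lets a single code path accept both the older and the newer layout.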

◆ Scale()

void Scale ( BaseFloat  scale)
virtual

When called on an UpdatableComponent, this virtual function scales the parameters by "scale".

When called on a nonlinear component (or another component that stores stats, like BatchNormComponent), it scales the activation stats, not the parameters. Otherwise it will normally do nothing.

Reimplemented from Component.

Definition at line 1112 of file nnet-combined-component.cc.

void LstmNonlinearityComponent::Scale(BaseFloat scale) {
  if (scale == 0.0) {
    params_.SetZero();
    value_sum_.SetZero();
    deriv_sum_.SetZero();
    self_repair_total_.SetZero();
    count_ = 0.0;
  } else {
    params_.Scale(scale);
    value_sum_.Scale(scale);
    deriv_sum_.Scale(scale);
    self_repair_total_.Scale(scale);
    count_ *= scale;
  }
}
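
One plausible reason for treating scale == 0.0 separately (the source does not state the motivation) is that, in IEEE floating point, multiplying an infinite or NaN value by zero gives NaN, whereas explicit zeroing always gives zero. The sketch below contrasts the two on a plain vector; ScaleSketch and NaiveScale are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Mirrors the two branches of Scale() on a plain vector.
void ScaleSketch(float scale, std::vector<float> *v) {
  assert(v != nullptr);
  if (scale == 0.0f) {
    v->assign(v->size(), 0.0f);  // explicit zeroing, whatever the old values were
  } else {
    for (float &x : *v) x *= scale;
  }
}

// Version without the special case, for comparison only.
void NaiveScale(float scale, std::vector<float> *v) {
  assert(v != nullptr);
  for (float &x : *v) x *= scale;
}
```

If a vector contains an infinite entry, ScaleSketch(0.0f, ...) leaves it all zeros, while NaiveScale(0.0f, ...) turns that entry into NaN.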

◆ Type()

virtual std::string Type ( ) const
inlinevirtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 343 of file nnet-combined-component.h.

Referenced by LstmNonlinearityComponent::InitFromConfig().

virtual std::string Type() const { return "LstmNonlinearityComponent"; }

◆ UnVectorize()

void UnVectorize ( const VectorBase< BaseFloat > &  params)
virtual

Converts the parameters from vector form.

Reimplemented from UpdatableComponent.

Definition at line 1164 of file nnet-combined-component.cc.

References VectorBase< Real >::Dim(), KALDI_ASSERT, and kaldi::nnet3::NumParameters().

void LstmNonlinearityComponent::UnVectorize(
    const VectorBase<BaseFloat> &params) {
  KALDI_ASSERT(params.Dim() == NumParameters());
  params_.CopyRowsFromVec(params);
}

◆ Vectorize()

void Vectorize ( VectorBase< BaseFloat > *  params) const
virtual

Turns the parameters into vector form.

We put the vector form on the CPU, because in the kinds of situations where we do this, we'll tend to use too much memory for the GPU.

Reimplemented from UpdatableComponent.

Definition at line 1158 of file nnet-combined-component.cc.

References VectorBase< Real >::CopyRowsFromMat(), VectorBase< Real >::Dim(), KALDI_ASSERT, and kaldi::nnet3::NumParameters().

void LstmNonlinearityComponent::Vectorize(
    VectorBase<BaseFloat> *params) const {
  KALDI_ASSERT(params->Dim() == NumParameters());
  params->CopyRowsFromMat(params_);
}
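
Vectorize() and UnVectorize() are inverses: one concatenates the rows of the parameter matrix into a single vector and the other copies them back. A minimal sketch of that round trip follows, assuming nested std::vector in place of Kaldi's matrix and vector classes; MatrixSketch, VectorizeSketch and UnVectorizeSketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using MatrixSketch = std::vector<std::vector<float>>;  // rows of equal length

// Concatenates the rows of `m` into one flat vector (row-major order).
std::vector<float> VectorizeSketch(const MatrixSketch &m) {
  std::vector<float> flat;
  for (const std::vector<float> &row : m)
    flat.insert(flat.end(), row.begin(), row.end());
  return flat;
}

// Inverse operation: fills the rows of `m` from `flat`.
void UnVectorizeSketch(const std::vector<float> &flat, MatrixSketch *m) {
  std::size_t total = 0;
  for (const std::vector<float> &row : *m) total += row.size();
  assert(flat.size() == total);  // same role as the KALDI_ASSERT dimension check
  std::size_t k = 0;
  for (std::vector<float> &row : *m)
    for (float &x : row) x = flat[k++];
}
```

Because both directions use the same row-major order, vectorizing and then un-vectorizing reproduces the original matrix.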

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Implements Component.

Definition at line 1019 of file nnet-combined-component.cc.

References MatrixBase< Real >::Scale(), VectorBase< Real >::Scale(), VectorBase< Real >::Write(), MatrixBase< Real >::Write(), kaldi::WriteBasicType(), and kaldi::WriteToken().

void LstmNonlinearityComponent::Write(std::ostream &os, bool binary) const {
  WriteUpdatableCommon(os, binary);  // Write opening tag and learning rate.

  WriteToken(os, binary, "<Params>");
  params_.Write(os, binary);
  WriteToken(os, binary, "<ValueAvg>");
  {
    // Stored on disk as an average: the in-memory sum divided by the count.
    Matrix<BaseFloat> value_avg(value_sum_);
    if (count_ != 0.0)
      value_avg.Scale(1.0 / count_);
    value_avg.Write(os, binary);
  }
  WriteToken(os, binary, "<DerivAvg>");
  {
    Matrix<BaseFloat> deriv_avg(deriv_sum_);
    if (count_ != 0.0)
      deriv_avg.Scale(1.0 / count_);
    deriv_avg.Write(os, binary);
  }
  WriteToken(os, binary, "<SelfRepairConfig>");
  self_repair_config_.Write(os, binary);
  WriteToken(os, binary, "<SelfRepairProb>");
  {
    // The self-repair totals are normalized by count_ * cell_dim.
    int32 cell_dim = params_.NumCols();
    Vector<BaseFloat> self_repair_prob(self_repair_total_);
    if (count_ != 0.0)
      self_repair_prob.Scale(1.0 / (count_ * cell_dim));
    self_repair_prob.Write(os, binary);
  }
  if (use_dropout_) {
    // Only write this if true; we have back-compat code in reading anyway.
    // This makes the models without dropout easier to read with older code.
    WriteToken(os, binary, "<UseDropout>");
    WriteBasicType(os, binary, use_dropout_);
  }
  WriteToken(os, binary, "<Count>");
  WriteBasicType(os, binary, count_);
  WriteToken(os, binary, "</LstmNonlinearityComponent>");
}
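
The relationship between the on-disk averages and the in-memory sums can be checked with a small round trip. This is a sketch, assuming plain std::vector<double> in place of Kaldi's matrix and vector classes; ToDiskAverage and ToMemorySum are hypothetical helper names, and `scale` stands for count_ (or count_ * cell_dim for the self-repair totals).

```cpp
#include <cassert>
#include <vector>

// On-disk form: multiply the accumulated sums by 1 / scale.
// A zero scale leaves the values untouched, like the `count_ != 0.0` guards.
std::vector<double> ToDiskAverage(std::vector<double> sums, double scale) {
  if (scale != 0.0)
    for (double &v : sums) v *= 1.0 / scale;
  return sums;
}

// In-memory form: multiply the stored averages back by the same factor,
// as Read() does after it has read the count.
std::vector<double> ToMemorySum(std::vector<double> avgs, double scale) {
  for (double &v : avgs) v *= scale;
  return avgs;
}
```

For example, sums of {10, 20} with a count of 4 are written as {2.5, 5} and restored to {10, 20} on reading.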

◆ ZeroStats()

void ZeroStats ( )
virtual

Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero.

Other components that store other types of statistics (e.g. regarding gradient clipping) should implement ZeroStats() also.

Reimplemented from Component.

Definition at line 1105 of file nnet-combined-component.cc.

void LstmNonlinearityComponent::ZeroStats() {
  value_sum_.SetZero();
  deriv_sum_.SetZero();
  self_repair_total_.SetZero();
  count_ = 0.0;
}

Member Data Documentation

◆ count_

◆ deriv_sum_

◆ params_

◆ preconditioner_

◆ self_repair_config_

CuVector<BaseFloat> self_repair_config_
private

Definition at line 422 of file nnet-combined-component.h.

Referenced by LstmNonlinearityComponent::Init().

◆ self_repair_total_

◆ use_dropout_

bool use_dropout_
private

Definition at line 402 of file nnet-combined-component.h.

Referenced by LstmNonlinearityComponent::Init().

◆ value_sum_


The documentation for this class was generated from the following files: