RectifiedLinearComponent Class Reference

#include <nnet-simple-component.h>


Public Member Functions

 RectifiedLinearComponent (const RectifiedLinearComponent &other)
 
 RectifiedLinearComponent ()
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object.
 
virtual Component * Copy () const
 Copies component (deep copy).
 
virtual int32 Properties () const
 Return bitmask of the component's properties.
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function.
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.
 
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity.
 
- Public Member Functions inherited from NonlinearComponent
 NonlinearComponent ()
 
 NonlinearComponent (const NonlinearComponent &other)
 
virtual int32 InputDim () const
 Returns input-dimension of this component.
 
virtual int32 OutputDim () const
 Returns output-dimension of this component.
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object.
 
virtual void Read (std::istream &is, bool binary)
 We implement Read at this level as it just needs the Type().
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero.
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics.
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream.
 
virtual void Scale (BaseFloat scale)
 When called on an UpdatableComponent, this virtual function scales the parameters by "scale".
 
virtual void Add (BaseFloat alpha, const Component &other)
 When called on an UpdatableComponent, this virtual function adds the parameters of another updatable component, times some constant, to the current parameters.
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation.
 
const CuVector< double > & ValueSum () const
 
const CuVector< double > & DerivSum () const
 
double Count () const
 
- Public Member Functions inherited from Component
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components.
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs.
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components.
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components.
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function.
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

void RepairGradients (CuMatrixBase< BaseFloat > *in_deriv, RectifiedLinearComponent *to_update) const
 
RectifiedLinearComponent & operator= (const RectifiedLinearComponent &other)
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error.
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type.
 
- Protected Types inherited from NonlinearComponent
enum  { kUnsetThreshold = -1000 }
 
- Protected Member Functions inherited from NonlinearComponent
void StoreStatsInternal (const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > *deriv=NULL)
 
void StoreBackpropStats (const CuMatrixBase< BaseFloat > &out_deriv)
 
const NonlinearComponent & operator= (const NonlinearComponent &other)
 
- Protected Attributes inherited from NonlinearComponent
int32 dim_
 
int32 block_dim_
 
CuVector< double > value_sum_
 
CuVector< double > deriv_sum_
 
double count_
 
CuVector< double > oderiv_sumsq_
 
double oderiv_count_
 
double num_dims_self_repaired_
 
double num_dims_processed_
 
BaseFloat self_repair_lower_threshold_
 
BaseFloat self_repair_upper_threshold_
 
BaseFloat self_repair_scale_
 

Detailed Description

Definition at line 344 of file nnet-simple-component.h.

Constructor & Destructor Documentation

◆ RectifiedLinearComponent() [1/2]

RectifiedLinearComponent ( const RectifiedLinearComponent &  other)
inline explicit

Definition at line 346 of file nnet-simple-component.h.

◆ RectifiedLinearComponent() [2/2]

RectifiedLinearComponent ( )
inline

Definition at line 348 of file nnet-simple-component.h.

RectifiedLinearComponent() { }

Member Function Documentation

◆ Backprop()

void Backprop ( const std::string &  debug_info,
const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
void *  memo,
Component *  to_update,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in] debug_info: The component name, to be printed out in any warning messages.
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in_value: The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in] out_value: The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in] out_deriv: The derivative at the output of this component.
[in] memo: This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out] to_update: If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out] in_deriv: The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.
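
The property tests in the parameter descriptions are plain bitmask checks. The following is a minimal sketch of how calling code might interpret them. The flag values, the `ExampleProperties` enum, and the `BackpropPlan` and `PlanBackprop` names are hypothetical and chosen only for illustration; the library's real constants may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values for illustration only; the real constants are
// defined by the library and may have different values.
enum ExampleProperties : std::int32_t {
  kExBackpropNeedsInput  = 0x1,
  kExBackpropNeedsOutput = 0x2,
  kExBackpropInPlace     = 0x4,
  kExBackpropAdds        = 0x8
};

// What a caller can conclude from the bitmask before calling Backprop().
struct BackpropPlan {
  bool pass_in_value;            // in_value must be a real matrix
  bool pass_out_value;           // out_value must be a real matrix
  bool may_alias_out_deriv;      // in_deriv may be the same matrix as out_deriv
  bool in_deriv_is_added_to;     // Backprop adds to in_deriv instead of setting it
};

BackpropPlan PlanBackprop(std::int32_t properties) {
  BackpropPlan plan;
  plan.pass_in_value = (properties & kExBackpropNeedsInput) != 0;
  plan.pass_out_value = (properties & kExBackpropNeedsOutput) != 0;
  plan.may_alias_out_deriv = (properties & kExBackpropInPlace) != 0;
  plan.in_deriv_is_added_to = (properties & kExBackpropAdds) != 0;
  return plan;
}
```

With a mask that sets only the "needs output" and "in place" bits, the plan says that `in_value` may be empty while `out_value` must be supplied, which is the combination a nonlinearity computed from its own output would want.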

Implements Component.

Definition at line 974 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::Heaviside(), CuMatrixBase< Real >::MulElements(), NVTX_RANGE, and NonlinearComponent::StoreBackpropStats().

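void RectifiedLinearComponent::Backprop(
    const std::string &debug_info,
    const ComponentPrecomputedIndexes *indexes,
    const CuMatrixBase<BaseFloat> &,  // in_value (unused)
    const CuMatrixBase<BaseFloat> &out_value,
    const CuMatrixBase<BaseFloat> &out_deriv,
    void *memo,
    Component *to_update_in,  // documented above as 'to_update'
    CuMatrixBase<BaseFloat> *in_deriv) const {
  NVTX_RANGE("RectifiedLinearComponent::Backprop");
  if (in_deriv != NULL) {
    // in_deriv = Heaviside(out_value) .* out_deriv
    in_deriv->Heaviside(out_value);
    in_deriv->MulElements(out_deriv);
    RectifiedLinearComponent *to_update =
        dynamic_cast<RectifiedLinearComponent*>(to_update_in);
    if (to_update != NULL) {
      RepairGradients(in_deriv, to_update);
      to_update->StoreBackpropStats(out_deriv);
    }
  }
}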
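
As a rough illustration of the two matrix calls in Backprop(), the sketch below applies the same arithmetic to flat vectors. `ReluBackward` and the use of `std::vector<float>` in place of CuMatrixBase are assumptions made for this example; the self-repair and statistics steps are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the matrix version: computes
// in_deriv = Heaviside(out_value) .* out_deriv element by element.
std::vector<float> ReluBackward(const std::vector<float> &out_value,
                                const std::vector<float> &out_deriv) {
  assert(out_value.size() == out_deriv.size());
  std::vector<float> in_deriv(out_value.size());
  for (std::size_t i = 0; i < out_value.size(); ++i) {
    // Heaviside step: 1 where the unit was active, 0 otherwise.
    float gate = out_value[i] > 0.0f ? 1.0f : 0.0f;
    // MulElements: pass the output derivative through active units only.
    in_deriv[i] = gate * out_deriv[i];
  }
  return in_deriv;
}
```

Because the gate is computed from `out_value` rather than from the input, the original input matrix is not needed, which is consistent with the listing ignoring its in_value argument.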

◆ Copy()

virtual Component* Copy ( ) const
inline virtual

Copies component (deep copy).

Implements Component.

Definition at line 350 of file nnet-simple-component.h.

◆ operator=()

RectifiedLinearComponent& operator= ( const RectifiedLinearComponent &  other)
private

◆ Propagate()

void * Propagate ( const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in: The input to this component. Num-columns == InputDim().
[out] out: The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.
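
The add-versus-set behaviour described for "out" can be pictured with a small sketch. `WriteOutput` and its `adds` argument are hypothetical names; `adds` stands for the result of testing `Properties() & kPropagateAdds` and is not part of the documented interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: writes a computed result into 'out' either by adding
// to its initial value (adds == true) or by overwriting it (adds == false).
void WriteOutput(const std::vector<float> &result, bool adds,
                 std::vector<float> *out) {
  assert(out != nullptr && out->size() == result.size());
  for (std::size_t i = 0; i < result.size(); ++i) {
    if (adds)
      (*out)[i] += result[i];  // initial value of 'out' matters
    else
      (*out)[i] = result[i];   // initial value of 'out' is ignored
  }
}
```

In the adding mode the caller is responsible for what "out" holds beforehand, for example zeroing it when no accumulation is intended.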

Implements Component.

Definition at line 964 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::ApplyFloor(), and CuMatrixBase< Real >::CopyFromMat().

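void* RectifiedLinearComponent::Propagate(
    const ComponentPrecomputedIndexes *indexes,
    const CuMatrixBase<BaseFloat> &in,
    CuMatrixBase<BaseFloat> *out) const {
  // Apply rectified linear function (x >= 0 ? x : 0.0)
  out->CopyFromMat(in);
  out->ApplyFloor(0.0);
  return NULL;
}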
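
The forward pass amounts to a copy followed by a floor at zero. A minimal sketch on a flat buffer, where `ReluForward` and `std::vector<float>` are illustrative stand-ins for the matrix types used by the class:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the matrix version of the forward pass.
std::vector<float> ReluForward(const std::vector<float> &in) {
  std::vector<float> out(in);                  // corresponds to CopyFromMat(in)
  for (float &x : out) x = std::max(x, 0.0f);  // corresponds to ApplyFloor(0.0)
  return out;
}
```

Negative inputs map to zero and non-negative inputs pass through unchanged, so the output alone is enough to tell which units were active.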

◆ Properties()

◆ RepairGradients()

void RepairGradients ( CuMatrixBase< BaseFloat > *  in_deriv,
RectifiedLinearComponent *  to_update 
) const
private

Definition at line 997 of file nnet-simple-component.cc.

References CuVectorBase< Real >::Add(), CuVectorBase< Real >::AddVec(), CuMatrixBase< Real >::AddVecToRows(), CuVectorBase< Real >::ApplyPow(), CuVectorBase< Real >::CopyFromVec(), count, CuMatrixBase< Real >::Data(), NonlinearComponent::dim_, KALDI_ASSERT, kaldi::kUndefined, NonlinearComponent::num_dims_processed_, NonlinearComponent::num_dims_self_repaired_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), kaldi::RandUniform(), CuMatrixBase< Real >::RowData(), CuVectorBase< Real >::Scale(), CuMatrixBase< Real >::Stride(), and CuVectorBase< Real >::Sum().

void RectifiedLinearComponent::RepairGradients(
    CuMatrixBase<BaseFloat> *in_deriv,
    RectifiedLinearComponent *to_update) const {
  KALDI_ASSERT(to_update != NULL);
  int32 dim = dim_, block_dim = block_dim_;
  BaseFloat default_lower_threshold = 0.05,
      default_upper_threshold = 0.95;
  // we use this 'repair_probability' (hardcoded for now) to limit
  // this code to running on about half of the minibatches.
  BaseFloat repair_probability = 0.5;
  KALDI_ASSERT(in_deriv->NumCols() == dim || in_deriv->NumCols() == block_dim);
  if (self_repair_scale_ == 0.0 || count_ == 0.0 ||
      deriv_sum_.Dim() != dim)
    return;

  if (in_deriv->NumCols() != block_dim) {
    KALDI_ASSERT(in_deriv->NumCols() == in_deriv->Stride());
    int32 dim_multiple = dim / block_dim;
    CuSubMatrix<BaseFloat> in_deriv_reshaped(in_deriv->Data(),
                                             in_deriv->NumRows() * dim_multiple,
                                             block_dim, block_dim);
    RepairGradients(&in_deriv_reshaped, to_update);
    return;
  }

  // By now we know that in_deriv->NumCols() == block_dim.

  if (RandUniform() > repair_probability)
    return;

  to_update->num_dims_processed_ += block_dim;

  // check that the self-repair scale is in a reasonable range.
  KALDI_ASSERT(self_repair_scale_ > 0.0 && self_repair_scale_ < 0.1);
  BaseFloat unset = kUnsetThreshold;  // -1000.0
  BaseFloat count = count_,
      lower_threshold = (self_repair_lower_threshold_ == unset ?
                         default_lower_threshold :
                         self_repair_lower_threshold_) * count,
      upper_threshold = (self_repair_upper_threshold_ == unset ?
                         default_upper_threshold :
                         self_repair_upper_threshold_) * count;

  CuMatrix<BaseFloat> storage(2, block_dim + 2, kUndefined);
  CuSubVector<BaseFloat> thresholds_vec(storage.RowData(0) + block_dim, 2);
  CuSubMatrix<BaseFloat> stats_mat(storage, 0, 2, 0, block_dim);
  thresholds_vec(0) = -lower_threshold;
  thresholds_vec(1) = -upper_threshold;
  CuSubVector<BaseFloat> row0(stats_mat, 0);
  CuSubVector<BaseFloat> row1(stats_mat, 1);

  if (block_dim == dim) {
    row0.CopyFromVec(deriv_sum_);
  } else {
    CuSubMatrix<double> deriv_sum_mat(deriv_sum_.Data(),
                                      dim / block_dim,
                                      block_dim, block_dim);
    CuVector<double> deriv_sum_dbl(block_dim);
    // get the average of the deriv-sums over the blocks.
    deriv_sum_dbl.AddRowSumMat(block_dim * 1.0 / dim, deriv_sum_mat);
    row0.CopyFromVec(deriv_sum_dbl);
  }
  row1.CopyFromVec(row0);
  stats_mat.AddVecToCols(1.0, thresholds_vec, 1.0);
  // now row0 equals stats - lower_threshold, and
  // row1 equals stats - upper_threshold.
  stats_mat.ApplyHeaviside();
  // now row0 equals (stats > lower_threshold ? 1 : 0), and
  // row1 equals (stats > upper_threshold ? 1 : 0).
  // what we want is:
  // self_repair_scale * ((stats <= lower_threshold ? 1 : 0) +
  //                      (stats > upper_threshold ? -1 : 0)).
  //
  // we can get these in stats_mat.Row(0) by computing:
  // -self_repair_scale * (stats_mat.Row(1) + stats_mat.Row(0) - 1).
  row0.AddVec(1.0, row1, 1.0);
  row0.Add(-1.0);
  CuVector<BaseFloat> temp(row0);
  temp.ApplyPow(2.0);
  to_update->num_dims_self_repaired_ += temp.Sum();
  // [actually we need to divide by repair_probability also, to
  // correct for the fact that we only do this on some frames.]
  row0.Scale(-self_repair_scale_ / repair_probability);
  in_deriv->AddVecToRows(1.0, row0, 1.0);
}
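
The Heaviside arithmetic in RepairGradients() reduces to a three-way decision per dimension. The sketch below spells that decision out on plain vectors. `RepairResult` and `SelfRepairOffsets` are hypothetical names, and treating the thresholds as fractions of the frame count is an assumption of this sketch; the random skipping of minibatches and the block reshaping are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: the per-dimension value that would be added to
// every row of in_deriv, plus the number of dimensions that were repaired.
struct RepairResult {
  std::vector<float> offset;
  int num_dims_repaired;
};

// deriv_sum[d] is the summed derivative for dimension d over 'count' frames.
// 'lower' and 'upper' are fractions of 'count' (an assumption of this sketch).
RepairResult SelfRepairOffsets(const std::vector<double> &deriv_sum,
                               double count, float lower, float upper,
                               float scale, float repair_probability) {
  assert(count > 0.0 && repair_probability > 0.0f && lower <= upper);
  RepairResult r{std::vector<float>(deriv_sum.size(), 0.0f), 0};
  for (std::size_t d = 0; d < deriv_sum.size(); ++d) {
    int sign = 0;
    if (deriv_sum[d] <= lower * count) sign = 1;       // rarely active: push up
    else if (deriv_sum[d] > upper * count) sign = -1;  // almost always active: push down
    if (sign != 0) ++r.num_dims_repaired;
    // Dividing by repair_probability compensates for running on only a
    // fraction of the minibatches.
    r.offset[d] = sign * scale / repair_probability;
  }
  return r;
}
```

Dimensions whose average derivative lies between the two thresholds receive a zero offset, so a unit that is active a moderate fraction of the time is left alone.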

◆ StoreStats()

void StoreStats ( const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
void *  memo 
)
virtual

This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity.

It only does something for those components that have nonzero Properties()&kStoresStats.

Parameters
[in] in_value: The input to the Propagate() function. Note: if the component sets the flag kPropagateInPlace, this should not be used; the empty matrix will be provided here if in-place propagation was used.
[in] out_value: The output of the Propagate() function.
[in] memo: The 'memo' returned by the Propagate() function; this will usually be NULL.

Reimplemented from Component.

Definition at line 1084 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::Heaviside(), kaldi::kUndefined, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), and kaldi::RandInt().

void RectifiedLinearComponent::StoreStats(
    const CuMatrixBase<BaseFloat> &in_value,
    const CuMatrixBase<BaseFloat> &out_value,
    void *memo) {
  // Only store stats about every other minibatch (but on the first minibatch,
  // always store it, which is necessary for the ConsolidateMemory() operation
  // to work correctly).
  if (RandInt(0, 1) == 0 && count_ != 0)
    return;
  CuMatrix<BaseFloat> temp_deriv(out_value.NumRows(),
                                 out_value.NumCols(),
                                 kUndefined);
  temp_deriv.Heaviside(out_value);
  StoreStatsInternal(out_value, &temp_deriv);
}
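
A minimal sketch of the kind of accumulation that StoreStats() feeds, assuming the statistics are per-dimension sums of activations and of the Heaviside derivative plus a frame count. That reading of StoreStatsInternal() is an assumption; `ReluStats` and `AccumulateReluStats` are hypothetical names, and the random skipping of minibatches is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container mirroring value_sum_, deriv_sum_ and count_.
struct ReluStats {
  std::vector<double> value_sum;
  std::vector<double> deriv_sum;
  double count = 0.0;
};

// Adds one minibatch (rows = frames, columns = dimensions) to the stats.
void AccumulateReluStats(const std::vector<std::vector<float>> &out_value,
                         ReluStats *stats) {
  assert(stats != nullptr);
  if (out_value.empty()) return;
  const std::size_t dim = out_value[0].size();
  if (stats->value_sum.empty()) {
    stats->value_sum.assign(dim, 0.0);
    stats->deriv_sum.assign(dim, 0.0);
  }
  assert(stats->value_sum.size() == dim);
  for (const std::vector<float> &row : out_value) {
    assert(row.size() == dim);
    for (std::size_t d = 0; d < dim; ++d) {
      stats->value_sum[d] += row[d];                      // activation sum
      stats->deriv_sum[d] += row[d] > 0.0f ? 1.0 : 0.0;   // Heaviside derivative
    }
    stats->count += 1.0;  // one frame per row
  }
}
```

Dividing `deriv_sum[d]` by `count` then gives the fraction of frames on which dimension d was active, which is the quantity a self-repair threshold would be compared against.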

◆ Type()

virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 349 of file nnet-simple-component.h.

virtual std::string Type() const { return "RectifiedLinearComponent"; }

The documentation for this class was generated from the following files: