TanhComponent Class Reference

#include <nnet-simple-component.h>

Public Member Functions

 TanhComponent (const TanhComponent &other)
 
 TanhComponent ()
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
- Public Member Functions inherited from NonlinearComponent
 NonlinearComponent ()
 
 NonlinearComponent (const NonlinearComponent &other)
 
virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
virtual void Read (std::istream &is, bool binary)
 We implement Read at this level as it just needs the Type(). More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual void Scale (BaseFloat scale)
 When called on an UpdatableComponent, this virtual function scales the parameters by "scale". More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 When called on an UpdatableComponent, this virtual function adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation. More...
 
const CuVector< double > & ValueSum () const
 
const CuVector< double > & DerivSum () const
 
double Count () const
 
- Public Member Functions inherited from Component
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

void RepairGradients (const CuMatrixBase< BaseFloat > &out_value, CuMatrixBase< BaseFloat > *in_deriv, TanhComponent *to_update) const
 
TanhComponent & operator= (const TanhComponent &other)
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Types inherited from NonlinearComponent
enum  { kUnsetThreshold = -1000 }
 
- Protected Member Functions inherited from NonlinearComponent
void StoreStatsInternal (const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > *deriv=NULL)
 
void StoreBackpropStats (const CuMatrixBase< BaseFloat > &out_deriv)
 
const NonlinearComponent & operator= (const NonlinearComponent &other)
 
- Protected Attributes inherited from NonlinearComponent
int32 dim_
 
int32 block_dim_
 
CuVector< double > value_sum_
 
CuVector< double > deriv_sum_
 
double count_
 
CuVector< double > oderiv_sumsq_
 
double oderiv_count_
 
double num_dims_self_repaired_
 
double num_dims_processed_
 
BaseFloat self_repair_lower_threshold_
 
BaseFloat self_repair_upper_threshold_
 
BaseFloat self_repair_scale_
 

Detailed Description

Definition at line 282 of file nnet-simple-component.h.

Constructor & Destructor Documentation

◆ TanhComponent() [1/2]

TanhComponent ( const TanhComponent & other)
inline explicit

Definition at line 284 of file nnet-simple-component.h.

◆ TanhComponent() [2/2]

TanhComponent ( )
inline

Definition at line 285 of file nnet-simple-component.h.

285 { }

Member Function Documentation

◆ Backprop()

void Backprop ( const std::string &  debug_info,
const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
void *  memo,
Component *  to_update,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in] debug_info: The component name, to be printed out in any warning messages.
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in_value: The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in] out_value: The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in] out_deriv: The derivative at the output of this component.
[in] memo: This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out] to_update: If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out] in_deriv: The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 921 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::DiffTanh(), NVTX_RANGE, and NonlinearComponent::StoreBackpropStats().

929  {
930  NVTX_RANGE("TanhComponent::Backprop");
931  if (in_deriv != NULL) {
932  in_deriv->DiffTanh(out_value, out_deriv);
933  TanhComponent *to_update = dynamic_cast<TanhComponent*>(to_update_in);
934  if (to_update != NULL) {
935  RepairGradients(out_value, in_deriv, to_update);
936  to_update->StoreBackpropStats(out_deriv);
937  }
938  }
939 }
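
The core of Backprop() is the DiffTanh() call, which computes the input derivative from the tanh output and the output derivative. As a minimal sketch of that element-wise rule on plain std::vector data (the function name DiffTanhSketch is hypothetical and not part of Kaldi; the real code operates on CuMatrixBase):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise backprop through tanh, expressed in terms of the output y:
//   in_deriv[i] = out_deriv[i] * (1 - y[i]^2)
// Hypothetical stand-in for the matrix version used in the listing above.
std::vector<float> DiffTanhSketch(const std::vector<float> &out_value,
                                  const std::vector<float> &out_deriv) {
  assert(out_value.size() == out_deriv.size());
  std::vector<float> in_deriv(out_value.size());
  for (std::size_t i = 0; i < out_value.size(); ++i)
    in_deriv[i] = out_deriv[i] * (1.0f - out_value[i] * out_value[i]);
  return in_deriv;
}
```

Because the rule needs only the output value, the input matrix is not required for this step.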

◆ Copy()

virtual Component* Copy ( ) const
inline virtual

Copies component (deep copy).

Implements Component.

Definition at line 287 of file nnet-simple-component.h.

287 { return new TanhComponent(*this); }

◆ operator=()

TanhComponent& operator= ( const TanhComponent & other)
private

◆ Propagate()

void * Propagate ( const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in: The input to this component. Num-columns == InputDim().
[out] out: The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 844 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::Tanh().

846  {
847  // Apply tanh function to each element of the output...
848  // the tanh function may be written as -1 + ( 2 / (1 + e^{-2 x})),
849  // which is a scaled and shifted sigmoid.
850  out->Tanh(in);
851  return NULL;
852 }
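
The comment in the listing notes that tanh can be written as a scaled and shifted sigmoid. A small standalone sketch of that identity, using only the standard library (the helper names Sigmoid, TanhViaSigmoid and CheckIdentity are illustrative assumptions, not Kaldi functions):

```cpp
#include <cassert>
#include <cmath>

// Logistic sigmoid, 1 / (1 + e^{-x}).
inline double Sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// tanh(x) written as -1 + 2 / (1 + e^{-2x}), i.e. -1 + 2 * sigmoid(2x).
inline double TanhViaSigmoid(double x) { return -1.0 + 2.0 * Sigmoid(2.0 * x); }

// Asserts that the rewritten form agrees with std::tanh at x, up to rounding.
inline void CheckIdentity(double x) {
  assert(std::fabs(TanhViaSigmoid(x) - std::tanh(x)) < 1e-12);
}
```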

◆ Properties()

◆ RepairGradients()

void RepairGradients ( const CuMatrixBase< BaseFloat > &  out_value,
CuMatrixBase< BaseFloat > *  in_deriv,
TanhComponent *  to_update 
) const
private

Definition at line 855 of file nnet-simple-component.cc.

References CuVectorBase< Real >::Add(), CuMatrixBase< Real >::AddMatDiagVec(), CuVectorBase< Real >::AddVec(), CuMatrixBase< Real >::ApplyHeaviside(), NonlinearComponent::dim_, KALDI_ASSERT, KALDI_ERR, kaldi::kNoTrans, NonlinearComponent::num_dims_processed_, NonlinearComponent::num_dims_self_repaired_, kaldi::RandUniform(), and CuVectorBase< Real >::Sum().

858  {
859  KALDI_ASSERT(to_update != NULL);
860  // maximum possible derivative of the tanh is 1.0
861  // the default lower-threshold on the derivative, below which we
862  // add a term to the derivative to encourage the inputs to the tanh
863  // to be closer to zero, is 0.2, which means the derivative is on average
864  // 5 times smaller than its maximum possible value.
865  BaseFloat default_lower_threshold = 0.2;
866 
867  // we use this 'repair_probability' (hardcoded for now) to limit
868  // this code to running on about half of the minibatches.
869  BaseFloat repair_probability = 0.5;
870 
871  to_update->num_dims_processed_ += dim_;
872 
873  if (self_repair_scale_ == 0.0 || count_ == 0.0 || deriv_sum_.Dim() != dim_ ||
874  RandUniform() > repair_probability)
875  return;
876 
877  // check that the self-repair scale is in a reasonable range.
879  BaseFloat unset = kUnsetThreshold; // -1000.0
880  BaseFloat lower_threshold = (self_repair_lower_threshold_ == unset ?
881  default_lower_threshold :
882  self_repair_lower_threshold_) *
883  count_;
884  if (self_repair_upper_threshold_ != unset) {
885  KALDI_ERR << "Do not set the self-repair-upper-threshold for tanh "
886  << "components, it does nothing.";
887  }
888 
889  // thresholds_vec is actually a 1-row matrix. (the ApplyHeaviside
890  // function isn't defined for vectors).
891  CuMatrix<BaseFloat> thresholds(1, dim_);
892  CuSubVector<BaseFloat> thresholds_vec(thresholds, 0);
893  thresholds_vec.AddVec(-1.0, deriv_sum_);
894  thresholds_vec.Add(lower_threshold);
895  thresholds.ApplyHeaviside();
896  to_update->num_dims_self_repaired_ += thresholds_vec.Sum();
897 
898  // At this point, 'thresholds_vec' contains a 1 for each dimension of
899  // the output that is 'problematic', i.e. for which the avg-deriv
900  // is less than the self-repair lower threshold, and a 0 for
901  // each dimension that is not problematic.
902 
903  // what we want to do is to add (-self_repair_scale_ / repair_probability times
904  // output-value) to the input derivative for each problematic dimension.
905  // note that for the tanh, the output-value goes from -1.0 when the input is
906  // -inf to +1.0 when the input is +inf. The negative sign is so that for
907  // inputs <0, we push them up towards 0, and for inputs >0, we push them down
908  // towards 0. Our use of the tanh here is just a convenience since we have it
909  // available. We could use just about any function that is positive for
910  // inputs < 0 and negative for inputs > 0.
911 
912  // We can rearrange the above as: for only the problematic columns,
913  // input-deriv -= self-repair-scale / repair-probability * output
914  // which we can write as:
915  // input-deriv -= self-repair-scale / repair-probability * output * thresholds-vec
916 
917  in_deriv->AddMatDiagVec(-self_repair_scale_ / repair_probability,
918  out_value, kNoTrans, thresholds_vec);
919 }
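
The self-repair step in RepairGradients() can be read as two operations: build a 0/1 mask of "problematic" dimensions, then subtract a scaled copy of the output from the input derivative in those columns only. The sketch below restates this on plain nested vectors. The names Matrix, ProblematicDims and SelfRepairSketch are hypothetical, and the sketch assumes the lower threshold is multiplied by the frame count before being compared against the summed derivative, as the `count_` factor in the listing suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rows are frames, columns are dimensions (hypothetical stand-in for CuMatrix).
using Matrix = std::vector<std::vector<float>>;

// Returns 1.0 for each dimension whose summed derivative is below
// lower_threshold * count, else 0.0 (a Heaviside step of the difference).
std::vector<float> ProblematicDims(const std::vector<double> &deriv_sum,
                                   double count, float lower_threshold) {
  std::vector<float> mask(deriv_sum.size());
  for (std::size_t d = 0; d < deriv_sum.size(); ++d)
    mask[d] = (lower_threshold * count - deriv_sum[d] > 0.0) ? 1.0f : 0.0f;
  return mask;
}

// in_deriv += (-self_repair_scale / repair_probability) * out_value * mask,
// with the mask applied per column, so only problematic dimensions change.
void SelfRepairSketch(const Matrix &out_value, const std::vector<float> &mask,
                      float self_repair_scale, float repair_probability,
                      Matrix *in_deriv) {
  assert(in_deriv != nullptr && in_deriv->size() == out_value.size());
  const float alpha = -self_repair_scale / repair_probability;
  for (std::size_t r = 0; r < out_value.size(); ++r) {
    assert(out_value[r].size() == mask.size());
    assert((*in_deriv)[r].size() == mask.size());
    for (std::size_t c = 0; c < mask.size(); ++c)
      (*in_deriv)[r][c] += alpha * out_value[r][c] * mask[c];
  }
}
```

Dividing by the repair probability compensates for the fact that the repair term is applied on only a fraction of minibatches, so its expected magnitude stays at self_repair_scale.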

◆ StoreStats()

void StoreStats ( const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
void *  memo 
)
virtual

This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity.

It only does something for those components that have nonzero Properties()&kStoresStats.

Parameters
[in] in_value: The input to the Propagate() function. Note: if the component sets the flag kPropagateInPlace, this should not be used; the empty matrix will be provided here if in-place propagation was used.
[in] out_value: The output of the Propagate() function.
[in] memo: The 'memo' returned by the Propagate() function; this will usually be NULL.

Reimplemented from Component.

Definition at line 948 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::Add(), CuMatrixBase< Real >::ApplyPow(), kaldi::RandInt(), and CuMatrixBase< Real >::Scale().

950  {
951  // Only store stats about every other minibatch (but on the first minibatch,
952  // always store it, which is necessary for the ConsolidateMemory() operation
953  // to work correctly).
954  if (RandInt(0, 1) == 0 && count_ != 0)
955  return;
956  // derivative of the nonlinearity is 1.0 - out_value^2;
957  CuMatrix<BaseFloat> temp_deriv(out_value);
958  temp_deriv.ApplyPow(2.0);
959  temp_deriv.Scale(-1.0);
960  temp_deriv.Add(1.0);
961  StoreStatsInternal(out_value, &temp_deriv);
962 }
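
For reference, the derivative that StoreStats() accumulates follows from the standard identity for tanh, and the three matrix calls in the listing map onto it one step at a time. The symbols x (input) and y (output) are introduced here for illustration only:

```latex
y = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}},
\qquad
\frac{dy}{dx} = 1 - \tanh^{2}(x) = 1 - y^{2}

% Mapping onto the listing, element-wise on temp_deriv (initialised to y):
%   ApplyPow(2.0) : y      -> y^{2}
%   Scale(-1.0)   : y^{2}  -> -y^{2}
%   Add(1.0)      : -y^{2} -> 1 - y^{2}
```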

◆ Type()

virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 286 of file nnet-simple-component.h.

286 { return "TanhComponent"; }

The documentation for this class was generated from the following files: