SigmoidComponent Class Reference

#include <nnet-simple-component.h>

Inheritance: SigmoidComponent inherits NonlinearComponent, which inherits Component.

Public Member Functions

 SigmoidComponent (const SigmoidComponent &other)
 
 SigmoidComponent ()
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object.
 
virtual int32 Properties () const
 Return bitmask of the component's properties.
 
virtual Component * Copy () const
 Copies component (deep copy).
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function.
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.
 
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity.
 
- Public Member Functions inherited from NonlinearComponent
 NonlinearComponent ()
 
 NonlinearComponent (const NonlinearComponent &other)
 
virtual int32 InputDim () const
 Returns input-dimension of this component.
 
virtual int32 OutputDim () const
 Returns output-dimension of this component.
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object.
 
virtual void Read (std::istream &is, bool binary)
 We implement Read at this level as it just needs the Type().
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero.
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics.
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream.
 
virtual void Scale (BaseFloat scale)
 This virtual function, when called on an UpdatableComponent, scales the parameters by "scale".
 
virtual void Add (BaseFloat alpha, const Component &other)
 This virtual function, when called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters.
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation.
 
const CuVector< double > & ValueSum () const
 
const CuVector< double > & DerivSum () const
 
double Count () const
 
- Public Member Functions inherited from Component
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components.
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs.
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components.
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components.
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function.
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

void RepairGradients (const CuMatrixBase< BaseFloat > &out_value, CuMatrixBase< BaseFloat > *in_deriv, SigmoidComponent *to_update) const
 
SigmoidComponent & operator= (const SigmoidComponent &other)
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error.
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type.
 
- Protected Types inherited from NonlinearComponent
enum  { kUnsetThreshold = -1000 }
 
- Protected Member Functions inherited from NonlinearComponent
void StoreStatsInternal (const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > *deriv=NULL)
 
void StoreBackpropStats (const CuMatrixBase< BaseFloat > &out_deriv)
 
const NonlinearComponent & operator= (const NonlinearComponent &other)
 
- Protected Attributes inherited from NonlinearComponent
int32 dim_
 
int32 block_dim_
 
CuVector< double > value_sum_
 
CuVector< double > deriv_sum_
 
double count_
 
CuVector< double > oderiv_sumsq_
 
double oderiv_count_
 
double num_dims_self_repaired_
 
double num_dims_processed_
 
BaseFloat self_repair_lower_threshold_
 
BaseFloat self_repair_upper_threshold_
 
BaseFloat self_repair_scale_
 

Detailed Description

Definition at line 222 of file nnet-simple-component.h.

Constructor & Destructor Documentation

◆ SigmoidComponent() [1/2]

SigmoidComponent ( const SigmoidComponent & other)
inline explicit

Definition at line 224 of file nnet-simple-component.h.

◆ SigmoidComponent() [2/2]

SigmoidComponent ( )
inline

Definition at line 225 of file nnet-simple-component.h.

SigmoidComponent() { }

Member Function Documentation

◆ Backprop()

void Backprop ( const std::string & debug_info,
const ComponentPrecomputedIndexes * indexes,
const CuMatrixBase< BaseFloat > & in_value,
const CuMatrixBase< BaseFloat > & out_value,
const CuMatrixBase< BaseFloat > & out_deriv,
void * memo,
Component * to_update,
CuMatrixBase< BaseFloat > * in_deriv
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in] debug_info: The component name, to be printed out in any warning messages.
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in_value: The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in] out_value: The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in] out_deriv: The derivative at the output of this component.
[in] memo: This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out] to_update: If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out] in_deriv: The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 328 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::DiffSigmoid(), NVTX_RANGE, and NonlinearComponent::StoreBackpropStats().

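void SigmoidComponent::Backprop(const std::string &debug_info,
                                const ComponentPrecomputedIndexes *indexes,
                                const CuMatrixBase<BaseFloat> &,  // in_value (unused)
                                const CuMatrixBase<BaseFloat> &out_value,
                                const CuMatrixBase<BaseFloat> &out_deriv,
                                void *memo,
                                Component *to_update_in,
                                CuMatrixBase<BaseFloat> *in_deriv) const {
  NVTX_RANGE("SigmoidComponent::Backprop");
  if (in_deriv != NULL) {
    // in_deriv = out_deriv * out_value * (1 - out_value), element-wise.
    in_deriv->DiffSigmoid(out_value, out_deriv);
    SigmoidComponent *to_update = dynamic_cast<SigmoidComponent*>(to_update_in);
    if (to_update != NULL) {
      RepairGradients(out_value, in_deriv, to_update);
      to_update->StoreBackpropStats(out_deriv);
    }
  }
}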
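
To make the DiffSigmoid step concrete, here is a minimal standalone sketch of the same element-wise rule on plain std::vector<float> instead of CuMatrixBase. It relies on the identity that the sigmoid's derivative can be written in terms of its output y as y * (1 - y). The function name SigmoidBackward is hypothetical and is not part of Kaldi; self-repair and statistics are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Kaldi function).
// Computes in_deriv[i] = out_deriv[i] * y[i] * (1 - y[i]), where y is the
// sigmoid output saved from the forward pass.
void SigmoidBackward(const std::vector<float> &out_value,
                     const std::vector<float> &out_deriv,
                     std::vector<float> *in_deriv) {
  assert(in_deriv != nullptr);
  assert(out_value.size() == out_deriv.size());
  in_deriv->resize(out_value.size());
  for (std::size_t i = 0; i < out_value.size(); ++i) {
    const float y = out_value[i];
    (*in_deriv)[i] = out_deriv[i] * y * (1.0f - y);
  }
}
```

Because the rule only needs the output y, a component like this can run backprop without keeping the original input around, which is consistent with the in_value argument being unnamed in the signature above.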

◆ Copy()

virtual Component* Copy ( ) const
inline virtual

Copies component (deep copy).

Implements Component.

Definition at line 230 of file nnet-simple-component.h.

References PnormComponent::Backprop(), PnormComponent::Propagate(), and Component::StoreStats().

◆ operator=()

SigmoidComponent& operator= ( const SigmoidComponent & other)
private

◆ Propagate()

void * Propagate ( const ComponentPrecomputedIndexes * indexes,
const CuMatrixBase< BaseFloat > & in,
CuMatrixBase< BaseFloat > * out
) const
virtual

Propagate function.

Parameters
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in: The input to this component. Num-columns == InputDim().
[out] out: The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 321 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::Sigmoid().

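void* SigmoidComponent::Propagate(const ComponentPrecomputedIndexes *indexes,
                                  const CuMatrixBase<BaseFloat> &in,
                                  CuMatrixBase<BaseFloat> *out) const {
  out->Sigmoid(in);  // element-wise logistic sigmoid
  return NULL;       // this component does not use a memo
}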
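
As a rough illustration of what the element-wise Sigmoid call computes, the following is a minimal standalone sketch on plain std::vector<float> rather than CuMatrixBase. The helper name SigmoidForward is hypothetical and is not part of Kaldi.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Kaldi function).
// Applies the logistic sigmoid y = 1 / (1 + exp(-x)) to every element of
// 'in' and writes the result to 'out', overwriting its previous contents.
void SigmoidForward(const std::vector<float> &in, std::vector<float> *out) {
  assert(out != nullptr);
  out->resize(in.size());
  for (std::size_t i = 0; i < in.size(); ++i)
    (*out)[i] = 1.0f / (1.0f + std::exp(-in[i]));
}
```

The sketch sets the output rather than adding to it, which matches the behaviour described above for components that do not set kPropagateAdds.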

◆ Properties()

virtual int32 Properties ( ) const
inline virtual

◆ RepairGradients()

void RepairGradients ( const CuMatrixBase< BaseFloat > & out_value,
CuMatrixBase< BaseFloat > * in_deriv,
SigmoidComponent * to_update
) const
private

Definition at line 347 of file nnet-simple-component.cc.

References CuVectorBase< Real >::Add(), CuMatrixBase< Real >::AddMatDiagVec(), CuVectorBase< Real >::AddVec(), CuMatrixBase< Real >::AddVecToRows(), CuMatrixBase< Real >::ApplyHeaviside(), DropoutComponent::dim_, KALDI_ASSERT, KALDI_ERR, kaldi::kNoTrans, NonlinearComponent::num_dims_processed_, NonlinearComponent::num_dims_self_repaired_, kaldi::RandUniform(), and CuVectorBase< Real >::Sum().

void SigmoidComponent::RepairGradients(
    const CuMatrixBase<BaseFloat> &out_value,
    CuMatrixBase<BaseFloat> *in_deriv,
    SigmoidComponent *to_update) const {
  KALDI_ASSERT(to_update != NULL);
  // The maximum possible derivative of the sigmoid is 0.25.
  // The default lower threshold on the derivative, below which we
  // add a term to the derivative to encourage the inputs to the sigmoid
  // to be closer to zero, is 0.05, which means the derivative is on average
  // 5 times smaller than its maximum possible value.
  BaseFloat default_lower_threshold = 0.05;

  // We use this 'repair_probability' (hardcoded for now) to limit
  // this code to running on about half of the minibatches.
  BaseFloat repair_probability = 0.5;

  to_update->num_dims_processed_ += dim_;

  if (self_repair_scale_ == 0.0 || count_ == 0.0 || deriv_sum_.Dim() != dim_ ||
      RandUniform() > repair_probability)
    return;

  // Check that the self-repair scale is in a reasonable range.
  KALDI_ASSERT(self_repair_scale_ > 0.0 && self_repair_scale_ < 0.1);
  BaseFloat unset = kUnsetThreshold; // -1000.0
  // Scale the threshold by count_ because deriv_sum_ is a sum, not an average.
  BaseFloat lower_threshold = (self_repair_lower_threshold_ == unset ?
                               default_lower_threshold :
                               self_repair_lower_threshold_) *
      count_;
  if (self_repair_upper_threshold_ != unset) {
    KALDI_ERR << "Do not set the self-repair-upper-threshold for sigmoid "
              << "components, it does nothing.";
  }

  // thresholds_vec is actually a 1-row matrix. (The ApplyHeaviside
  // function isn't defined for vectors.)
  CuMatrix<BaseFloat> thresholds(1, dim_);
  CuSubVector<BaseFloat> thresholds_vec(thresholds, 0);
  thresholds_vec.AddVec(-1.0, deriv_sum_);
  thresholds_vec.Add(lower_threshold);
  thresholds.ApplyHeaviside();
  to_update->num_dims_self_repaired_ += thresholds_vec.Sum();

  // At this point, 'thresholds_vec' contains a 1 for each dimension of
  // the output that is 'problematic', i.e. for which the avg-deriv
  // is less than the self-repair lower threshold, and a 0 for
  // each dimension that is not problematic.

  // What we want to do is to add
  // -self_repair_scale_ / repair_probability times (2 * output-value - 1.0)
  // to the input derivative for each problematic dimension.

  // Here, 2 * output - 1.0 is a version of the sigmoid that goes from -1.0 to
  // 1.0, like a tanh. The negative sign is so that for inputs < 0, we push them
  // up towards 0, and for inputs > 0, we push them down towards 0.
  // Our use of this sigmoid-type function here is just a convenience since
  // we have it available. We could use just about any function that is positive
  // for inputs < 0 and negative for inputs > 0.

  // We can rearrange the above as: for only the problematic columns,
  //   input-deriv -= 2 * self-repair-scale / repair-probability * output
  //   input-deriv += self-repair-scale / repair-probability
  // which we can write as:
  //   input-deriv -= 2 * self-repair-scale / repair-probability * output * thresholds-vec
  //   input-deriv += self-repair-scale / repair-probability * thresholds-vec

  in_deriv->AddMatDiagVec(-2.0 * self_repair_scale_ / repair_probability,
                          out_value, kNoTrans, thresholds_vec);
  in_deriv->AddVecToRows(self_repair_scale_ / repair_probability,
                         thresholds_vec);
}
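
The self-repair update described in the comments above can be sketched without any Kaldi types. The version below is an illustration under stated assumptions: it uses nested std::vector in place of CuMatrix, takes the accumulated statistics as arguments, and leaves out the random gating by repair_probability (the caller is assumed to have already decided to run the repair). The names SelfRepairSketch and Matrix are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // one inner vector per row

// Hypothetical standalone version of the self-repair update (not Kaldi code).
// Adds -s * (2 * y - 1) to in_deriv on every dimension whose accumulated
// derivative is below lower_threshold * count, where
// s = self_repair_scale / repair_probability.
// Returns the number of dimensions flagged as problematic.
int SelfRepairSketch(const Matrix &out_value,
                     const std::vector<double> &deriv_sum, double count,
                     float lower_threshold, float self_repair_scale,
                     float repair_probability, Matrix *in_deriv) {
  assert(in_deriv != nullptr && in_deriv->size() == out_value.size());
  const std::size_t dim = deriv_sum.size();

  // Heaviside step: mask[d] is 1 for problematic dimensions, else 0.
  std::vector<float> mask(dim, 0.0f);
  int num_repaired = 0;
  for (std::size_t d = 0; d < dim; ++d) {
    if (lower_threshold * count - deriv_sum[d] > 0.0) {
      mask[d] = 1.0f;
      ++num_repaired;
    }
  }

  const float s = self_repair_scale / repair_probability;
  for (std::size_t r = 0; r < out_value.size(); ++r) {
    assert(out_value[r].size() == dim && (*in_deriv)[r].size() == dim);
    for (std::size_t d = 0; d < dim; ++d) {
      // Rearranged form: -2 * s * y * mask + s * mask.
      (*in_deriv)[r][d] += mask[d] * (-2.0f * s * out_value[r][d] + s);
    }
  }
  return num_repaired;
}
```

Dividing the scale by repair_probability compensates for the repair running on only a fraction of minibatches, so the expected size of the added term stays at self_repair_scale.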

◆ StoreStats()

void StoreStats ( const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
void *  memo 
)
virtual

This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity.

It only does something for those components that have nonzero Properties()&kStoresStats.

Parameters
[in] in_value: The input to the Propagate() function. Note: if the component sets the flag kPropagateInPlace, this should not be used; the empty matrix will be provided here if in-place propagation was used.
[in] out_value: The output of the Propagate() function.
[in] memo: The 'memo' returned by the Propagate() function; this will usually be NULL.

Reimplemented from Component.

Definition at line 421 of file nnet-simple-component.cc.

References kaldi::kUndefined, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), kaldi::RandInt(), and CuMatrixBase< Real >::Set().

void SigmoidComponent::StoreStats(const CuMatrixBase<BaseFloat> &in_value,
                                  const CuMatrixBase<BaseFloat> &out_value,
                                  void *memo) {
  // Only store stats about every other minibatch (but on the first minibatch,
  // always store it, which is necessary for the ConsolidateMemory() operation
  // to work correctly).
  if (RandInt(0, 1) == 0 && count_ != 0)
    return;
  // The derivative of the nonlinearity is out_value * (1.0 - out_value).
  CuMatrix<BaseFloat> temp_deriv(out_value.NumRows(), out_value.NumCols(),
                                 kUndefined);
  temp_deriv.Set(1.0);
  temp_deriv.AddMat(-1.0, out_value);
  temp_deriv.MulElements(out_value);
  StoreStatsInternal(out_value, &temp_deriv);
}
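
The sketch below illustrates the kind of per-dimension statistics this function prepares. It assumes, without the source above confirming it, that StoreStatsInternal adds the column sums of out_value and of the derivative matrix to value_sum_ and deriv_sum_ and increases count_ by the number of rows. The struct SigmoidStats and its members are hypothetical names that only mirror those fields, and the random skipping of minibatches is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accumulator (not a Kaldi class), assuming per-column sums.
struct SigmoidStats {
  std::vector<double> value_sum;  // sum of outputs y per dimension
  std::vector<double> deriv_sum;  // sum of y * (1 - y) per dimension
  double count = 0.0;             // number of rows accumulated so far

  explicit SigmoidStats(std::size_t dim)
      : value_sum(dim, 0.0), deriv_sum(dim, 0.0) {}

  // 'out_value' is a minibatch of sigmoid outputs, one inner vector per row.
  void Accumulate(const std::vector<std::vector<float>> &out_value) {
    for (const std::vector<float> &row : out_value) {
      assert(row.size() == value_sum.size());
      for (std::size_t d = 0; d < row.size(); ++d) {
        const double y = row[d];
        value_sum[d] += y;
        deriv_sum[d] += y * (1.0 - y);  // same as (1 - y) * y in StoreStats
      }
      count += 1.0;
    }
  }
};
```

Under that assumption, deriv_sum[d] / count is the average derivative for dimension d, which is the quantity RepairGradients() compares against its lower threshold after scaling the threshold by count_.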

◆ Type()

virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 226 of file nnet-simple-component.h.

virtual std::string Type() const { return "SigmoidComponent"; }

The documentation for this class was generated from the following files:
nnet-simple-component.h
nnet-simple-component.cc