ClipGradientComponent Class Reference

#include <nnet-simple-component.h>

Public Member Functions

 ClipGradientComponent (int32 dim, BaseFloat clipping_threshold, bool norm_based_clipping, BaseFloat self_repair_clipped_proportion_threshold, BaseFloat self_repair_target, BaseFloat self_repair_scale, int32 num_clipped, int32 count, int32 num_self_repaired, int32 num_backpropped)
 
 ClipGradientComponent ()
 
virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
void Init (int32 dim, BaseFloat clipping_threshold, bool norm_based_clipping, BaseFloat self_repair_clipped_proportion_threshold, BaseFloat self_repair_target, BaseFloat self_repair_scale, int32 num_clipped, int32 count, int32 num_self_repaired, int32 num_backpropped)
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void Scale (BaseFloat scale)
 For an UpdatableComponent, scales the parameters by "scale"; for a component that stores stats, scales the stats instead. More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 For an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters; for a component that stores stats, adds the stats instead. More...
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual ~ClipGradientComponent ()
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...
 
 Component ()
 
virtual ~Component ()
 

Protected Attributes

int32 num_clipped_
 
int32 count_
 
int32 num_self_repaired_
 
int32 num_backpropped_
 

Private Member Functions

void RepairGradients (const std::string &debug_info, const CuMatrixBase< BaseFloat > &in_value, CuMatrixBase< BaseFloat > *in_deriv, ClipGradientComponent *to_update) const
 
ClipGradientComponent & operator= (const ClipGradientComponent &other)
 

Private Attributes

int32 dim_
 
BaseFloat clipping_threshold_
 
bool norm_based_clipping_
 
BaseFloat self_repair_clipped_proportion_threshold_
 
BaseFloat self_repair_target_
 
BaseFloat self_repair_scale_
 
std::string debug_info_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 

Detailed Description

Definition at line 1249 of file nnet-simple-component.h.

Constructor & Destructor Documentation

ClipGradientComponent ( int32  dim,
BaseFloat  clipping_threshold,
bool  norm_based_clipping,
BaseFloat  self_repair_clipped_proportion_threshold,
BaseFloat  self_repair_target,
BaseFloat  self_repair_scale,
int32  num_clipped,
int32  count,
int32  num_self_repaired,
int32  num_backpropped 
)
inline

Definition at line 1251 of file nnet-simple-component.h.

References ClipGradientComponent::Init().

1259  {
1260  Init(dim, clipping_threshold, norm_based_clipping,
1261  self_repair_clipped_proportion_threshold,
1262  self_repair_target,
1263  self_repair_scale,
1264  num_clipped, count,
1265  num_self_repaired, num_backpropped);}
void Init(int32 dim, BaseFloat clipping_threshold, bool norm_based_clipping, BaseFloat self_repair_clipped_proportion_threshold, BaseFloat self_repair_target, BaseFloat self_repair_scale, int32 num_clipped, int32 count, int32 num_self_repaired, int32 num_backpropped)

Definition at line 1267 of file nnet-simple-component.h.

Referenced by ClipGradientComponent::Copy().

virtual ~ClipGradientComponent ( )
inline virtual

Definition at line 1325 of file nnet-simple-component.h.

References ClipGradientComponent::debug_info_, KALDI_LOG, ClipGradientComponent::num_backpropped_, and ClipGradientComponent::num_self_repaired_.

1325  {
1326  if (num_self_repaired_ > 0)
1327  KALDI_LOG << "ClipGradientComponent(node_name=" << debug_info_
1328  << ")'s self-repair was activated " << num_self_repaired_
1329  << " time(s) out of " << num_backpropped_
1330  << " times of calling Backprop() in this training job.";
1331  }
#define KALDI_LOG
Definition: kaldi-error.h:133

Member Function Documentation

void Add ( BaseFloat  alpha,
const Component & other
)
virtual

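What this virtual function does depends on the component it is called on. For an UpdatableComponent, it adds the parameters of another updatable component, times some constant, to the current parameters.

For a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it adds stats. Otherwise it will normally do nothing. ClipGradientComponent falls in the second group: it adds alpha times the other component's count_ and num_clipped_ to its own.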

Reimplemented from Component.

Definition at line 752 of file nnet-simple-component.cc.

References ClipGradientComponent::count_, KALDI_ASSERT, and ClipGradientComponent::num_clipped_.

752  {
753  const ClipGradientComponent *other =
754  dynamic_cast<const ClipGradientComponent*>(&other_in);
755  KALDI_ASSERT(other != NULL);
756  count_ += alpha * other->count_;
757  num_clipped_ += alpha * other->num_clipped_;
758 }
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
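
The accumulation done by Add() can be sketched in isolation as follows. This is a minimal illustration, not the Kaldi class: the struct ClipStats and its ClippedProportion() helper are hypothetical names, and only the two counters touched by the listing are modelled. It also makes visible that the float product is converted back to an integer, as happens implicitly in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two clipping counters; not the Kaldi class.
struct ClipStats {
  int32_t count = 0;        // rows processed by Backprop in norm-based mode
  int32_t num_clipped = 0;  // how many of those rows were rescaled

  // Mirrors the Add() listing: accumulate alpha times the other stats.
  // The sum is computed in float and truncated when stored as int32_t.
  void Add(float alpha, const ClipStats &other) {
    assert(alpha >= 0.0f);
    count = static_cast<int32_t>(count + alpha * other.count);
    num_clipped = static_cast<int32_t>(num_clipped + alpha * other.num_clipped);
  }

  // Same ratio that Info() reports as "clipped-proportion".
  double ClippedProportion() const {
    return count > 0 ? static_cast<double>(num_clipped) / count : 0.0;
  }
};
```

With alpha = 1.0 this simply pools the counters of two training jobs, so the reported clipped proportion becomes the proportion over both jobs together.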
void Backprop ( const std::string &  debug_info,
const ComponentPrecomputedIndexes * indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
void *  memo,
Component * to_update,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in] debug_info: The component name, to be printed out in any warning messages.
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in_value: The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in] out_value: The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in] out_deriv: The derivative at the output of this component.
[in] memo: This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out] to_update: If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out] in_deriv: The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 597 of file nnet-simple-component.cc.

References CuVectorBase< Real >::AddDiagMat2(), CuMatrixBase< Real >::ApplyCeiling(), CuMatrixBase< Real >::ApplyFloor(), ClipGradientComponent::clipping_threshold_, CuMatrixBase< Real >::CopyFromMat(), ClipGradientComponent::count_, kaldi::kNoTrans, CuMatrixBase< Real >::MulRowsVec(), ClipGradientComponent::norm_based_clipping_, ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, CuMatrixBase< Real >::NumRows(), ClipGradientComponent::RepairGradients(), and CuMatrixBase< Real >::SetZero().

605  {
606  // the following statement will do nothing if in_deriv and out_deriv have same
607  // memory.
608  in_deriv->CopyFromMat(out_deriv);
609 
610  ClipGradientComponent *to_update =
611  dynamic_cast<ClipGradientComponent*>(to_update_in);
612 
613  if (clipping_threshold_ > 0) {
614  if (norm_based_clipping_) {
615  // each row in the derivative matrix, which corresponds to one sample in
616  // the mini-batch, is scaled so that its 2-norm is at most clipping_threshold_
617  CuVector<BaseFloat> clipping_scales(in_deriv->NumRows());
618  clipping_scales.AddDiagMat2(pow(clipping_threshold_, -2), *in_deriv,
619  kNoTrans, 0.0);
620  // now clipping_scales contains the squared (norm of each row divided by
621  // clipping_threshold)
622  int32 num_not_scaled;
623  clipping_scales.ApplyFloor(1.0, &num_not_scaled);
624  // now clipping_scales contains max(1,
625  // squared-(norm/clipping_threshold))
626  if (num_not_scaled != clipping_scales.Dim()) {
627  clipping_scales.ApplyPow(-0.5);
628  // now clipping_scales contains min(1,
629  // clipping_threshold/vector_norm)
630  in_deriv->MulRowsVec(clipping_scales);
631  if (to_update != NULL)
632  to_update->num_clipped_ += (clipping_scales.Dim() - num_not_scaled);
633  }
634  if (to_update != NULL)
635  to_update->count_ += clipping_scales.Dim();
636  } else {
637  // each element of the derivative matrix is clipped to lie in
638  // [-clipping_threshold_, clipping_threshold_]
639  in_deriv->ApplyCeiling(clipping_threshold_);
640  in_deriv->ApplyFloor(-1 * clipping_threshold_);
641  }
642 
643  if (to_update != NULL) {
644  to_update->num_backpropped_ += 1;
645  RepairGradients(debug_info, in_value, in_deriv, to_update);
646  }
647  } else if (clipping_threshold_ == 0.0) {
648  in_deriv->SetZero();
649  }
650 }
void ApplyCeiling(Real ceiling_val)
Definition: cu-matrix.cc:2572
void CopyFromMat(const MatrixBase< OtherReal > &src, MatrixTransposeType trans=kNoTrans)
Definition: cu-matrix.cc:339
void MulRowsVec(const CuVectorBase< Real > &scale)
scale i'th row by scale[i]
Definition: cu-matrix.cc:779
void ApplyFloor(Real floor_val)
Definition: cu-matrix.cc:2554
MatrixIndexT NumRows() const
Dimensions.
Definition: cu-matrix.h:214
void SetZero()
Math operations, some calling kernels.
Definition: cu-matrix.cc:476
void RepairGradients(const std::string &debug_info, const CuMatrixBase< BaseFloat > &in_value, CuMatrixBase< BaseFloat > *in_deriv, ClipGradientComponent *to_update) const
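
The two clipping modes of Backprop() can be sketched on plain CPU containers as follows. This is an illustration under stated assumptions, not the Kaldi implementation: Matrix is a hypothetical alias for a vector of rows, the function names ClipRowsByNorm and ClipElements are invented here, and the statistics bookkeeping and self-repair step are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical dense matrix type: one row per minibatch sample.
using Matrix = std::vector<std::vector<float>>;

// Norm-based mode: rescale any row whose 2-norm exceeds `threshold` so that
// its norm becomes `threshold`. Returns the number of rows that were rescaled.
int ClipRowsByNorm(Matrix *deriv, float threshold) {
  assert(deriv != nullptr && threshold > 0.0f);
  int num_clipped = 0;
  for (auto &row : *deriv) {
    float sumsq = 0.0f;
    for (float v : row) sumsq += v * v;
    // max(1, (norm / threshold)^2); raising it to the power -0.5 gives
    // min(1, threshold / norm), the factor applied to the row.
    float ratio_sq = std::max(1.0f, sumsq / (threshold * threshold));
    if (ratio_sq > 1.0f) {
      float scale = 1.0f / std::sqrt(ratio_sq);
      for (float &v : row) v *= scale;
      ++num_clipped;
    }
  }
  return num_clipped;
}

// Element-wise mode: clamp every entry to [-threshold, threshold].
void ClipElements(Matrix *deriv, float threshold) {
  assert(deriv != nullptr && threshold > 0.0f);
  for (auto &row : *deriv)
    for (float &v : row) v = std::min(threshold, std::max(-threshold, v));
}
```

Norm-based clipping preserves the direction of each sample's gradient, while element-wise clipping can change it; that is the practical difference between the two settings of norm_based_clipping_.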
virtual Component* Copy ( ) const
inline virtual

Copies component (deep copy).

Implements Component.

Definition at line 1294 of file nnet-simple-component.h.

References ClipGradientComponent::ClipGradientComponent(), ClipGradientComponent::clipping_threshold_, ClipGradientComponent::count_, ClipGradientComponent::dim_, ClipGradientComponent::norm_based_clipping_, ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, ClipGradientComponent::num_self_repaired_, ClipGradientComponent::self_repair_clipped_proportion_threshold_, ClipGradientComponent::self_repair_scale_, and ClipGradientComponent::self_repair_target_.

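1294  {
1295  return new ClipGradientComponent(dim_,
1296  clipping_threshold_,
1297  norm_based_clipping_,
1298  self_repair_clipped_proportion_threshold_,
1299  self_repair_target_,
1300  self_repair_scale_,
1301  num_clipped_,
1302  count_,
1303  num_self_repaired_,
1304  num_backpropped_);}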
std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from Component.

Definition at line 519 of file nnet-simple-component.cc.

References ClipGradientComponent::clipping_threshold_, ClipGradientComponent::count_, ClipGradientComponent::dim_, ClipGradientComponent::norm_based_clipping_, ClipGradientComponent::num_clipped_, ClipGradientComponent::self_repair_clipped_proportion_threshold_, ClipGradientComponent::self_repair_scale_, ClipGradientComponent::self_repair_target_, and ClipGradientComponent::Type().

519  {
520  std::ostringstream stream;
521  stream << Type() << ", dim=" << dim_
522  << ", norm-based-clipping="
523  << (norm_based_clipping_ ? "true" : "false")
524  << ", clipping-threshold=" << clipping_threshold_
525  << ", clipped-proportion="
526  << (count_ > 0 ? static_cast<BaseFloat>(num_clipped_)/count_ : 0);
527  if (self_repair_scale_ != 0.0)
528  stream << ", self-repair-clipped-proportion-threshold="
529  << self_repair_clipped_proportion_threshold_
530  << ", self-repair-target=" << self_repair_target_
531  << ", self-repair-scale=" << self_repair_scale_;
532  return stream.str();
533 }
virtual std::string Type() const
Returns a string such as "SigmoidComponent", describing the type of the object.
void Init ( int32  dim,
BaseFloat  clipping_threshold,
bool  norm_based_clipping,
BaseFloat  self_repair_clipped_proportion_threshold,
BaseFloat  self_repair_target,
BaseFloat  self_repair_scale,
int32  num_clipped,
int32  count,
int32  num_self_repaired,
int32  num_backpropped 
)

Definition at line 535 of file nnet-simple-component.cc.

References ClipGradientComponent::clipping_threshold_, count, ClipGradientComponent::count_, ClipGradientComponent::dim_, KALDI_ASSERT, ClipGradientComponent::norm_based_clipping_, ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, ClipGradientComponent::num_self_repaired_, ClipGradientComponent::self_repair_clipped_proportion_threshold_, ClipGradientComponent::self_repair_scale_, and ClipGradientComponent::self_repair_target_.

Referenced by ClipGradientComponent::ClipGradientComponent(), and ClipGradientComponent::InitFromConfig().

544  {
545  KALDI_ASSERT(clipping_threshold >= 0 && dim > 0 &&
546  self_repair_clipped_proportion_threshold >= 0.0 &&
547  self_repair_target >= 0.0 && self_repair_scale >= 0.0);
548  dim_ = dim;
549  norm_based_clipping_ = norm_based_clipping;
550  clipping_threshold_ = clipping_threshold;
551  self_repair_clipped_proportion_threshold_ =
552  self_repair_clipped_proportion_threshold;
553  self_repair_target_ = self_repair_target;
554  self_repair_scale_ = self_repair_scale;
555  num_clipped_ = num_clipped;
556  count_ = count;
557  num_self_repaired_ = num_self_repaired;
558  num_backpropped_ = num_backpropped;
559 }
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
void InitFromConfig ( ConfigLine * cfl )
virtual

Initialize, from a ConfigLine object.

Parameters
[in] cfl: A ConfigLine containing any parameters that are needed for initialization. For example: "dim=100 param-stddev=0.1"

Implements Component.

Definition at line 561 of file nnet-simple-component.cc.

References ConfigLine::GetValue(), ConfigLine::HasUnusedValues(), ClipGradientComponent::Init(), KALDI_ERR, ClipGradientComponent::Type(), and ConfigLine::WholeLine().

561  {
562  int32 dim = 0;
563  bool ok = cfl->GetValue("dim", &dim);
564  bool norm_based_clipping = false;
565  BaseFloat clipping_threshold = 15.0;
566  BaseFloat self_repair_clipped_proportion_threshold = 0.01;
567  BaseFloat self_repair_target = 0.0;
568  BaseFloat self_repair_scale = 1.0;
569  cfl->GetValue("clipping-threshold", &clipping_threshold);
570  cfl->GetValue("norm-based-clipping", &norm_based_clipping);
571  cfl->GetValue("self-repair-clipped-proportion-threshold",
572  &self_repair_clipped_proportion_threshold);
573  cfl->GetValue("self-repair-target",
574  &self_repair_target);
575  cfl->GetValue("self-repair-scale", &self_repair_scale);
576  if (!ok || cfl->HasUnusedValues() ||
577  clipping_threshold < 0 || dim <= 0 ||
578  self_repair_clipped_proportion_threshold < 0.0 ||
579  self_repair_target < 0.0 || self_repair_scale < 0.0)
580  KALDI_ERR << "Invalid initializer for layer of type "
581  << Type() << ": \"" << cfl->WholeLine() << "\"";
582  Init(dim, clipping_threshold, norm_based_clipping,
583  self_repair_clipped_proportion_threshold,
584  self_repair_target,
585  self_repair_scale, 0, 0, 0, 0);
586 }
void Init(int32 dim, BaseFloat clipping_threshold, bool norm_based_clipping, BaseFloat self_repair_clipped_proportion_threshold, BaseFloat self_repair_target, BaseFloat self_repair_scale, int32 num_clipped, int32 count, int32 num_self_repaired, int32 num_backpropped)
virtual std::string Type() const
Returns a string such as "SigmoidComponent", describing the type of the object.
float BaseFloat
Definition: kaldi-types.h:29
#define KALDI_ERR
Definition: kaldi-error.h:127
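
The defaults and validity checks applied by InitFromConfig() can be sketched with a small standalone parser. This is only an approximation for illustration: ClipGradientOptions and ParseClipGradientConfig are hypothetical names, the real code goes through Kaldi's ConfigLine, and this sketch assumes booleans are spelled "true" or "false" and returns false where the listing calls KALDI_ERR.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical option holder; defaults copied from the InitFromConfig listing.
struct ClipGradientOptions {
  int dim = 0;  // required, must be positive
  float clipping_threshold = 15.0f;
  bool norm_based_clipping = false;
  float self_repair_clipped_proportion_threshold = 0.01f;
  float self_repair_target = 0.0f;
  float self_repair_scale = 1.0f;
};

// Parses "key=value key=value ...". Returns false on a missing dim, an
// unknown key, a malformed value, or a value outside the accepted range.
bool ParseClipGradientConfig(const std::string &line, ClipGradientOptions *opts) {
  assert(opts != nullptr);
  std::map<std::string, std::string> kv;
  std::istringstream tokens(line);
  std::string tok;
  while (tokens >> tok) {
    std::size_t eq = tok.find('=');
    if (eq == std::string::npos || eq == 0) return false;
    kv[tok.substr(0, eq)] = tok.substr(eq + 1);
  }
  try {
    for (const auto &p : kv) {
      const std::string &key = p.first;
      const std::string &val = p.second;
      if (key == "dim") {
        opts->dim = std::stoi(val);
      } else if (key == "clipping-threshold") {
        opts->clipping_threshold = std::stof(val);
      } else if (key == "norm-based-clipping") {
        if (val != "true" && val != "false") return false;
        opts->norm_based_clipping = (val == "true");
      } else if (key == "self-repair-clipped-proportion-threshold") {
        opts->self_repair_clipped_proportion_threshold = std::stof(val);
      } else if (key == "self-repair-target") {
        opts->self_repair_target = std::stof(val);
      } else if (key == "self-repair-scale") {
        opts->self_repair_scale = std::stof(val);
      } else {
        return false;  // unknown key, analogous to HasUnusedValues()
      }
    }
  } catch (const std::exception &) {
    return false;  // std::stoi / std::stof could not parse the value
  }
  return opts->dim > 0 && opts->clipping_threshold >= 0.0f &&
         opts->self_repair_clipped_proportion_threshold >= 0.0f &&
         opts->self_repair_target >= 0.0f && opts->self_repair_scale >= 0.0f;
}
```

A line such as "dim=100 clipping-threshold=30 norm-based-clipping=true" is accepted and leaves the three self-repair options at their defaults; a line without dim is rejected.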
virtual int32 InputDim ( ) const
inline virtual

Returns input-dimension of this component.

Implements Component.

Definition at line 1275 of file nnet-simple-component.h.

References ClipGradientComponent::dim_.

ClipGradientComponent& operator= ( const ClipGradientComponent & other )
private
virtual int32 OutputDim ( ) const
inline virtual

Returns output-dimension of this component.

Implements Component.

Definition at line 1276 of file nnet-simple-component.h.

References ClipGradientComponent::dim_.

void * Propagate ( const ComponentPrecomputedIndexes * indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in: The input to this component. Num-columns == InputDim().
[out] out: The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 588 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::CopyFromMat().

591  {
592  out->CopyFromMat(in);
593  return NULL;
594 }
void CopyFromMat(const MatrixBase< OtherReal > &src, MatrixTransposeType trans=kNoTrans)
Definition: cu-matrix.cc:339
virtual int32 Properties ( ) const
inline virtual
void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Implements Component.

Definition at line 453 of file nnet-simple-component.cc.

References ClipGradientComponent::clipping_threshold_, ClipGradientComponent::count_, ClipGradientComponent::dim_, kaldi::nnet3::ExpectOneOrTwoTokens(), kaldi::nnet3::ExpectToken(), KALDI_ASSERT, ClipGradientComponent::norm_based_clipping_, ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, ClipGradientComponent::num_self_repaired_, kaldi::ReadBasicType(), kaldi::ReadToken(), ClipGradientComponent::self_repair_clipped_proportion_threshold_, ClipGradientComponent::self_repair_scale_, and ClipGradientComponent::self_repair_target_.

453  {
454  // might not see the "<ClipGradientComponent>" part because
455  // of how ReadNew() works.
456  ExpectOneOrTwoTokens(is, binary, "<ClipGradientComponent>",
457  "<Dim>");
458  ReadBasicType(is, binary, &dim_);
459  ExpectToken(is, binary, "<ClippingThreshold>");
460  ReadBasicType(is, binary, &clipping_threshold_);
461  ExpectToken(is, binary, "<NormBasedClipping>");
462  ReadBasicType(is, binary, &norm_based_clipping_);
463  std::string token;
464  ReadToken(is, binary, &token);
465  if (token == "<SelfRepairClippedProportionThreshold>") {
466  ReadBasicType(is, binary, &self_repair_clipped_proportion_threshold_);
467  ExpectToken(is, binary, "<SelfRepairTarget>");
468  ReadBasicType(is, binary, &self_repair_target_);
469  ExpectToken(is, binary, "<SelfRepairScale>");
470  ReadBasicType(is, binary, &self_repair_scale_);
471  ExpectToken(is, binary, "<NumElementsClipped>");
472  } else {
474  self_repair_target_ = 0.0;
475  self_repair_scale_ = 0.0;
476  KALDI_ASSERT(token == "<NumElementsClipped>");
477  }
478  ReadBasicType(is, binary, &num_clipped_);
479  ExpectToken(is, binary, "<NumElementsProcessed>");
480  ReadBasicType(is, binary, &count_);
481  ReadToken(is, binary, &token);
482  if (token == "<NumSelfRepaired>") {
483  ReadBasicType(is, binary, &num_self_repaired_);
484  ExpectToken(is, binary, "<NumBackpropped>");
485  ReadBasicType(is, binary, &num_backpropped_);
486  ExpectToken(is, binary, "</ClipGradientComponent>");
487  } else {
488  num_self_repaired_ = 0;
489  num_backpropped_ = 0;
490  KALDI_ASSERT(token == "</ClipGradientComponent>");
491  }
492 }
void ReadBasicType(std::istream &is, bool binary, T *t)
ReadBasicType is the name of the read function for bool, integer types, and floating-point types...
Definition: io-funcs-inl.h:55
void ExpectOneOrTwoTokens(std::istream &is, bool binary, const std::string &token1, const std::string &token2)
This function is like ExpectToken but for two tokens, and it will either accept token1 and then token...
Definition: nnet-parse.cc:224
void ReadToken(std::istream &is, bool binary, std::string *str)
ReadToken gets the next token and puts it in str (exception on failure).
Definition: io-funcs.cc:154
static void ExpectToken(const std::string &token, const std::string &what_we_are_parsing, const std::string **next_token)
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
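
Read() stays compatible with older model files by peeking at the next token and branching on it. The tail of that logic (the counters and the optional <NumSelfRepaired> / <NumBackpropped> pair) can be sketched for whitespace-separated text input as follows. ClipCounters, Expect and ReadCounters are hypothetical names, and the sketch returns false where the Kaldi helpers would raise an error.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical holder for the four counters read at the end of Read().
struct ClipCounters {
  int num_clipped = 0;
  int count = 0;
  int num_self_repaired = 0;
  int num_backpropped = 0;
};

// Reads one whitespace-delimited token and checks that it equals `expected`.
static bool Expect(std::istream &is, const std::string &expected) {
  std::string tok;
  return static_cast<bool>(is >> tok) && tok == expected;
}

// Accepts both layouts: with and without the self-repair counters.
bool ReadCounters(std::istream &is, ClipCounters *c) {
  assert(c != nullptr);
  if (!Expect(is, "<NumElementsClipped>") || !(is >> c->num_clipped)) return false;
  if (!Expect(is, "<NumElementsProcessed>") || !(is >> c->count)) return false;
  std::string tok;
  if (!(is >> tok)) return false;
  if (tok == "<NumSelfRepaired>") {
    // Newer layout: two more counters, then the closing token.
    if (!(is >> c->num_self_repaired)) return false;
    if (!Expect(is, "<NumBackpropped>") || !(is >> c->num_backpropped)) return false;
    return Expect(is, "</ClipGradientComponent>");
  }
  // Older layout: the counters are absent, so reset them to zero; the token
  // already consumed must then be the closing token.
  c->num_self_repaired = 0;
  c->num_backpropped = 0;
  return tok == "</ClipGradientComponent>";
}
```

Because the branch is decided by a token that has already been consumed, the fallback path checks that token directly instead of reading another one, which is the same pattern the listing uses with KALDI_ASSERT.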
void RepairGradients ( const std::string &  debug_info,
const CuMatrixBase< BaseFloat > &  in_value,
CuMatrixBase< BaseFloat > *  in_deriv,
ClipGradientComponent * to_update
) const
private

Definition at line 660 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::Add(), CuVectorBase< Real >::AddDiagMat2(), CuMatrixBase< Real >::AddMat(), CuMatrixBase< Real >::ApplyFloor(), CuMatrixBase< Real >::ApplyHeaviside(), CuMatrixBase< Real >::ApplyPowAbs(), ClipGradientComponent::count_, ClipGradientComponent::debug_info_, KALDI_ASSERT, KALDI_LOG, kaldi::kNoTrans, CuMatrixBase< Real >::MulElements(), ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, ClipGradientComponent::num_self_repaired_, CuMatrixBase< Real >::NumRows(), kaldi::RandUniform(), CuMatrixBase< Real >::Scale(), ClipGradientComponent::self_repair_clipped_proportion_threshold_, ClipGradientComponent::self_repair_scale_, and ClipGradientComponent::self_repair_target_.

Referenced by ClipGradientComponent::Backprop().

663  {
664  KALDI_ASSERT(to_update != NULL);
665 
666  // we use this 'repair_probability' (hardcoded for now) to limit
667  // this code to running on about half of the minibatches.
668  BaseFloat repair_probability = 0.5;
670  self_repair_scale_ == 0.0 || count_ == 0 ||
671  RandUniform() > repair_probability)
672  return;
673 
675 
676  BaseFloat clipped_proportion =
677  (count_ > 0 ? static_cast<BaseFloat>(num_clipped_) / count_ : 0);
678  // in-deriv would be modified only when clipped_proportion exceeds the
679  // threshold
680  if (clipped_proportion <= self_repair_clipped_proportion_threshold_)
681  return;
682 
683  to_update->num_self_repaired_ += 1;
684  if (to_update->debug_info_ == "") // get the component-node name
685  to_update->debug_info_ = debug_info;
686  if (to_update->num_self_repaired_ == 1)
687  KALDI_LOG << "ClipGradientComponent(node_name=" << debug_info
688  << ")'s self-repair was activated as the first time at the "
689  << to_update->num_backpropped_
690  << "-th call of Backprop() in this training job.";
691 
692  // sign_mat = sign(in_value), i.e.,
693  // An element in sign_mat is 1 if its corresponding element in in_value > 0,
694  // or -1 otherwise
695  CuMatrix<BaseFloat> sign_mat(in_value);
696  sign_mat.ApplyHeaviside();
697  sign_mat.Scale(2.0);
698  sign_mat.Add(-1.0);
699 
700  // repair_mat =
701  // floor(abs(in_value) - self_repair_target_, 0) .* sign(in_value)
702  CuMatrix<BaseFloat> repair_mat(in_value);
703  repair_mat.ApplyPowAbs(1.0);
704  repair_mat.Add(-self_repair_target_);
705  repair_mat.ApplyFloor(0.0);
706  repair_mat.MulElements(sign_mat);
707 
708  // magnitude =
709  // self_repair_scale_ * clipped_proportion * average norm of in-deriv
710  CuVector<BaseFloat> in_deriv_norm_vec(in_deriv->NumRows());
711  in_deriv_norm_vec.AddDiagMat2(1.0, *in_deriv, kNoTrans, 0.0);
712  in_deriv_norm_vec.ApplyPow(0.5);
713  double in_deriv_norm_sum = in_deriv_norm_vec.Sum();
714  BaseFloat magnitude = self_repair_scale_ * clipped_proportion *
715  (in_deriv_norm_sum / in_deriv_norm_vec.Dim());
716 
717  CuVector<BaseFloat> repair_mat_norm_vec(repair_mat.NumRows());
718  repair_mat_norm_vec.AddDiagMat2(1.0, repair_mat, kNoTrans, 0.0);
719  repair_mat_norm_vec.ApplyPow(0.5);
720  double repair_mat_norm_sum = repair_mat_norm_vec.Sum();
721  double scale = 0.0;
722  if (repair_mat_norm_sum != 0.0)
723  scale = magnitude / (repair_mat_norm_sum / repair_mat_norm_vec.Dim());
724  // repair_mat is scaled so that on average the rows have the norm
725  // (magnitude / repair_probability). This will give higher magnitude of
726  // self-repair to input vectors that have larger absolute value, which tend to
727  // be those that are diverging.
728  in_deriv->AddMat(-scale / repair_probability, repair_mat);
729  CuVector<BaseFloat> in_deriv_repaired_norm_vec(in_deriv->NumRows());
730  in_deriv_repaired_norm_vec.AddDiagMat2(1.0, *in_deriv, kNoTrans, 0.0);
731  in_deriv_repaired_norm_vec.ApplyPow(0.5);
732  // scale in_deriv to have the same norm as that before adding the self-repair
733  // term, in order to avoid increase of the norm caused by self-repair,
734  // which may incur more clip of gradient and thus more self-repair
735  double in_deriv_repaired_norm_sum = in_deriv_repaired_norm_vec.Sum();
736  if (in_deriv_repaired_norm_sum != 0.0)
737  in_deriv->Scale(in_deriv_norm_sum / in_deriv_repaired_norm_sum);
738 }
float RandUniform(struct RandomState *state=NULL)
Returns a random number strictly between 0 and 1.
Definition: kaldi-math.h:151
void Scale(Real value)
Definition: cu-matrix.cc:610
float BaseFloat
Definition: kaldi-types.h:29
MatrixIndexT NumRows() const
Dimensions.
Definition: cu-matrix.h:214
void AddMat(Real alpha, const CuMatrixBase< Real > &A, MatrixTransposeType trans=kNoTrans)
*this += alpha * A
Definition: cu-matrix.cc:941
#define KALDI_ASSERT(cond)
Definition: kaldi-error.h:169
#define KALDI_LOG
Definition: kaldi-error.h:133
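
The arithmetic in RepairGradients() can be sketched on plain CPU containers as follows. This is a simplified illustration, not the Kaldi code: Matrix, SumOfRowNorms, RepairTerm and ApplySelfRepair are hypothetical names, the random gating and the clipped-proportion threshold test are omitted, and clipped_proportion and repair_probability are passed in as plain arguments.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense matrix type: one row per minibatch sample.
using Matrix = std::vector<std::vector<float>>;

// Sum over rows of the row 2-norms.
double SumOfRowNorms(const Matrix &m) {
  double total = 0.0;
  for (const auto &row : m) {
    double sumsq = 0.0;
    for (float v : row) sumsq += static_cast<double>(v) * v;
    total += std::sqrt(sumsq);
  }
  return total;
}

// repair(i,j) = max(|x(i,j)| - target, 0) * sign(x(i,j)),
// where sign(x) is +1 for x > 0 and -1 otherwise, as in the listing.
Matrix RepairTerm(const Matrix &in_value, float target) {
  Matrix repair(in_value);
  for (auto &row : repair) {
    for (float &v : row) {
      float sign = (v > 0.0f) ? 1.0f : -1.0f;
      float excess = std::fabs(v) - target;
      v = (excess > 0.0f ? excess : 0.0f) * sign;
    }
  }
  return repair;
}

// Subtracts the scaled repair term from in_deriv, then rescales in_deriv so
// that the sum of its row norms is the same as before the repair.
void ApplySelfRepair(const Matrix &in_value, float target,
                     float self_repair_scale, float clipped_proportion,
                     float repair_probability, Matrix *in_deriv) {
  assert(in_deriv != nullptr && !in_value.empty());
  assert(in_deriv->size() == in_value.size() && repair_probability > 0.0f);
  const double num_rows = static_cast<double>(in_value.size());
  const double norm_before = SumOfRowNorms(*in_deriv);
  const double magnitude =
      self_repair_scale * clipped_proportion * (norm_before / num_rows);
  const Matrix repair = RepairTerm(in_value, target);
  const double repair_norm = SumOfRowNorms(repair);
  double scale = 0.0;
  if (repair_norm != 0.0) scale = magnitude / (repair_norm / num_rows);
  const float step = static_cast<float>(-scale / repair_probability);
  for (std::size_t i = 0; i < in_deriv->size(); ++i) {
    assert((*in_deriv)[i].size() == repair[i].size());
    for (std::size_t j = 0; j < repair[i].size(); ++j)
      (*in_deriv)[i][j] += step * repair[i][j];
  }
  const double norm_after = SumOfRowNorms(*in_deriv);
  if (norm_after != 0.0) {
    const float renorm = static_cast<float>(norm_before / norm_after);
    for (auto &row : *in_deriv)
      for (float &v : row) v *= renorm;
  }
}
```

The final rescaling is what keeps the repair from inflating the gradient norm: only the direction of in_deriv changes, pulling activations whose magnitude exceeds the target back towards it.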
void Scale ( BaseFloat  scale)
virtual

What this virtual function does depends on the component it is called on. For an UpdatableComponent, it scales the parameters by "scale".

For a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it scales the activation stats, not parameters. Otherwise it will normally do nothing.

Reimplemented from Component.

Definition at line 747 of file nnet-simple-component.cc.

References ClipGradientComponent::count_, and ClipGradientComponent::num_clipped_.

virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 1285 of file nnet-simple-component.h.

Referenced by ClipGradientComponent::Info(), and ClipGradientComponent::InitFromConfig().

1285 { return "ClipGradientComponent"; }
void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Implements Component.

Definition at line 494 of file nnet-simple-component.cc.

References ClipGradientComponent::clipping_threshold_, ClipGradientComponent::count_, ClipGradientComponent::dim_, ClipGradientComponent::norm_based_clipping_, ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, ClipGradientComponent::num_self_repaired_, ClipGradientComponent::self_repair_clipped_proportion_threshold_, ClipGradientComponent::self_repair_scale_, ClipGradientComponent::self_repair_target_, kaldi::WriteBasicType(), and kaldi::WriteToken().

494  {
495  WriteToken(os, binary, "<ClipGradientComponent>");
496  WriteToken(os, binary, "<Dim>");
497  WriteBasicType(os, binary, dim_);
498  WriteToken(os, binary, "<ClippingThreshold>");
499  WriteBasicType(os, binary, clipping_threshold_);
500  WriteToken(os, binary, "<NormBasedClipping>");
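501  WriteBasicType(os, binary, norm_based_clipping_);
502  WriteToken(os, binary, "<SelfRepairClippedProportionThreshold>");
503  WriteBasicType(os, binary, self_repair_clipped_proportion_threshold_);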
504  WriteToken(os, binary, "<SelfRepairTarget>");
505  WriteBasicType(os, binary, self_repair_target_);
506  WriteToken(os, binary, "<SelfRepairScale>");
507  WriteBasicType(os, binary, self_repair_scale_);
508  WriteToken(os, binary, "<NumElementsClipped>");
509  WriteBasicType(os, binary, num_clipped_);
510  WriteToken(os, binary, "<NumElementsProcessed>");
511  WriteBasicType(os, binary, count_);
512  WriteToken(os, binary, "<NumSelfRepaired>");
513  WriteBasicType(os, binary, num_self_repaired_);
514  WriteToken(os, binary, "<NumBackpropped>");
515  WriteBasicType(os, binary, num_backpropped_);
516  WriteToken(os, binary, "</ClipGradientComponent>");
517 }
void WriteToken(std::ostream &os, bool binary, const char *token)
The WriteToken functions are for writing nonempty sequences of non-space characters.
Definition: io-funcs.cc:134
void WriteBasicType(std::ostream &os, bool binary, T t)
WriteBasicType is the name of the write function for bool, integer types, and floating-point types...
Definition: io-funcs-inl.h:34
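
The token order produced by Write() can be sketched for text output as follows. This is an illustration only: ClipGradientState and WriteText are hypothetical names, and the exact text formatting (a single space after every token and value, booleans written as T or F) is an assumption about Kaldi's text mode rather than something the listing shows.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical snapshot of the fields that Write() serializes.
struct ClipGradientState {
  int dim = 0;
  float clipping_threshold = 0.0f;
  bool norm_based_clipping = false;
  float self_repair_clipped_proportion_threshold = 0.0f;
  float self_repair_target = 0.0f;
  float self_repair_scale = 0.0f;
  int num_clipped = 0;
  int count = 0;
  int num_self_repaired = 0;
  int num_backpropped = 0;
};

// Emits the fields in the same order as the Write() listing, each token and
// value followed by one space (assumed text-mode convention).
std::string WriteText(const ClipGradientState &s) {
  assert(s.dim > 0);
  std::ostringstream os;
  os << "<ClipGradientComponent> "
     << "<Dim> " << s.dim << ' '
     << "<ClippingThreshold> " << s.clipping_threshold << ' '
     << "<NormBasedClipping> " << (s.norm_based_clipping ? "T" : "F") << ' '
     << "<SelfRepairClippedProportionThreshold> "
     << s.self_repair_clipped_proportion_threshold << ' '
     << "<SelfRepairTarget> " << s.self_repair_target << ' '
     << "<SelfRepairScale> " << s.self_repair_scale << ' '
     << "<NumElementsClipped> " << s.num_clipped << ' '
     << "<NumElementsProcessed> " << s.count << ' '
     << "<NumSelfRepaired> " << s.num_self_repaired << ' '
     << "<NumBackpropped> " << s.num_backpropped << ' '
     << "</ClipGradientComponent> ";
  return os.str();
}
```

Write() always emits the full, newest layout; only Read() needs to handle the older layouts that lack the self-repair fields.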
void ZeroStats ( )
virtual

Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero.

Other components that store other types of statistics (e.g. regarding gradient clipping) should implement ZeroStats() also.

Reimplemented from Component.

Definition at line 740 of file nnet-simple-component.cc.

References ClipGradientComponent::count_, ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, and ClipGradientComponent::num_self_repaired_.

Member Data Documentation

std::string debug_info_
private

The documentation for this class was generated from the following files:

nnet-simple-component.h
nnet-simple-component.cc