ClipGradientComponent Class Reference

#include <nnet-simple-component.h>


Public Member Functions

 ClipGradientComponent (int32 dim, BaseFloat clipping_threshold, bool norm_based_clipping, BaseFloat self_repair_clipped_proportion_threshold, BaseFloat self_repair_target, BaseFloat self_repair_scale, int32 num_clipped, int32 count, int32 num_self_repaired, int32 num_backpropped)
 
 ClipGradientComponent ()
 
virtual int32 InputDim () const
 Returns input-dimension of this component.
 
virtual int32 OutputDim () const
 Returns output-dimension of this component.
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object.
 
void Init (int32 dim, BaseFloat clipping_threshold, bool norm_based_clipping, BaseFloat self_repair_clipped_proportion_threshold, BaseFloat self_repair_target, BaseFloat self_repair_scale, int32 num_clipped, int32 count, int32 num_self_repaired, int32 num_backpropped)
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object.
 
virtual int32 Properties () const
 Return bitmask of the component's properties.
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero.
 
virtual Component * Copy () const
 Copies component (deep copy).
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function.
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.
 
virtual void Scale (BaseFloat scale)
 When called on an UpdatableComponent, scales the parameters by "scale"; when called on a component that stores stats, scales those stats.
 
virtual void Add (BaseFloat alpha, const Component &other)
 When called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters; when called on a component that stores stats, adds stats.
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream.
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics.
 
virtual ~ClipGradientComponent ()
 
Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity.
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components.
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs.
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components.
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components.
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function.
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation.
 
 Component ()
 
virtual ~Component ()
 

Protected Attributes

int32 num_clipped_
 
int32 count_
 
int32 num_self_repaired_
 
int32 num_backpropped_
 

Private Member Functions

void RepairGradients (const std::string &debug_info, const CuMatrixBase< BaseFloat > &in_value, CuMatrixBase< BaseFloat > *in_deriv, ClipGradientComponent *to_update) const
 
ClipGradientComponent & operator= (const ClipGradientComponent &other)
 

Private Attributes

int32 dim_
 
BaseFloat clipping_threshold_
 
bool norm_based_clipping_
 
BaseFloat self_repair_clipped_proportion_threshold_
 
BaseFloat self_repair_target_
 
BaseFloat self_repair_scale_
 
std::string debug_info_
 

Additional Inherited Members

Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error.
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type.
 

Detailed Description

Definition at line 1294 of file nnet-simple-component.h.

Constructor & Destructor Documentation

◆ ClipGradientComponent() [1/2]

ClipGradientComponent ( int32  dim,
BaseFloat  clipping_threshold,
bool  norm_based_clipping,
BaseFloat  self_repair_clipped_proportion_threshold,
BaseFloat  self_repair_target,
BaseFloat  self_repair_scale,
int32  num_clipped,
int32  count,
int32  num_self_repaired,
int32  num_backpropped 
)
inline

Definition at line 1296 of file nnet-simple-component.h.

References PnormComponent::Init().

1304  {
1305  Init(dim, clipping_threshold, norm_based_clipping,
1306  self_repair_clipped_proportion_threshold,
1307  self_repair_target,
1308  self_repair_scale,
1309  num_clipped, count,
1310  num_self_repaired, num_backpropped);}

◆ ClipGradientComponent() [2/2]

ClipGradientComponent ( )
inline

Definition at line 1312 of file nnet-simple-component.h.

◆ ~ClipGradientComponent()

virtual ~ClipGradientComponent ( )
inline virtual

Definition at line 1370 of file nnet-simple-component.h.

References KALDI_LOG.

1370  {
1371  if (num_self_repaired_ > 0)
1372  KALDI_LOG << "ClipGradientComponent(node_name=" << debug_info_
1373  << ")'s self-repair was activated " << num_self_repaired_
1374  << " time(s) out of " << num_backpropped_
1375  << " times of calling Backprop() in this training job.";
1376  }

Member Function Documentation

◆ Add()

void Add ( BaseFloat  alpha,
const Component &  other 
)
virtual

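What this virtual function does depends on the type of component it is called on. For an UpdatableComponent, it adds the parameters of another updatable component, times some constant, to the current parameters. For a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it relates to adding stats. Otherwise it will normally do nothing.

In ClipGradientComponent, as the listing below shows, it accumulates the clipping statistics count_ and num_clipped_ of the other component, each scaled by alpha.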

Reimplemented from Component.

Definition at line 836 of file nnet-simple-component.cc.

References ClipGradientComponent::count_, KALDI_ASSERT, and ClipGradientComponent::num_clipped_.

836  {
837  const ClipGradientComponent *other =
838  dynamic_cast<const ClipGradientComponent*>(&other_in);
839  KALDI_ASSERT(other != NULL);
840  count_ += alpha * other->count_;
841  num_clipped_ += alpha * other->num_clipped_;
842 }
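
The accumulation in the listing above can be illustrated outside Kaldi. The following is a minimal sketch, assuming a hypothetical `ClipStats` struct that holds only the two counters; it is not part of Kaldi. Because the counters are integers and `alpha` is a float, the scaled sum is converted back to an integer, so any fractional part is dropped.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two clipping counters of the component.
struct ClipStats {
  int32_t num_clipped = 0;  // rows that were rescaled by clipping
  int32_t count = 0;        // rows seen in total

  // Adds alpha times the other object's counters to this one. The sum is
  // computed in floating point and then truncated to an integer.
  void Add(float alpha, const ClipStats &other) {
    count = static_cast<int32_t>(count + alpha * other.count);
    num_clipped = static_cast<int32_t>(num_clipped + alpha * other.num_clipped);
  }

  // Fraction of rows that were clipped, or 0 if nothing has been counted.
  float ClippedProportion() const {
    assert(count >= 0 && num_clipped >= 0);
    return count > 0 ? static_cast<float>(num_clipped) / count : 0.0f;
  }
};
```

For example, adding counters (num_clipped=3, count=5) with alpha=0.5 to (num_clipped=2, count=10) yields (3, 12) after truncation, which is a clipped proportion of 0.25.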

◆ Backprop()

void Backprop ( const std::string &  debug_info,
const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
void *  memo,
Component *  to_update,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in]  debug_info  The component name, to be printed out in any warning messages.
[in]  indexes  A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in]  in_value  The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in]  out_value  The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in]  out_deriv  The derivative at the output of this component.
[in]  memo  This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out]  to_update  If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out]  in_deriv  The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 680 of file nnet-simple-component.cc.

References CuVectorBase< Real >::AddDiagMat2(), CuMatrixBase< Real >::ApplyCeiling(), CuMatrixBase< Real >::ApplyFloor(), CuMatrixBase< Real >::CopyFromMat(), ClipGradientComponent::count_, kaldi::kNoTrans, CuMatrixBase< Real >::MulRowsVec(), ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_clipped_, CuMatrixBase< Real >::NumRows(), NVTX_RANGE, and CuMatrixBase< Real >::SetZero().

688  {
689  NVTX_RANGE("ClipGradientComponent::Backprop");
690  // the following statement will do nothing if in_deriv and out_deriv have same
691  // memory.
692  in_deriv->CopyFromMat(out_deriv);
693 
694  ClipGradientComponent *to_update =
695  dynamic_cast<ClipGradientComponent*>(to_update_in);
696 
697  if (clipping_threshold_ > 0) {
698  if (norm_based_clipping_) {
699  // each row in the derivative matrix, which corresponds to one sample in
700  // the mini-batch, is scaled to have a max-norm of clipping_threshold_
701  CuVector<BaseFloat> clipping_scales(in_deriv->NumRows());
702  clipping_scales.AddDiagMat2(pow(clipping_threshold_, -2), *in_deriv,
703  kNoTrans, 0.0);
704  // now clipping_scales contains the squared (norm of each row divided by
705  // clipping_threshold)
706  int32 num_not_scaled;
707  clipping_scales.ApplyFloor(1.0, &num_not_scaled);
708  // now clipping_scales contains max(1,
709  // squared-(norm/clipping_threshold))
710  if (num_not_scaled != clipping_scales.Dim()) {
711  clipping_scales.ApplyPow(-0.5);
712  // now clipping_scales contains min(1,
713  // clipping_threshold/vector_norm)
714  in_deriv->MulRowsVec(clipping_scales);
715  if (to_update != NULL)
716  to_update->num_clipped_ += (clipping_scales.Dim() - num_not_scaled);
717  }
718  if (to_update != NULL)
719  to_update->count_ += clipping_scales.Dim();
720  } else {
721  // each element of the derivative matrix is clipped to lie in
722  // [-clipping_threshold_, clipping_threshold_]
723  in_deriv->ApplyCeiling(clipping_threshold_);
724  in_deriv->ApplyFloor(-1 * clipping_threshold_);
725  }
726 
727  if (to_update != NULL) {
728  to_update->num_backpropped_ += 1;
729  RepairGradients(debug_info, in_value, in_deriv, to_update);
730  }
731  } else if (clipping_threshold_ == 0.0) {
732  in_deriv->SetZero();
733  }
734 }
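
The two clipping modes in the listing above can be illustrated without Kaldi. The following is a minimal sketch, assuming a plain `std::vector`-based matrix in place of `CuMatrixBase`; the names `Matrix`, `ClipRowsByNorm`, and `ClipElements` are hypothetical and not part of Kaldi. It follows the same arithmetic: floor the squared ratio norm/threshold at 1, then scale each row by the inverse square root of that value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a minibatch of derivatives: one inner vector per row.
using Matrix = std::vector<std::vector<float>>;

// Norm-based clipping: any row whose 2-norm exceeds `threshold` is rescaled so
// that its norm equals `threshold`. Returns the number of rows rescaled.
int ClipRowsByNorm(Matrix *deriv, float threshold) {
  assert(deriv != nullptr && threshold > 0.0f);
  int num_clipped = 0;
  for (auto &row : *deriv) {
    float sumsq = 0.0f;
    for (float v : row) sumsq += v * v;
    // ratio = max(1, (norm / threshold)^2); rows within the threshold get 1.
    float ratio = std::max(1.0f, sumsq / (threshold * threshold));
    if (ratio > 1.0f) {
      float scale = 1.0f / std::sqrt(ratio);  // equals threshold / norm
      for (float &v : row) v *= scale;
      ++num_clipped;
    }
  }
  return num_clipped;
}

// Element-wise clipping: clamp every entry to [-threshold, threshold].
void ClipElements(Matrix *deriv, float threshold) {
  assert(deriv != nullptr && threshold >= 0.0f);
  for (auto &row : *deriv)
    for (float &v : row) v = std::max(-threshold, std::min(threshold, v));
}
```

With a threshold of 1.0, a row (3, 4) with norm 5 is scaled to (0.6, 0.8), while a row (0.3, 0.4) is left unchanged. Norm-based clipping preserves the direction of each row; element-wise clipping does not.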

◆ Copy()

virtual Component* Copy ( ) const
inline virtual

Copies component (deep copy).

Implements Component.

Definition at line 1339 of file nnet-simple-component.h.

References Component::Add(), PnormComponent::Backprop(), Component::Info(), PnormComponent::Propagate(), PnormComponent::Read(), Component::Scale(), and PnormComponent::Write().

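1339  {
1340  return new ClipGradientComponent(dim_,
1341  clipping_threshold_,
1342  norm_based_clipping_,
1343  self_repair_clipped_proportion_threshold_,
1344  self_repair_target_,
1345  self_repair_scale_,
1346  num_clipped_,
1347  count_,
1348  num_self_repaired_,
1349  num_backpropped_);}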

◆ Info()

std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from Component.

Definition at line 602 of file nnet-simple-component.cc.

References DropoutComponent::dim_, and DropoutComponent::Type().

602  {
603  std::ostringstream stream;
604  stream << Type() << ", dim=" << dim_
605  << ", norm-based-clipping="
606  << (norm_based_clipping_ ? "true" : "false")
607  << ", clipping-threshold=" << clipping_threshold_
608  << ", clipped-proportion="
609  << (count_ > 0 ? static_cast<BaseFloat>(num_clipped_)/count_ : 0);
610  if (self_repair_scale_ != 0.0)
611  stream << ", self-repair-clipped-proportion-threshold="
612  << self_repair_clipped_proportion_threshold_
613  << ", self-repair-target=" << self_repair_target_
614  << ", self-repair-scale=" << self_repair_scale_;
615  return stream.str();
616 }

◆ Init()

void Init ( int32  dim,
BaseFloat  clipping_threshold,
bool  norm_based_clipping,
BaseFloat  self_repair_clipped_proportion_threshold,
BaseFloat  self_repair_target,
BaseFloat  self_repair_scale,
int32  num_clipped,
int32  count,
int32  num_self_repaired,
int32  num_backpropped 
)

Definition at line 618 of file nnet-simple-component.cc.

References count, DropoutComponent::dim_, and KALDI_ASSERT.

627  {
628  KALDI_ASSERT(clipping_threshold >= 0 && dim > 0 &&
629  self_repair_clipped_proportion_threshold >= 0.0 &&
630  self_repair_target >= 0.0 && self_repair_scale >= 0.0);
631  dim_ = dim;
632  norm_based_clipping_ = norm_based_clipping;
633  clipping_threshold_ = clipping_threshold;
634  self_repair_clipped_proportion_threshold_ =
635  self_repair_clipped_proportion_threshold;
636  self_repair_target_ = self_repair_target;
637  self_repair_scale_ = self_repair_scale;
638  num_clipped_ = num_clipped;
639  count_ = count;
640  num_self_repaired_ = num_self_repaired;
641  num_backpropped_ = num_backpropped;
642 }

◆ InitFromConfig()

void InitFromConfig ( ConfigLine *  cfl )
virtual

Initialize, from a ConfigLine object.

Parameters
[in]  cfl  A ConfigLine containing any parameters that are needed for initialization. For example: "dim=100 param-stddev=0.1"

Implements Component.

Definition at line 644 of file nnet-simple-component.cc.

References ConfigLine::GetValue(), ConfigLine::HasUnusedValues(), DropoutComponent::Init(), KALDI_ERR, DropoutComponent::Type(), and ConfigLine::WholeLine().

644  {
645  int32 dim = 0;
646  bool ok = cfl->GetValue("dim", &dim);
647  bool norm_based_clipping = false;
648  BaseFloat clipping_threshold = 15.0;
649  BaseFloat self_repair_clipped_proportion_threshold = 0.01;
650  BaseFloat self_repair_target = 0.0;
651  BaseFloat self_repair_scale = 1.0;
652  cfl->GetValue("clipping-threshold", &clipping_threshold);
653  cfl->GetValue("norm-based-clipping", &norm_based_clipping);
654  cfl->GetValue("self-repair-clipped-proportion-threshold",
655  &self_repair_clipped_proportion_threshold);
656  cfl->GetValue("self-repair-target",
657  &self_repair_target);
658  cfl->GetValue("self-repair-scale", &self_repair_scale);
659  if (!ok || cfl->HasUnusedValues() ||
660  clipping_threshold < 0 || dim <= 0 ||
661  self_repair_clipped_proportion_threshold < 0.0 ||
662  self_repair_target < 0.0 || self_repair_scale < 0.0)
663  KALDI_ERR << "Invalid initializer for layer of type "
664  << Type() << ": \"" << cfl->WholeLine() << "\"";
665  Init(dim, clipping_threshold, norm_based_clipping,
666  self_repair_clipped_proportion_threshold,
667  self_repair_target,
668  self_repair_scale, 0, 0, 0, 0);
669 }
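
The pattern in the listing above is: start from defaults, override from key=value pairs, then reject the line if `dim` is missing, a key is unrecognized, or a value is out of range. The following is a minimal sketch of that pattern, assuming a hypothetical `ClipConfig` struct and `ParseClipConfig` function in place of Kaldi's `ConfigLine`; the default values are the ones shown in the listing.

```cpp
#include <cassert>
#include <exception>
#include <map>
#include <sstream>
#include <string>

// Hypothetical holder for the options; defaults are taken from the listing.
struct ClipConfig {
  int dim = 0;
  float clipping_threshold = 15.0f;
  bool norm_based_clipping = false;
  float self_repair_clipped_proportion_threshold = 0.01f;
  float self_repair_target = 0.0f;
  float self_repair_scale = 1.0f;
};

// Parses whitespace-separated key=value pairs into *cfg. Returns false if
// "dim" is missing, a key is unknown, or a value is malformed or out of range.
bool ParseClipConfig(const std::string &line, ClipConfig *cfg) {
  assert(cfg != nullptr);
  std::map<std::string, std::string> kv;
  std::istringstream iss(line);
  std::string tok;
  while (iss >> tok) {
    std::string::size_type eq = tok.find('=');
    if (eq == std::string::npos) return false;
    kv[tok.substr(0, eq)] = tok.substr(eq + 1);
  }
  if (kv.count("dim") == 0) return false;  // "dim" is the only required key
  try {
    for (const auto &p : kv) {
      const std::string &k = p.first, &v = p.second;
      if (k == "dim") cfg->dim = std::stoi(v);
      else if (k == "clipping-threshold") cfg->clipping_threshold = std::stof(v);
      else if (k == "norm-based-clipping") {
        if (v != "true" && v != "false") return false;
        cfg->norm_based_clipping = (v == "true");
      }
      else if (k == "self-repair-clipped-proportion-threshold")
        cfg->self_repair_clipped_proportion_threshold = std::stof(v);
      else if (k == "self-repair-target") cfg->self_repair_target = std::stof(v);
      else if (k == "self-repair-scale") cfg->self_repair_scale = std::stof(v);
      else return false;  // plays the role of the HasUnusedValues() check
    }
  } catch (const std::exception &) {
    return false;  // a value could not be converted to a number
  }
  return cfg->dim > 0 && cfg->clipping_threshold >= 0.0f &&
         cfg->self_repair_clipped_proportion_threshold >= 0.0f &&
         cfg->self_repair_target >= 0.0f && cfg->self_repair_scale >= 0.0f;
}
```

A line such as "dim=100 clipping-threshold=30 norm-based-clipping=true" is accepted and leaves the self-repair options at their defaults; a line without `dim`, or with a negative threshold, is rejected.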

◆ InputDim()

virtual int32 InputDim ( ) const
inline virtual

Returns input-dimension of this component.

Implements Component.

Definition at line 1320 of file nnet-simple-component.h.

◆ operator=()

ClipGradientComponent& operator= ( const ClipGradientComponent &  other )
private

◆ OutputDim()

virtual int32 OutputDim ( ) const
inline virtual

Returns output-dimension of this component.

Implements Component.

Definition at line 1321 of file nnet-simple-component.h.

References count, PnormComponent::Init(), and PnormComponent::InitFromConfig().

◆ Propagate()

void * Propagate ( const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in]  indexes  A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in]  in  The input to this component. Num-columns == InputDim().
[out]  out  The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 671 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::CopyFromMat().

674  {
675  out->CopyFromMat(in);
676  return NULL;
677 }

◆ Properties()

virtual int32 Properties ( ) const
inline virtual

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Implements Component.

Definition at line 536 of file nnet-simple-component.cc.

References DropoutComponent::dim_, kaldi::ExpectOneOrTwoTokens(), kaldi::nnet3::ExpectToken(), KALDI_ASSERT, kaldi::ReadBasicType(), and kaldi::ReadToken().

536  {
537  // might not see the "<ClipGradientComponent>" part because
538  // of how ReadNew() works.
539  ExpectOneOrTwoTokens(is, binary, "<ClipGradientComponent>",
540  "<Dim>");
541  ReadBasicType(is, binary, &dim_);
542  ExpectToken(is, binary, "<ClippingThreshold>");
543  ReadBasicType(is, binary, &clipping_threshold_);
544  ExpectToken(is, binary, "<NormBasedClipping>");
545  ReadBasicType(is, binary, &norm_based_clipping_);
546  std::string token;
547  ReadToken(is, binary, &token);
548  if (token == "<SelfRepairClippedProportionThreshold>") {
549  ReadBasicType(is, binary, &self_repair_clipped_proportion_threshold_);
550  ExpectToken(is, binary, "<SelfRepairTarget>");
551  ReadBasicType(is, binary, &self_repair_target_);
552  ExpectToken(is, binary, "<SelfRepairScale>");
553  ReadBasicType(is, binary, &self_repair_scale_);
554  ExpectToken(is, binary, "<NumElementsClipped>");
555  } else {
556  self_repair_clipped_proportion_threshold_ = 1.0;
557  self_repair_target_ = 0.0;
558  self_repair_scale_ = 0.0;
559  KALDI_ASSERT(token == "<NumElementsClipped>");
560  }
561  ReadBasicType(is, binary, &num_clipped_);
562  ExpectToken(is, binary, "<NumElementsProcessed>");
563  ReadBasicType(is, binary, &count_);
564  ReadToken(is, binary, &token);
565  if (token == "<NumSelfRepaired>") {
566  ReadBasicType(is, binary, &num_self_repaired_);
567  ExpectToken(is, binary, "<NumBackpropped>");
568  ReadBasicType(is, binary, &num_backpropped_);
569  ExpectToken(is, binary, "</ClipGradientComponent>");
570  } else {
571  num_self_repaired_ = 0;
572  num_backpropped_ = 0;
573  KALDI_ASSERT(token == "</ClipGradientComponent>");
574  }
575 }
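
The listing above reads a format in which the self-repair fields are optional, so that models written before those fields existed can still be loaded. The following is a minimal sketch of that optional-token pattern for text-mode input, assuming a hypothetical `SelfRepairFields` struct and `ReadSelfRepairFields` function; it uses plain stream extraction rather than Kaldi's `ReadToken` and `ReadBasicType`. The threshold value used for the older format is an assumption of this sketch, chosen to represent "self-repair disabled".

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical holder for the fields read by this part of the format.
struct SelfRepairFields {
  float threshold = 0.0f;
  float target = 0.0f;
  float scale = 0.0f;
  int num_clipped = 0;
};

// Reads either the newer layout
//   <SelfRepairClippedProportionThreshold> t <SelfRepairTarget> g
//   <SelfRepairScale> s <NumElementsClipped> n
// or the older layout
//   <NumElementsClipped> n
// Returns false on any unexpected token or read failure.
bool ReadSelfRepairFields(std::istream &is, SelfRepairFields *f) {
  assert(f != nullptr);
  std::string token;
  if (!(is >> token)) return false;
  if (token == "<SelfRepairClippedProportionThreshold>") {
    std::string t_target, t_scale;
    if (!(is >> f->threshold >> t_target >> f->target >> t_scale >> f->scale >>
          token))
      return false;
    if (t_target != "<SelfRepairTarget>" || t_scale != "<SelfRepairScale>")
      return false;
  } else {
    // Older layout: no self-repair fields, so fall back to "disabled".
    f->threshold = 1.0f;  // assumption of this sketch
    f->target = 0.0f;
    f->scale = 0.0f;
  }
  // In both layouts, `token` must now hold the <NumElementsClipped> marker.
  if (token != "<NumElementsClipped>") return false;
  return static_cast<bool>(is >> f->num_clipped);
}
```

The key design point is that the first token is read unconditionally and then inspected: the same variable either selects the newer branch or is checked against the marker that the older layout starts with.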

◆ RepairGradients()

void RepairGradients ( const std::string &  debug_info,
const CuMatrixBase< BaseFloat > &  in_value,
CuMatrixBase< BaseFloat > *  in_deriv,
ClipGradientComponent *  to_update 
) const
private

Definition at line 744 of file nnet-simple-component.cc.

References CuMatrixBase< Real >::Add(), CuVectorBase< Real >::AddDiagMat2(), CuMatrixBase< Real >::AddMat(), CuMatrixBase< Real >::ApplyFloor(), CuMatrixBase< Real >::ApplyHeaviside(), CuMatrixBase< Real >::ApplyPowAbs(), ClipGradientComponent::debug_info_, KALDI_ASSERT, KALDI_LOG, kaldi::kNoTrans, CuMatrixBase< Real >::MulElements(), ClipGradientComponent::num_backpropped_, ClipGradientComponent::num_self_repaired_, CuMatrixBase< Real >::NumRows(), kaldi::RandUniform(), and CuMatrixBase< Real >::Scale().

747  {
748  KALDI_ASSERT(to_update != NULL);
749 
750  // we use this 'repair_probability' (hardcoded for now) to limit
751  // this code to running on about half of the minibatches.
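752  BaseFloat repair_probability = 0.5;
753  if (self_repair_clipped_proportion_threshold_ >= 1.0 ||
754  self_repair_scale_ == 0.0 || count_ == 0 ||
755  RandUniform() > repair_probability)
756  return;
757 
758  KALDI_ASSERT(self_repair_target_ >= 0.0 && self_repair_scale_ > 0.0);
759 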
760  BaseFloat clipped_proportion =
761  (count_ > 0 ? static_cast<BaseFloat>(num_clipped_) / count_ : 0);
762  // in-deriv would be modified only when clipped_proportion exceeds the
763  // threshold
764  if (clipped_proportion <= self_repair_clipped_proportion_threshold_)
765  return;
766 
767  to_update->num_self_repaired_ += 1;
768  if (to_update->debug_info_ == "") // get the component-node name
769  to_update->debug_info_ = debug_info;
770  if (to_update->num_self_repaired_ == 1)
771  KALDI_LOG << "ClipGradientComponent(node_name=" << debug_info
772  << ")'s self-repair was activated as the first time at the "
773  << to_update->num_backpropped_
774  << "-th call of Backprop() in this training job.";
775 
776  // sign_mat = sign(in_value), i.e.,
777  // An element in sign_mat is 1 if its corresponding element in in_value > 0,
778  // or -1 otherwise
779  CuMatrix<BaseFloat> sign_mat(in_value);
780  sign_mat.ApplyHeaviside();
781  sign_mat.Scale(2.0);
782  sign_mat.Add(-1.0);
783 
784  // repair_mat =
785  // floor(abs(in_value) - self_repair_target_, 0) .* sign(in_value)
786  CuMatrix<BaseFloat> repair_mat(in_value);
787  repair_mat.ApplyPowAbs(1.0);
788  repair_mat.Add(-self_repair_target_);
789  repair_mat.ApplyFloor(0.0);
790  repair_mat.MulElements(sign_mat);
791 
792  // magnitude =
793  // self_repair_scale_ * clipped_proportion * average norm of in-deriv
794  CuVector<BaseFloat> in_deriv_norm_vec(in_deriv->NumRows());
795  in_deriv_norm_vec.AddDiagMat2(1.0, *in_deriv, kNoTrans, 0.0);
796  in_deriv_norm_vec.ApplyPow(0.5);
797  double in_deriv_norm_sum = in_deriv_norm_vec.Sum();
798  BaseFloat magnitude = self_repair_scale_ * clipped_proportion *
799  (in_deriv_norm_sum / in_deriv_norm_vec.Dim());
800 
801  CuVector<BaseFloat> repair_mat_norm_vec(repair_mat.NumRows());
802  repair_mat_norm_vec.AddDiagMat2(1.0, repair_mat, kNoTrans, 0.0);
803  repair_mat_norm_vec.ApplyPow(0.5);
804  double repair_mat_norm_sum = repair_mat_norm_vec.Sum();
805  double scale = 0.0;
806  if (repair_mat_norm_sum != 0.0)
807  scale = magnitude / (repair_mat_norm_sum / repair_mat_norm_vec.Dim());
808  // repair_mat is scaled so that on average the rows have the norm
809  // (magnitude / repair_probability). This will give higher magnitude of
810  // self-repair to input vectors that have larger absolute value, which tend to
811  // be those that are diverging.
812  in_deriv->AddMat(-scale / repair_probability, repair_mat);
813  CuVector<BaseFloat> in_deriv_repaired_norm_vec(in_deriv->NumRows());
814  in_deriv_repaired_norm_vec.AddDiagMat2(1.0, *in_deriv, kNoTrans, 0.0);
815  in_deriv_repaired_norm_vec.ApplyPow(0.5);
816  // scale in_deriv to have the same norm as that before adding the self-repair
817  // term, in order to avoid increase of the norm caused by self-repair,
818  // which may incur more clip of gradient and thus more self-repair
819  double in_deriv_repaired_norm_sum = in_deriv_repaired_norm_vec.Sum();
820  if (in_deriv_repaired_norm_sum != 0.0)
821  in_deriv->Scale(in_deriv_norm_sum / in_deriv_repaired_norm_sum);
822 }
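
The core arithmetic of the self-repair step in the listing above can be illustrated on a single row. The following is a minimal sketch, assuming `std::vector<float>` in place of Kaldi's CUDA matrix types; the function names are hypothetical. One simplification is labeled in the comments: the listing rescales the whole matrix by a ratio of summed row norms, whereas this sketch renormalizes one row.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sign as built in the listing (Heaviside, times 2, minus 1):
// +1 for strictly positive input, -1 otherwise (including zero).
inline float SignLikeHeaviside(float x) { return x > 0.0f ? 1.0f : -1.0f; }

// repair[i] = max(|x[i]| - target, 0) * sign(x[i]).
// Entries whose magnitude is within `target` contribute nothing.
std::vector<float> RepairTerm(const std::vector<float> &in_value, float target) {
  assert(target >= 0.0f);
  std::vector<float> out(in_value.size());
  for (std::size_t i = 0; i < in_value.size(); ++i)
    out[i] = std::max(std::fabs(in_value[i]) - target, 0.0f) *
             SignLikeHeaviside(in_value[i]);
  return out;
}

// Euclidean norm of a vector.
float Norm(const std::vector<float> &v) {
  float sumsq = 0.0f;
  for (float x : v) sumsq += x * x;
  return std::sqrt(sumsq);
}

// Subtracts scale * repair from deriv, then rescales deriv back to the norm it
// had before, so the repair changes its direction but not its length.
// Simplification: a single row, not a ratio of summed row norms.
void ApplyRepairKeepingNorm(std::vector<float> *deriv,
                            const std::vector<float> &repair, float scale) {
  assert(deriv != nullptr && deriv->size() == repair.size());
  float before = Norm(*deriv);
  for (std::size_t i = 0; i < deriv->size(); ++i)
    (*deriv)[i] -= scale * repair[i];
  float after = Norm(*deriv);
  if (after != 0.0f)
    for (float &x : *deriv) x *= before / after;
}
```

With a target of 1.0, the input (3, -2, 0.5) gives the repair term (2, -1, 0): only the entries whose magnitude exceeds the target are penalized, and larger excursions receive a proportionally larger correction.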

◆ Scale()

void Scale ( BaseFloat  scale)
virtual

What this virtual function does depends on the type of component it is called on. For an UpdatableComponent, it scales the parameters by "scale". For a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it relates to scaling activation stats, not parameters. Otherwise it will normally do nothing.

Reimplemented from Component.

Definition at line 831 of file nnet-simple-component.cc.

◆ Type()

virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 1330 of file nnet-simple-component.h.

1330 { return "ClipGradientComponent"; }

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Implements Component.

Definition at line 577 of file nnet-simple-component.cc.

References DropoutComponent::dim_, kaldi::WriteBasicType(), and kaldi::WriteToken().

577  {
578  WriteToken(os, binary, "<ClipGradientComponent>");
579  WriteToken(os, binary, "<Dim>");
580  WriteBasicType(os, binary, dim_);
581  WriteToken(os, binary, "<ClippingThreshold>");
582  WriteBasicType(os, binary, clipping_threshold_);
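583  WriteToken(os, binary, "<NormBasedClipping>");
584  WriteBasicType(os, binary, norm_based_clipping_);
585  WriteToken(os, binary, "<SelfRepairClippedProportionThreshold>");
586  WriteBasicType(os, binary, self_repair_clipped_proportion_threshold_);
587  WriteToken(os, binary, "<SelfRepairTarget>");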
588  WriteBasicType(os, binary, self_repair_target_);
589  WriteToken(os, binary, "<SelfRepairScale>");
590  WriteBasicType(os, binary, self_repair_scale_);
591  WriteToken(os, binary, "<NumElementsClipped>");
592  WriteBasicType(os, binary, num_clipped_);
593  WriteToken(os, binary, "<NumElementsProcessed>");
594  WriteBasicType(os, binary, count_);
595  WriteToken(os, binary, "<NumSelfRepaired>");
596  WriteBasicType(os, binary, num_self_repaired_);
597  WriteToken(os, binary, "<NumBackpropped>");
598  WriteBasicType(os, binary, num_backpropped_);
599  WriteToken(os, binary, "</ClipGradientComponent>");
600 }

◆ ZeroStats()

void ZeroStats ( )
virtual

Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero.

Other components that store other types of statistics (e.g. regarding gradient clipping) should implement ZeroStats() also.

Reimplemented from Component.

Definition at line 824 of file nnet-simple-component.cc.

Member Data Documentation

◆ clipping_threshold_

BaseFloat clipping_threshold_
private

Definition at line 1379 of file nnet-simple-component.h.

◆ count_

int32 count_
protected

◆ debug_info_

std::string debug_info_
private

Definition at line 1395 of file nnet-simple-component.h.

Referenced by ClipGradientComponent::RepairGradients().

◆ dim_

int32 dim_
private

Definition at line 1378 of file nnet-simple-component.h.

◆ norm_based_clipping_

bool norm_based_clipping_
private

Definition at line 1383 of file nnet-simple-component.h.

◆ num_backpropped_

int32 num_backpropped_
protected

◆ num_clipped_

int32 num_clipped_
protected

◆ num_self_repaired_

int32 num_self_repaired_
protected

Definition at line 1422 of file nnet-simple-component.h.

Referenced by ClipGradientComponent::RepairGradients().

◆ self_repair_clipped_proportion_threshold_

BaseFloat self_repair_clipped_proportion_threshold_
private

Definition at line 1388 of file nnet-simple-component.h.

◆ self_repair_scale_

BaseFloat self_repair_scale_
private

Definition at line 1394 of file nnet-simple-component.h.

◆ self_repair_target_

BaseFloat self_repair_target_
private

Definition at line 1392 of file nnet-simple-component.h.


The documentation for this class was generated from the following files:
nnet-simple-component.h
nnet-simple-component.cc