NaturalGradientAffineComponent Class Reference

#include <nnet-simple-component.h>

Inheritance diagram for NaturalGradientAffineComponent:
Collaboration diagram for NaturalGradientAffineComponent:

Public Member Functions

virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
 NaturalGradientAffineComponent ()
 
void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual void Scale (BaseFloat scale)
 This virtual function when called on – an UpdatableComponent scales the parameters by "scale" when called by an UpdatableComponent. More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 This virtual function, when called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void FreezeNaturalGradient (bool freeze)
 freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More...
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation. More...
 
 NaturalGradientAffineComponent (const NaturalGradientAffineComponent &other)
 
 NaturalGradientAffineComponent (const CuMatrixBase< BaseFloat > &linear_params, const CuVectorBase< BaseFloat > &bias_params)
 
- Public Member Functions inherited from AffineComponent
virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
BaseFloat OrthonormalConstraint () const
 
 AffineComponent ()
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void PerturbParams (BaseFloat stddev)
 This function is to be used in testing. More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Computes dot-product between parameters of two instances of a Component. More...
 
virtual int32 NumParameters () const
 Returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
virtual void SetParams (const CuVectorBase< BaseFloat > &bias, const CuMatrixBase< BaseFloat > &linear)
 
const CuVector< BaseFloat > & BiasParams () const
 
CuVector< BaseFloat > & BiasParams ()
 
const CuMatrix< BaseFloat > & LinearParams () const
 
CuMatrix< BaseFloat > & LinearParams ()
 
 AffineComponent (const AffineComponent &other)
 
 AffineComponent (const CuMatrixBase< BaseFloat > &linear_params, const CuVectorBase< BaseFloat > &bias_params, BaseFloat learning_rate)
 
virtual void Resize (int32 input_dim, int32 output_dim)
 
void Init (int32 input_dim, int32 output_dim, BaseFloat param_stddev, BaseFloat bias_stddev)
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
virtual void SetUnderlyingLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent; the value gets multiplied by learning_rate_factor_. More...
 
virtual void SetActualLearningRate (BaseFloat lrate)
 Sets the learning rate directly, bypassing learning_rate_factor_. More...
 
virtual void SetAsGradient ()
 Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...
 
virtual BaseFloat LearningRateFactor ()
 
virtual void SetLearningRateFactor (BaseFloat lrate_factor)
 
void SetUpdatableConfigs (const UpdatableComponent &other)
 
BaseFloat LearningRate () const
 Gets the learning rate to be used in gradient descent. More...
 
BaseFloat MaxChange () const
 Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More...
 
void SetMaxChange (BaseFloat max_change)
 
BaseFloat L2Regularization () const
 Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More...
 
void SetL2Regularization (BaseFloat a)
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overridden by Components that return a non-NULL memo from their Propagate() function. More...
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

NaturalGradientAffineComponent & operator= (const NaturalGradientAffineComponent &)
 
virtual void Update (const std::string &debug_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 

Private Attributes

OnlineNaturalGradient preconditioner_in_
 
OnlineNaturalGradient preconditioner_out_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from AffineComponent
void Init (std::string matrix_filename)
 
virtual void UpdateSimple (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
const AffineComponent & operator= (const AffineComponent &other)
 
- Protected Member Functions inherited from UpdatableComponent
void InitLearningRatesFromConfig (ConfigLine *cfl)
 
std::string ReadUpdatableCommon (std::istream &is, bool binary)
 
void WriteUpdatableCommon (std::ostream &is, bool binary) const
 
- Protected Attributes inherited from AffineComponent
CuMatrix< BaseFloat > linear_params_
 
CuVector< BaseFloat > bias_params_
 
BaseFloat orthonormal_constraint_
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (typically 0.0..0.01) More...
 
BaseFloat learning_rate_factor_
 learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More...
 
BaseFloat l2_regularize_
 L2 regularization constant. More...
 
bool is_gradient_
 True if this component is to be treated as a gradient rather than as parameters. More...
 
BaseFloat max_change_
 configuration value for imposing max-change More...
 

Detailed Description

An affine component whose parameter update uses the online natural-gradient preconditioning implemented by OnlineNaturalGradient: the Update() function preconditions both the input values and the output derivatives (via preconditioner_in_ and preconditioner_out_) before taking the gradient step.

Definition at line 825 of file nnet-simple-component.h.

Constructor & Destructor Documentation

◆ NaturalGradientAffineComponent() [1/3]

◆ NaturalGradientAffineComponent() [2/3]

NaturalGradientAffineComponent ( const NaturalGradientAffineComponent &  other )

Definition at line 2987 of file nnet-simple-component.cc.

2988  :
2989  AffineComponent(other),
2990  preconditioner_in_(other.preconditioner_in_),
2991  preconditioner_out_(other.preconditioner_out_) { }

◆ NaturalGradientAffineComponent() [3/3]

NaturalGradientAffineComponent ( const CuMatrixBase< BaseFloat > &  linear_params,
const CuVectorBase< BaseFloat > &  bias_params 
)

Definition at line 2853 of file nnet-simple-component.cc.

References CuVectorBase< Real >::Dim(), KALDI_ASSERT, CuMatrixBase< Real >::NumRows(), NaturalGradientAffineComponent::preconditioner_in_, NaturalGradientAffineComponent::preconditioner_out_, OnlineNaturalGradient::SetRank(), and OnlineNaturalGradient::SetUpdatePeriod().

2855  :
2856  AffineComponent(linear_params, bias_params, 0.001) {
2857  KALDI_ASSERT(bias_params.Dim() == linear_params.NumRows() &&
2858  bias_params.Dim() != 0);
2859 
2860  // set some default natural gradient configs.
2861   preconditioner_in_.SetRank(20);
2862   preconditioner_out_.SetRank(80);
2863   preconditioner_in_.SetUpdatePeriod(4);
2864   preconditioner_out_.SetUpdatePeriod(4);
2865 }
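A minimal usage sketch of this constructor follows; the dimensions, the random initialization, and the follow-up call are invented for illustration, while the constructor signature, the Dim()/NumRows() requirement, and the 0.001 learning rate passed to the AffineComponent base come from the code above.

    // Hypothetical sizes: 512-dimensional input, 256-dimensional output.
    CuMatrix<BaseFloat> linear(256, 512);   // rows = output-dim, cols = input-dim
    linear.SetRandn();                      // fill with random values, much as InitFromConfig() does
    CuVector<BaseFloat> bias(256);          // must satisfy bias.Dim() == linear.NumRows()
    NaturalGradientAffineComponent component(linear, bias);
    // The base-class constructor is invoked with a learning rate of 0.001;
    // it can be changed afterwards, e.g. via SetUnderlyingLearningRate().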

Member Function Documentation

◆ Add()

void Add ( BaseFloat  alpha,
const Component other 
)
virtual

This virtual function, when called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters.

When called on a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it relates to adding stats. Otherwise it will normally do nothing.

Reimplemented from AffineComponent.

Definition at line 3049 of file nnet-simple-component.cc.

References AffineComponent::bias_params_, KALDI_ASSERT, and AffineComponent::linear_params_.

3049  {
3050  const NaturalGradientAffineComponent *other =
3051  dynamic_cast<const NaturalGradientAffineComponent*>(&other_in);
3052  KALDI_ASSERT(other != NULL);
3053  linear_params_.AddMat(alpha, other->linear_params_);
3054  bias_params_.AddVec(alpha, other->bias_params_);
3055 }
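Since Scale() multiplies the parameters by a constant and Add() accumulates another component's parameters times a constant, the two together support simple parameter averaging. A hedged sketch (the helper name and the 0.5 weights are invented; only Scale() and Add() come from this page):

    // Averages the parameters of 'b' into 'a'; both components must have the
    // same dimensions.  (Hypothetical helper, not part of Kaldi.)
    void AverageInto(NaturalGradientAffineComponent *a,
                     const NaturalGradientAffineComponent &b) {
      a->Scale(0.5);    // a := 0.5 * a
      a->Add(0.5, b);   // a := a + 0.5 * b
    }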

◆ ConsolidateMemory()

void ConsolidateMemory ( )
virtual

This virtual function relates to memory management, and avoiding fragmentation.

It is called only once per model, after we do the first minibatch of training. The default implementation does nothing, but it can be overridden by child classes, where it may re-initialize certain quantities that may possibly have been allocated during the forward pass (e.g. certain statistics; OnlineNaturalGradient objects). We use our own CPU-based allocator (see cu-allocator.h), and since it can't do paging (we're not in control of the GPU page table), fragmentation can be a problem. The allocator always tries to put things in 'low-address memory' (i.e. at smaller memory addresses) near the beginning of the block it allocated, to avoid fragmentation; but if permanent things (belonging to the model) are allocated in the forward pass, they can permanently stay in high memory. This function helps to prevent that, by re-allocating those things into low-address memory. (It's important that it's called after all the temporary buffers for the forward-backward have been freed, so that there is low-address memory available.)

Reimplemented from Component.

Definition at line 3062 of file nnet-simple-component.cc.

References NaturalGradientAffineComponent::preconditioner_in_, NaturalGradientAffineComponent::preconditioner_out_, and OnlineNaturalGradient::Swap().

3062  {
3063  OnlineNaturalGradient temp_in(preconditioner_in_);
3064  preconditioner_in_.Swap(&temp_in);
3065  OnlineNaturalGradient temp_out(preconditioner_out_);
3066  preconditioner_out_.Swap(&temp_out);
3067 }

◆ Copy()

Component * Copy ( ) const
virtual

Copies component (deep copy).

Reimplemented from AffineComponent.

Definition at line 2983 of file nnet-simple-component.cc.

References NaturalGradientAffineComponent::NaturalGradientAffineComponent().

2983  {
2984  return new NaturalGradientAffineComponent(*this);
2985 }

◆ FreezeNaturalGradient()

void FreezeNaturalGradient ( bool  freeze)
virtual

freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient).

Reimplemented from UpdatableComponent.

Definition at line 3057 of file nnet-simple-component.cc.

References OnlineNaturalGradient::Freeze(), NaturalGradientAffineComponent::preconditioner_in_, and NaturalGradientAffineComponent::preconditioner_out_.

◆ Info()

std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from AffineComponent.

Definition at line 2972 of file nnet-simple-component.cc.

References OnlineNaturalGradient::GetAlpha(), OnlineNaturalGradient::GetNumSamplesHistory(), OnlineNaturalGradient::GetRank(), OnlineNaturalGradient::GetUpdatePeriod(), AffineComponent::Info(), NaturalGradientAffineComponent::preconditioner_in_, and NaturalGradientAffineComponent::preconditioner_out_.

2972  {
2973  std::ostringstream stream;
2974  stream << AffineComponent::Info();
2975  stream << ", rank-in=" << preconditioner_in_.GetRank()
2976  << ", rank-out=" << preconditioner_out_.GetRank()
2977  << ", num-samples-history=" << preconditioner_in_.GetNumSamplesHistory()
2978  << ", update-period=" << preconditioner_in_.GetUpdatePeriod()
2979  << ", alpha=" << preconditioner_in_.GetAlpha();
2980  return stream.str();
2981 }

◆ InitFromConfig()

void InitFromConfig ( ConfigLine cfl)
virtual

Initialize, from a ConfigLine object.

Parameters
[in]  cfl   A ConfigLine containing any parameters that are needed for initialization. For example: "dim=100 param-stddev=0.1"

Reimplemented from AffineComponent.

Definition at line 2867 of file nnet-simple-component.cc.

References AffineComponent::bias_params_, ConfigLine::GetValue(), ConfigLine::HasUnusedValues(), UpdatableComponent::InitLearningRatesFromConfig(), AffineComponent::InputDim(), UpdatableComponent::is_gradient_, KALDI_ASSERT, KALDI_ERR, AffineComponent::linear_params_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), AffineComponent::orthonormal_constraint_, AffineComponent::OutputDim(), NaturalGradientAffineComponent::preconditioner_in_, NaturalGradientAffineComponent::preconditioner_out_, CuMatrixBase< Real >::Range(), kaldi::ReadKaldiObject(), OnlineNaturalGradient::SetAlpha(), OnlineNaturalGradient::SetNumSamplesHistory(), OnlineNaturalGradient::SetRank(), OnlineNaturalGradient::SetUpdatePeriod(), ConfigLine::UnusedValues(), and ConfigLine::WholeLine().

2867  {
2868  bool ok = true;
2869  std::string matrix_filename;
2870 
2871  is_gradient_ = false; // not configurable; there's no reason you'd want this
2872 
2873  InitLearningRatesFromConfig(cfl);
2874 
2875  if (cfl->GetValue("matrix", &matrix_filename)) {
2876  CuMatrix<BaseFloat> mat;
2877  ReadKaldiObject(matrix_filename, &mat); // will abort on failure.
2878  KALDI_ASSERT(mat.NumCols() >= 2);
2879  int32 input_dim = mat.NumCols() - 1, output_dim = mat.NumRows();
2880  linear_params_.Resize(output_dim, input_dim);
2881  bias_params_.Resize(output_dim);
2882  linear_params_.CopyFromMat(mat.Range(0, output_dim, 0, input_dim));
2883  bias_params_.CopyColFromMat(mat, input_dim);
2884  if (cfl->GetValue("input-dim", &input_dim))
2885  KALDI_ASSERT(input_dim == InputDim() &&
2886  "input-dim mismatch vs. matrix.");
2887  if (cfl->GetValue("output-dim", &output_dim))
2888  KALDI_ASSERT(output_dim == OutputDim() &&
2889  "output-dim mismatch vs. matrix.");
2890  } else {
2891  int32 input_dim = -1, output_dim = -1;
2892 
2893  ok = ok && cfl->GetValue("input-dim", &input_dim);
2894  ok = ok && cfl->GetValue("output-dim", &output_dim);
2895  if (!ok)
2896  KALDI_ERR << "Bad initializer " << cfl->WholeLine();
2897  BaseFloat param_stddev = 1.0 / std::sqrt(input_dim),
2898  bias_stddev = 1.0, bias_mean = 0.0;
2899  cfl->GetValue("param-stddev", &param_stddev);
2900  cfl->GetValue("bias-stddev", &bias_stddev);
2901  cfl->GetValue("bias-mean", &bias_mean);
2902  linear_params_.Resize(output_dim, input_dim);
2903  bias_params_.Resize(output_dim);
2904  KALDI_ASSERT(output_dim > 0 && input_dim > 0 && param_stddev >= 0.0 &&
2905  bias_stddev >= 0.0);
2906  linear_params_.SetRandn(); // sets to random normally distributed noise.
2907  linear_params_.Scale(param_stddev);
2908  bias_params_.SetRandn();
2909  bias_params_.Scale(bias_stddev);
2910  bias_params_.Add(bias_mean);
2911  }
2912 
2913  orthonormal_constraint_ = 0.0;
2914  cfl->GetValue("orthonormal-constraint", &orthonormal_constraint_);
2915 
2916  // Set natural-gradient configs.
2917  BaseFloat num_samples_history = 2000.0,
2918  alpha = 4.0;
2919  int32 rank_in = -1, rank_out = -1,
2920  update_period = 4;
2921  cfl->GetValue("num-samples-history", &num_samples_history);
2922  cfl->GetValue("alpha", &alpha);
2923  cfl->GetValue("rank-in", &rank_in);
2924  cfl->GetValue("rank-out", &rank_out);
2925  cfl->GetValue("update-period", &update_period);
2926 
2927  if (rank_in < 0)
2928  rank_in = std::min<int32>(20, (InputDim() + 1) / 2);
2929  if (rank_out < 0)
2930  rank_out = std::min<int32>(80, (OutputDim() + 1) / 2);
2931 
2932  preconditioner_in_.SetNumSamplesHistory(num_samples_history);
2933  preconditioner_out_.SetNumSamplesHistory(num_samples_history);
2934  preconditioner_in_.SetAlpha(alpha);
2935  preconditioner_out_.SetAlpha(alpha);
2936  preconditioner_in_.SetRank(rank_in);
2937  preconditioner_out_.SetRank(rank_out);
2938  preconditioner_in_.SetUpdatePeriod(update_period);
2939  preconditioner_out_.SetUpdatePeriod(update_period);
2940 
2941  if (cfl->HasUnusedValues())
2942  KALDI_ERR << "Could not process these elements in initializer: "
2943  << cfl->UnusedValues();
2944  if (!ok)
2945  KALDI_ERR << "Bad initializer " << cfl->WholeLine();
2946 }

◆ operator=()

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Reimplemented from AffineComponent.

Definition at line 2786 of file nnet-simple-component.cc.

References kaldi::nnet3::ExpectToken(), UpdatableComponent::is_gradient_, KALDI_ERR, kaldi::PeekToken(), kaldi::ReadBasicType(), kaldi::ReadToken(), and UpdatableComponent::ReadUpdatableCommon().

2786  {
2787  ReadUpdatableCommon(is, binary); // Read the opening tag and learning rate
2788  ExpectToken(is, binary, "<LinearParams>");
2789  linear_params_.Read(is, binary);
2790  ExpectToken(is, binary, "<BiasParams>");
2791  bias_params_.Read(is, binary);
2792 
2793  BaseFloat num_samples_history, alpha;
2794  int32 rank_in, rank_out, update_period;
2795 
2796  ExpectToken(is, binary, "<RankIn>");
2797  ReadBasicType(is, binary, &rank_in);
2798  ExpectToken(is, binary, "<RankOut>");
2799  ReadBasicType(is, binary, &rank_out);
2800  if (PeekToken(is, binary) == 'O') {
2801  ExpectToken(is, binary, "<OrthonormalConstraint>");
2802  ReadBasicType(is, binary, &orthonormal_constraint_);
2803  } else {
2804  orthonormal_constraint_ = 0.0;
2805  }
2806  ExpectToken(is, binary, "<UpdatePeriod>");
2807  ReadBasicType(is, binary, &update_period);
2808  ExpectToken(is, binary, "<NumSamplesHistory>");
2809  ReadBasicType(is, binary, &num_samples_history);
2810  ExpectToken(is, binary, "<Alpha>");
2811  ReadBasicType(is, binary, &alpha);
2812 
2813  preconditioner_in_.SetNumSamplesHistory(num_samples_history);
2814  preconditioner_out_.SetNumSamplesHistory(num_samples_history);
2815  preconditioner_in_.SetAlpha(alpha);
2816  preconditioner_out_.SetAlpha(alpha);
2817  preconditioner_in_.SetRank(rank_in);
2818  preconditioner_out_.SetRank(rank_out);
2819  preconditioner_in_.SetUpdatePeriod(update_period);
2820  preconditioner_out_.SetUpdatePeriod(update_period);
2821 
2822  if (PeekToken(is, binary) == 'M') {
2823  // MaxChangePerSample, long ago removed; back compatibility.
2824  ExpectToken(is, binary, "<MaxChangePerSample>");
2825  BaseFloat temp;
2826  ReadBasicType(is, binary, &temp);
2827  }
2828  if (PeekToken(is, binary) == 'I') {
2829  // for back compatibility; we don't write this here any
2830  // more as it's written and read in Write/ReadUpdatableCommon
2831  ExpectToken(is, binary, "<IsGradient>");
2832  ReadBasicType(is, binary, &is_gradient_);
2833  }
2834  if (PeekToken(is, binary) == 'U') {
2835  ExpectToken(is, binary, "<UpdateCount>");
2836  // back-compatibility branch (these configs were added and then removed).
2837  double temp;
2838  ReadBasicType(is, binary, &temp);
2839  ExpectToken(is, binary, "<ActiveScalingCount>");
2840  ReadBasicType(is, binary, &temp);
2841  ExpectToken(is, binary, "<MaxChangeScaleStats>");
2842  ReadBasicType(is, binary, &temp);
2843  }
2844  std::string token;
2845  ReadToken(is, binary, &token);
2846  // the following has to handle a couple variants of NaturalGradientAffineComponent.
2847  if (token.find("NaturalGradientAffineComponent>") == std::string::npos)
2848  KALDI_ERR << "Expected <NaturalGradientAffineComponent> or "
2849  << "</NaturalGradientAffineComponent>, got " << token;
2850 }

◆ Scale()

void Scale ( BaseFloat  scale)
virtual

This virtual function, when called on an UpdatableComponent, scales the parameters by "scale".

When called on a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it relates to scaling activation stats, not parameters. Otherwise it will normally do nothing.

Reimplemented from AffineComponent.

Definition at line 3039 of file nnet-simple-component.cc.

References AffineComponent::bias_params_, and AffineComponent::linear_params_.

3039  {
3040  if (scale == 0.0) {
3041  linear_params_.SetZero();
3042  bias_params_.SetZero();
3043  } else {
3044  linear_params_.Scale(scale);
3045  bias_params_.Scale(scale);
3046  }
3047 }

◆ Type()

virtual std::string Type ( ) const
inlinevirtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Reimplemented from AffineComponent.

Definition at line 827 of file nnet-simple-component.h.


827 { return "NaturalGradientAffineComponent"; }

◆ Update()

void Update ( const std::string &  debug_info,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv 
)
privatevirtual

Reimplemented from AffineComponent.

Definition at line 2993 of file nnet-simple-component.cc.

References AffineComponent::bias_params_, CuVectorBase< Real >::CopyColFromMat(), kaldi::kNoTrans, kaldi::kTrans, kaldi::kUndefined, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), OnlineNaturalGradient::PreconditionDirections(), NaturalGradientAffineComponent::preconditioner_in_, NaturalGradientAffineComponent::preconditioner_out_, CuMatrixBase< Real >::Range(), and CuMatrix< Real >::Resize().

2996  {
2997  CuMatrix<BaseFloat> in_value_temp;
2998 
2999  in_value_temp.Resize(in_value.NumRows(),
3000  in_value.NumCols() + 1, kUndefined);
3001  in_value_temp.Range(0, in_value.NumRows(),
3002  0, in_value.NumCols()).CopyFromMat(in_value);
3003 
3004  // Add the 1.0 at the end of each row of "in_value_temp".
3005  in_value_temp.Range(0, in_value.NumRows(),
3006  in_value.NumCols(), 1).Set(1.0);
3007 
3008  CuMatrix<BaseFloat> out_deriv_temp(out_deriv);
3009 
3010  // These "scale" values will get multiplied into the learning rate (faster
3011  // than having the matrices scaled inside the preconditioning code).
3012  BaseFloat in_scale, out_scale;
3013 
3014  preconditioner_in_.PreconditionDirections(&in_value_temp, &in_scale);
3015  preconditioner_out_.PreconditionDirections(&out_deriv_temp, &out_scale);
3016 
3017  // "scale" is a scaling factor coming from the PreconditionDirections calls
3018  // (it's faster to have them output a scaling factor than to have them scale
3019  // their outputs).
3020  BaseFloat scale = in_scale * out_scale;
3021 
3022  CuSubMatrix<BaseFloat> in_value_precon_part(in_value_temp,
3023  0, in_value_temp.NumRows(),
3024  0, in_value_temp.NumCols() - 1);
3025  // this "precon_ones" is what happens to the vector of 1's representing
3026  // offsets, after multiplication by the preconditioner.
3027  CuVector<BaseFloat> precon_ones(in_value_temp.NumRows());
3028 
3029  precon_ones.CopyColFromMat(in_value_temp, in_value_temp.NumCols() - 1);
3030 
3031  BaseFloat local_lrate = scale * learning_rate_;
3032 
3033  bias_params_.AddMatVec(local_lrate, out_deriv_temp, kTrans,
3034  precon_ones, 1.0);
3035  linear_params_.AddMatMat(local_lrate, out_deriv_temp, kTrans,
3036  in_value_precon_part, kNoTrans, 1.0);
3037 }
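In equation form (this is just a paraphrase of the code above, not an independent derivation): write X' for the preconditioned [in_value, 1] matrix, G' for the preconditioned out_deriv, s_in and s_out for the scales returned by PreconditionDirections(), eta for learning_rate_, and D for the input dimension. Then

    \Delta W = \eta \, s_{in} s_{out} \, G'^{\top} X'_{:,1:D}
    \Delta b = \eta \, s_{in} s_{out} \, G'^{\top} X'_{:,D+1}

where columns 1..D of X' are in_value_precon_part and column D+1 is precon_ones.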

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Reimplemented from AffineComponent.

Definition at line 2948 of file nnet-simple-component.cc.

References AffineComponent::bias_params_, OnlineNaturalGradient::GetAlpha(), OnlineNaturalGradient::GetNumSamplesHistory(), OnlineNaturalGradient::GetRank(), OnlineNaturalGradient::GetUpdatePeriod(), AffineComponent::linear_params_, AffineComponent::orthonormal_constraint_, NaturalGradientAffineComponent::preconditioner_in_, NaturalGradientAffineComponent::preconditioner_out_, kaldi::WriteBasicType(), kaldi::WriteToken(), and UpdatableComponent::WriteUpdatableCommon().

2949  {
2950  WriteUpdatableCommon(os, binary); // Write the opening tag and learning rate
2951  WriteToken(os, binary, "<LinearParams>");
2952  linear_params_.Write(os, binary);
2953  WriteToken(os, binary, "<BiasParams>");
2954  bias_params_.Write(os, binary);
2955  WriteToken(os, binary, "<RankIn>");
2956  WriteBasicType(os, binary, preconditioner_in_.GetRank());
2957  WriteToken(os, binary, "<RankOut>");
2958  WriteBasicType(os, binary, preconditioner_out_.GetRank());
2959  if (orthonormal_constraint_ != 0.0) {
2960  WriteToken(os, binary, "<OrthonormalConstraint>");
2961  WriteBasicType(os, binary, orthonormal_constraint_);
2962  }
2963  WriteToken(os, binary, "<UpdatePeriod>");
2964  WriteBasicType(os, binary, preconditioner_in_.GetUpdatePeriod());
2965  WriteToken(os, binary, "<NumSamplesHistory>");
2966  WriteBasicType(os, binary, preconditioner_in_.GetNumSamplesHistory());
2967  WriteToken(os, binary, "<Alpha>");
2968  WriteBasicType(os, binary, preconditioner_in_.GetAlpha());
2969  WriteToken(os, binary, "</NaturalGradientAffineComponent>");
2970 }
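For orientation, the token sequence produced by this Write() (and expected by Read() above) is, in order, with "..." standing for the corresponding values:

    (opening tag and learning rate, written by WriteUpdatableCommon())
    <LinearParams> ... <BiasParams> ...
    <RankIn> ... <RankOut> ...
    [<OrthonormalConstraint> ...]      (only written if orthonormal_constraint_ != 0.0)
    <UpdatePeriod> ... <NumSamplesHistory> ... <Alpha> ...
    </NaturalGradientAffineComponent>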

Member Data Documentation

◆ preconditioner_in_

◆ preconditioner_out_


The documentation for this class was generated from the following files:

nnet-simple-component.h
nnet-simple-component.cc