NaturalGradientPerElementScaleComponent Class Reference

NaturalGradientPerElementScaleComponent is like PerElementScaleComponent but it uses a natural gradient update for the per-element scales. More...

#include <nnet-simple-component.h>


Public Member Functions

virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
 NaturalGradientPerElementScaleComponent ()
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual void FreezeNaturalGradient (bool freeze)
 Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
 NaturalGradientPerElementScaleComponent (const NaturalGradientPerElementScaleComponent &other)
 
void Init (int32 dim, BaseFloat param_mean, BaseFloat param_stddev, int32 rank, int32 update_period, BaseFloat num_samples_history, BaseFloat alpha)
 
void Init (std::string vector_filename, int32 rank, int32 update_period, BaseFloat num_samples_history, BaseFloat alpha)
 
void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation. More...
 
- Public Member Functions inherited from PerElementScaleComponent
virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
 PerElementScaleComponent ()
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void Scale (BaseFloat scale)
 This virtual function, when called on an UpdatableComponent, scales the parameters by "scale". More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 This virtual function, when called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void PerturbParams (BaseFloat stddev)
 This function is to be used in testing. More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Computes dot-product between parameters of two instances of a Component. More...
 
virtual int32 NumParameters () const
 Returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
 PerElementScaleComponent (const PerElementScaleComponent &other)
 
void Init (int32 dim, BaseFloat param_mean, BaseFloat param_stddev)
 
void Init (std::string vector_filename)
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
virtual void SetUnderlyingLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent; the value gets multiplied by learning_rate_factor_. More...
 
virtual void SetActualLearningRate (BaseFloat lrate)
 Sets the learning rate directly, bypassing learning_rate_factor_. More...
 
virtual void SetAsGradient ()
 Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...
 
virtual BaseFloat LearningRateFactor ()
 
virtual void SetLearningRateFactor (BaseFloat lrate_factor)
 
void SetUpdatableConfigs (const UpdatableComponent &other)
 
BaseFloat LearningRate () const
 Gets the learning rate to be used in gradient descent. More...
 
BaseFloat MaxChange () const
 Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More...
 
void SetMaxChange (BaseFloat max_change)
 
BaseFloat L2Regularization () const
 Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More...
 
void SetL2Regularization (BaseFloat a)
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overridden by Components that return a non-NULL memo from their Propagate() function. More...
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

virtual void Update (const std::string &debug_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
const NaturalGradientPerElementScaleComponent & operator= (const NaturalGradientPerElementScaleComponent &other)
 

Private Attributes

OnlineNaturalGradient preconditioner_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from PerElementScaleComponent
virtual void UpdateSimple (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
const PerElementScaleComponent & operator= (const PerElementScaleComponent &other)
 
- Protected Member Functions inherited from UpdatableComponent
void InitLearningRatesFromConfig (ConfigLine *cfl)
 
std::string ReadUpdatableCommon (std::istream &is, bool binary)
 
void WriteUpdatableCommon (std::ostream &is, bool binary) const
 
- Protected Attributes inherited from PerElementScaleComponent
CuVector< BaseFloat > scales_
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (typically 0.0..0.01) More...
 
BaseFloat learning_rate_factor_
 learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More...
 
BaseFloat l2_regularize_
 L2 regularization constant. More...
 
bool is_gradient_
 True if this component is to be treated as a gradient rather than as parameters. More...
 
BaseFloat max_change_
 configuration value for imposing max-change More...
 

Detailed Description

NaturalGradientPerElementScaleComponent is like PerElementScaleComponent but it uses a natural gradient update for the per-element scales.

Accepted values on its config line, with defaults if applicable:

scales If specified, the scales will be read from this file (the value is interpreted as an rxfilename). The source comment calls this value 'vector', but InitFromConfig() as listed in this reference reads it under the key 'scales'.

dim The dimension that this component inputs and outputs. Only required if 'scales' is not specified.

param-mean=1.0 Mean of randomly initialized scale parameters; should only be supplied if 'scales' is not supplied.

param-stddev=0.0 Standard deviation of randomly initialized scale parameters; should only be supplied if 'scales' is not supplied.

And the natural-gradient-related configuration values:

rank=8
update-period=10
num-samples-history=2000.0
alpha=4.0
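The natural-gradient defaults above apply unless the config line supplies a different value. The following is a minimal, self-contained sketch of that "defaults, then override" pattern. It does not use Kaldi's ConfigLine class; NgScaleOptions, ParseKeyValues and OptionsFromLine are hypothetical names introduced only for illustration, and the parser is deliberately simplistic.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical holder for the natural-gradient options and their documented defaults.
struct NgScaleOptions {
  int rank = 8;
  int update_period = 10;
  float num_samples_history = 2000.0f;
  float alpha = 4.0f;
};

// Splits "key=value key=value ..." into a map. Illustration only; not Kaldi's parser.
std::map<std::string, std::string> ParseKeyValues(const std::string &line) {
  std::map<std::string, std::string> kv;
  std::istringstream iss(line);
  std::string item;
  while (iss >> item) {
    std::string::size_type eq = item.find('=');
    assert(eq != std::string::npos && "each item must look like key=value");
    kv[item.substr(0, eq)] = item.substr(eq + 1);
  }
  return kv;
}

// Starts from the defaults and overrides only the keys that are present.
NgScaleOptions OptionsFromLine(const std::string &line) {
  NgScaleOptions opts;
  std::map<std::string, std::string> kv = ParseKeyValues(line);
  if (kv.count("rank")) opts.rank = std::stoi(kv["rank"]);
  if (kv.count("update-period")) opts.update_period = std::stoi(kv["update-period"]);
  if (kv.count("num-samples-history"))
    opts.num_samples_history = std::stof(kv["num-samples-history"]);
  if (kv.count("alpha")) opts.alpha = std::stof(kv["alpha"]);
  return opts;
}
```

For example, OptionsFromLine("dim=100 rank=4") would leave update_period, num_samples_history and alpha at their defaults and change only rank.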

Definition at line 1766 of file nnet-simple-component.h.

Constructor & Destructor Documentation

◆ NaturalGradientPerElementScaleComponent() [1/2]

Definition at line 1773 of file nnet-simple-component.h.

Referenced by NaturalGradientPerElementScaleComponent::Copy().

1773 { } // use Init to really initialize.

◆ NaturalGradientPerElementScaleComponent() [2/2]

Member Function Documentation

◆ ConsolidateMemory()

void ConsolidateMemory ( )
virtual

This virtual function relates to memory management, and avoiding fragmentation.

It is called only once per model, after the first minibatch of training. The default implementation does nothing, but child classes can override it to re-initialize quantities that may have been allocated during the forward pass (e.g. certain statistics; OnlineNaturalGradient objects). We use our own CPU-based allocator (see cu-allocator.h), and because it cannot do paging (we are not in control of the GPU page table), fragmentation can be a problem. The allocator always tries to put things in 'low-address memory' (i.e. at smaller memory addresses) near the beginning of the block it allocated, to avoid fragmentation; but if permanent things (belonging to the model) are allocated in the forward pass, they can permanently stay in high memory. This function helps to prevent that by re-allocating those things into low-address memory. It is important that it is called after all the temporary buffers for the forward-backward pass have been freed, so that low-address memory is available.

Reimplemented from Component.

Definition at line 3962 of file nnet-simple-component.cc.

References NaturalGradientPerElementScaleComponent::preconditioner_, and OnlineNaturalGradient::Swap().

3962  {
3963  OnlineNaturalGradient temp(preconditioner_);
3964  preconditioner_.Swap(&temp);
3965 }
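The implementation above re-creates the preconditioner's storage by copy-constructing a temporary and swapping it in. A minimal sketch of the same copy-and-swap idiom, using std::vector as a stand-in for OnlineNaturalGradient, is shown here. ReallocateBySwap is a hypothetical helper, and the general-purpose heap used by std::vector makes no promise about low addresses; the sketch only shows that the object ends up owning a freshly allocated buffer with the same contents.

```cpp
#include <cassert>
#include <vector>

// Re-creates the storage of *v by copy-constructing a temporary and swapping.
// Returns true if *v now points at a different buffer than before.
bool ReallocateBySwap(std::vector<float> *v) {
  assert(v != nullptr);
  const float *old_data = v->data();
  std::vector<float> temp(*v);  // fresh allocation holding the same values
  v->swap(temp);                // *v now owns the fresh allocation
  // Both buffers are still alive here, so their addresses can be compared.
  return v->data() != old_data;
}  // temp, which now holds the old buffer, is destroyed here
```

In the Kaldi case the benefit comes from when the copy is made: once the temporary forward-backward buffers have been freed, the new allocation can land in low-address memory.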

◆ Copy()

Component * Copy ( ) const
virtual

◆ FreezeNaturalGradient()

void FreezeNaturalGradient ( bool  freeze)
virtual

Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient).

Reimplemented from UpdatableComponent.

Definition at line 3958 of file nnet-simple-component.cc.

References OnlineNaturalGradient::Freeze(), and NaturalGradientPerElementScaleComponent::preconditioner_.

◆ Info()

std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from PerElementScaleComponent.

Definition at line 3858 of file nnet-simple-component.cc.

References PerElementScaleComponent::Info().

Referenced by CompositeComponent::Info().

3858  {
3859  std::ostringstream stream;
3860  stream << PerElementScaleComponent::Info()
3861  << ", rank=" << preconditioner_.GetRank()
3862  << ", update-period=" << preconditioner_.GetUpdatePeriod()
3863  << ", num-samples-history=" << preconditioner_.GetNumSamplesHistory()
3864  << ", alpha=" << preconditioner_.GetAlpha();
3865  return stream.str();
3866 }
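Info() extends the base-class string rather than rebuilding it. The sketch below illustrates that pattern with two hypothetical classes, BaseScale and NgScale, which are not part of Kaldi; the fields printed are a reduced set chosen for brevity.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the base class; prints only type and dimension.
struct BaseScale {
  int dim = 0;
  virtual ~BaseScale() {}
  virtual std::string Type() const { return "BaseScale"; }
  virtual std::string Info() const {
    assert(dim >= 0);
    std::ostringstream s;
    s << Type() << ", dim=" << dim;  // Type() is virtual, so the derived name is used
    return s.str();
  }
};

// Hypothetical derived class that appends its own fields to the base Info().
struct NgScale : public BaseScale {
  int rank = 8;
  float alpha = 4.0f;
  std::string Type() const override { return "NgScale"; }
  std::string Info() const override {
    std::ostringstream s;
    s << BaseScale::Info() << ", rank=" << rank << ", alpha=" << alpha;
    return s.str();
  }
};
```

Because the base Info() calls the virtual Type(), the derived class's name still appears at the start of the string.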

◆ Init() [1/2]

void Init ( int32  dim,
BaseFloat  param_mean,
BaseFloat  param_stddev,
int32  rank,
int32  update_period,
BaseFloat  num_samples_history,
BaseFloat  alpha 
)

Definition at line 3904 of file nnet-simple-component.cc.

References PerElementScaleComponent::Init().

Referenced by PermuteComponent::InitFromConfig(), CompositeComponent::InitFromConfig(), and CompositeComponent::Read().

3907  {
3908  PerElementScaleComponent::Init(dim, param_mean,
3909  param_stddev);
3910  preconditioner_.SetRank(rank);
3911  preconditioner_.SetUpdatePeriod(update_period);
3912  preconditioner_.SetNumSamplesHistory(num_samples_history);
3913  preconditioner_.SetAlpha(alpha);
3914 }

◆ Init() [2/2]

void Init ( std::string  vector_filename,
int32  rank,
int32  update_period,
BaseFloat  num_samples_history,
BaseFloat  alpha 
)

Definition at line 3916 of file nnet-simple-component.cc.

References PerElementScaleComponent::Init().

3919  {
3920  PerElementScaleComponent::Init(vector_filename);
3921  preconditioner_.SetRank(rank);
3922  preconditioner_.SetUpdatePeriod(update_period);
3923  preconditioner_.SetNumSamplesHistory(num_samples_history);
3924  preconditioner_.SetAlpha(alpha);
3925 }

◆ InitFromConfig()

void InitFromConfig ( ConfigLine * cfl)
virtual

Initialize, from a ConfigLine object.

Parameters
[in] cfl A ConfigLine containing any parameters that are needed for initialization. For example: "dim=100 param-stddev=0.1"

Reimplemented from PerElementScaleComponent.

Definition at line 3868 of file nnet-simple-component.cc.

References ConfigLine::GetValue(), ConfigLine::HasUnusedValues(), FixedAffineComponent::Init(), KALDI_ASSERT, KALDI_ERR, FixedAffineComponent::Type(), and ConfigLine::WholeLine().

3868  {
3869  // First set various configuration values that have defaults.
3870  int32 rank = 8, // Use a small rank because in this case the amount of memory
3871  // for the preconditioner actually exceeds the memory for the
3872  // parameters (by "rank").
3873  update_period = 10;
3874  BaseFloat num_samples_history = 2000.0, alpha = 4.0;
3875  cfl->GetValue("rank", &rank);
3876  cfl->GetValue("update-period", &update_period);
3877  cfl->GetValue("num-samples-history", &num_samples_history);
3878  cfl->GetValue("alpha", &alpha);
3879  InitLearningRatesFromConfig(cfl);
3880  std::string filename;
3881  // Accepts "scales" config (for filename) or "dim" -> random init, for testing.
3882  if (cfl->GetValue("scales", &filename)) {
3883  if (cfl->HasUnusedValues())
3884  KALDI_ERR << "Invalid initializer for layer of type "
3885  << Type() << ": \"" << cfl->WholeLine() << "\"";
3886  Init(filename, rank, update_period, num_samples_history, alpha);
3887 
3888  } else {
3889  BaseFloat param_mean = 1.0, param_stddev = 0.0;
3890  cfl->GetValue("param-mean", &param_mean);
3891  cfl->GetValue("param-stddev", &param_stddev);
3892 
3893  int32 dim;
3894  if (!cfl->GetValue("dim", &dim) || cfl->HasUnusedValues())
3895  KALDI_ERR << "Invalid initializer for layer of type "
3896  << Type() << ": \"" << cfl->WholeLine() << "\"";
3897  KALDI_ASSERT(dim > 0);
3898 
3899  Init(dim, param_mean, param_stddev, rank, update_period,
3900  num_samples_history, alpha);
3901  }
3902 }

◆ operator=()

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Reimplemented from PerElementScaleComponent.

Definition at line 3807 of file nnet-simple-component.cc.

References kaldi::nnet3::ExpectToken(), KALDI_ASSERT, kaldi::ReadBasicType(), and kaldi::ReadToken().

3808  {
3809  ReadUpdatableCommon(is, binary); // Read the opening tag and learning rate
3810  ExpectToken(is, binary, "<Params>");
3811  scales_.Read(is, binary);
3812  ExpectToken(is, binary, "<IsGradient>");
3813  ReadBasicType(is, binary, &is_gradient_);
3814  int32 rank, update_period;
3815  ExpectToken(is, binary, "<Rank>");
3816  ReadBasicType(is, binary, &rank);
3817  preconditioner_.SetRank(rank);
3818  ExpectToken(is, binary, "<UpdatePeriod>");
3819  ReadBasicType(is, binary, &update_period);
3820  preconditioner_.SetUpdatePeriod(update_period);
3821  BaseFloat num_samples_history, alpha;
3822  ExpectToken(is, binary, "<NumSamplesHistory>");
3823  ReadBasicType(is, binary, &num_samples_history);
3824  preconditioner_.SetNumSamplesHistory(num_samples_history);
3825  ExpectToken(is, binary, "<Alpha>");
3826  ReadBasicType(is, binary, &alpha);
3827  preconditioner_.SetAlpha(alpha);
3828  std::string token;
3829  ReadToken(is, binary, &token);
3830  if (token == "<MaxChangePerMinibatch>") {
3831  // back compatibility; this was removed, it's now handled by the
3832  // 'max-change' config variable.
3833  BaseFloat temp;
3834  ReadBasicType(is, binary, &temp);
3835  ReadToken(is, binary, &token);
3836  }
3837  KALDI_ASSERT(token == "</NaturalGradientPerElementScaleComponent>");
3838 }
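The Read() code above expects a fixed sequence of tokens and tolerates one legacy token before the closing tag. The sketch below reproduces that control flow for text-mode input only, using plain iostreams instead of Kaldi's ExpectToken/ReadBasicType/ReadToken. NgFields, ExpectTok and ReadFields are hypothetical names, and the closing tag is passed in as a parameter for the sake of the example.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical plain-struct version of the values stored after the parameters.
struct NgFields {
  int rank = 0;
  int update_period = 0;
  float num_samples_history = 0.0f;
  float alpha = 0.0f;
};

// Reads one whitespace-delimited token and throws if it is not 'expected'.
void ExpectTok(std::istream &is, const std::string &expected) {
  std::string tok;
  is >> tok;
  if (tok != expected)
    throw std::runtime_error("expected " + expected + ", got " + tok);
}

// Reads the four natural-gradient fields, then skips an optional legacy field.
NgFields ReadFields(std::istream &is, const std::string &closing_tag) {
  assert(!closing_tag.empty());
  NgFields f;
  ExpectTok(is, "<Rank>");
  is >> f.rank;
  ExpectTok(is, "<UpdatePeriod>");
  is >> f.update_period;
  ExpectTok(is, "<NumSamplesHistory>");
  is >> f.num_samples_history;
  ExpectTok(is, "<Alpha>");
  is >> f.alpha;
  std::string tok;
  is >> tok;
  if (tok == "<MaxChangePerMinibatch>") {  // legacy field: read and discard
    float ignored = 0.0f;
    is >> ignored;
    is >> tok;
  }
  if (tok != closing_tag)
    throw std::runtime_error("expected " + closing_tag + ", got " + tok);
  return f;
}
```

Reading one token ahead and branching on it is what lets older model files, which still contain the removed field, load without a format version number.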

◆ Type()

virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Reimplemented from PerElementScaleComponent.

Definition at line 1774 of file nnet-simple-component.h.

References Component::ConsolidateMemory(), PnormComponent::Copy(), kaldi::nnet3::FreezeNaturalGradient(), PnormComponent::Init(), PnormComponent::Read(), and PnormComponent::Write().

Referenced by PermuteComponent::Info(), CompositeComponent::Info(), and PermuteComponent::InitFromConfig().

1774  {
1775  return "NaturalGradientPerElementScaleComponent";
1776  }

◆ Update()

void Update ( const std::string &  debug_info,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv 
)
private virtual

Reimplemented from PerElementScaleComponent.

Definition at line 3940 of file nnet-simple-component.cc.

References CuVectorBase< Real >::AddRowSumMat(), UpdatableComponent::learning_rate_, CuMatrixBase< Real >::MulElements(), OnlineNaturalGradient::PreconditionDirections(), NaturalGradientPerElementScaleComponent::preconditioner_, and PerElementScaleComponent::scales_.

3943  {
3944 
3945  CuMatrix<BaseFloat> derivs_per_frame(in_value);
3946  derivs_per_frame.MulElements(out_deriv);
3947  // the non-natural-gradient update would just do
3948  // scales_.AddRowSumMat(learning_rate_, derivs_per_frame).
3949 
3950  BaseFloat scale;
3951  preconditioner_.PreconditionDirections(&derivs_per_frame, &scale);
3952 
3953  CuVector<BaseFloat> delta_scales(scales_.Dim());
3954  delta_scales.AddRowSumMat(scale * learning_rate_, derivs_per_frame);
3955  scales_.AddVec(1.0, delta_scales);
3956 }
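The comment in the code above mentions the non-natural-gradient update, in which each scale moves by the learning rate times the sum over frames of input times output derivative. The sketch below implements only that plain update on nested std::vector matrices. It omits the PreconditionDirections() step and the extra scale factor it returns, and Matrix and PlainScaleUpdate are hypothetical names, not Kaldi types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rows are frames, columns are feature dimensions.
typedef std::vector<std::vector<float> > Matrix;

// Adds learning_rate * sum_t in_value[t][d] * out_deriv[t][d] to each scale d.
void PlainScaleUpdate(const Matrix &in_value, const Matrix &out_deriv,
                      float learning_rate, std::vector<float> *scales) {
  assert(in_value.size() == out_deriv.size());
  for (std::size_t t = 0; t < in_value.size(); ++t) {
    assert(in_value[t].size() == scales->size());
    assert(out_deriv[t].size() == scales->size());
    for (std::size_t d = 0; d < scales->size(); ++d) {
      // For y = scale * x, the per-frame derivative w.r.t. the scale is x * dy.
      (*scales)[d] += learning_rate * in_value[t][d] * out_deriv[t][d];
    }
  }
}
```

In the natural-gradient version, the per-frame products are kept as a matrix so that the preconditioner can transform them before they are summed over frames.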

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Reimplemented from PerElementScaleComponent.

Definition at line 3840 of file nnet-simple-component.cc.

References kaldi::WriteBasicType(), and kaldi::WriteToken().

Referenced by CompositeComponent::Write().

3841  {
3842  WriteUpdatableCommon(os, binary); // Write the opening tag and learning rate
3843  WriteToken(os, binary, "<Params>");
3844  scales_.Write(os, binary);
3845  WriteToken(os, binary, "<IsGradient>");
3846  WriteBasicType(os, binary, is_gradient_);
3847  WriteToken(os, binary, "<Rank>");
3848  WriteBasicType(os, binary, preconditioner_.GetRank());
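3849  WriteToken(os, binary, "<UpdatePeriod>");
3850  WriteBasicType(os, binary, preconditioner_.GetUpdatePeriod());
3851  WriteToken(os, binary, "<NumSamplesHistory>");
3852  WriteBasicType(os, binary, preconditioner_.GetNumSamplesHistory());
3853  WriteToken(os, binary, "<Alpha>");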
3854  WriteBasicType(os, binary, preconditioner_.GetAlpha());
3855  WriteToken(os, binary, "</NaturalGradientPerElementScaleComponent>");
3856 }

Member Data Documentation

◆ preconditioner_


The documentation for this class was generated from the following files:

nnet-simple-component.h
nnet-simple-component.cc