BlockFactorizedTdnnComponent Class Reference

BlockFactorizedTdnnComponent is a modified form of TdnnComponent (from which it inherits) that is inspired by quaternion-based neural networks, but is more general and trainable: the idea is that blocks of parameters are linear functions of a smaller number of parameters, where the linear function itself is trainable. More...

#include <nnet-convolutional-component-temp.h>

Private Member Functions

CuMatrixBase< BaseFloat > & ReducedLinearParams ()
 
CuMatrixBase< BaseFloat > & BlockParams ()
 
int32 InputBlockDim ()
 
int32 ParamsPerBlock ()
 
int32 OutputBlockDim ()
 
int32 NumInputBlocks ()
 
int32 NumOutputBlocks ()
 
void ConvertToIntermediate (const CuMatrixBase< BaseFloat > &linear_params, CuMatrixBase< BaseFloat > *intermediate_params)
 
void ConvertToStandard (const CuMatrixBase< BaseFloat > &intermediate_params, CuMatrixBase< BaseFloat > *linear_params)
 
void CreateIndexes ()
 

Private Attributes

CuArray< int32 > to_standard_indexes_
 
CuArray< int32 > to_intermediate_indexes_
 
CuMatrix< BaseFloat > reduced_linear_params_
 
CuMatrix< BaseFloat > block_params_
 

Additional Inherited Members

- Public Member Functions inherited from TdnnComponent
 TdnnComponent ()
 
 TdnnComponent (const TdnnComponent &other)
 
virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void Scale (BaseFloat scale)
 This virtual function, when called on an UpdatableComponent, scales the parameters by "scale". More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 This virtual function, when called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void PerturbParams (BaseFloat stddev)
 This function is to be used in testing. More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Computes dot-product between parameters of two instances of a Component. More...
 
virtual int32 NumParameters () const
 The following new virtual function returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
virtual void FreezeNaturalGradient (bool freeze)
 Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More...
 
CuMatrixBase< BaseFloat > & LinearParams ()
 
CuVector< BaseFloat > & BiasParams ()
 
BaseFloat OrthonormalConstraint () const
 
void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation. More...
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
virtual void SetUnderlyingLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent; it gets multiplied by learning_rate_factor_. More...
 
virtual void SetActualLearningRate (BaseFloat lrate)
 Sets the learning rate directly, bypassing learning_rate_factor_. More...
 
virtual void SetAsGradient ()
 Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...
 
virtual BaseFloat LearningRateFactor ()
 
virtual void SetLearningRateFactor (BaseFloat lrate_factor)
 
void SetUpdatableConfigs (const UpdatableComponent &other)
 
BaseFloat LearningRate () const
 Gets the learning rate to be used in gradient descent. More...
 
BaseFloat MaxChange () const
 Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More...
 
void SetMaxChange (BaseFloat max_change)
 
BaseFloat L2Regularization () const
 Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More...
 
void SetL2Regularization (BaseFloat a)
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...
 
 Component ()
 
virtual ~Component ()
 
- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from UpdatableComponent
void InitLearningRatesFromConfig (ConfigLine *cfl)
 
std::string ReadUpdatableCommon (std::istream &is, bool binary)
 
void WriteUpdatableCommon (std::ostream &is, bool binary) const
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (typically 0.0..0.01) More...
 
BaseFloat learning_rate_factor_
 learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More...
 
BaseFloat l2_regularize_
 L2 regularization constant. More...
 
bool is_gradient_
 True if this component is to be treated as a gradient rather than as parameters. More...
 
BaseFloat max_change_
 configuration value for imposing max-change More...
 

Detailed Description

BlockFactorizedTdnnComponent is a modified form of TdnnComponent (from which it inherits) that is inspired by quaternion-based neural networks, but is more general and trainable: the idea is that blocks of parameters are linear functions of a smaller number of parameters, where the linear function itself is trainable.

For example, in

Q_{mat} = \begin{bmatrix}
  r & -x & -y & -z \\
  x &  r & -z &  y \\
  y &  z &  r & -x \\
  z & -y &  x &  r
\end{bmatrix}

(where the quaternion is parameterized by r, x, y, z); we can generalize and say that the block has four parameters (r, x, y, z) and that we have a 16x4 learnable matrix that maps from those 4 parameters to the 16 matrix entries. In general there doesn't have to be a correspondence between the number of rows and columns of these blocks and the number of parameters per block, i.e. the 4 rows of the block, the 4 columns of the block and the 4 parameters don't all need to be the same number.
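The mapping described above can be sketched in plain C++. This is a minimal illustration, not Kaldi code: the names `Matrix`, `QuaternionMapping` and `ExpandBlock` are hypothetical, and the row-major flattening of the 4x4 block into 16 entries is an assumption made for the example. In the component the 16x4 matrix would be trainable; here it is fixed to the values that reproduce Q_{mat}.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper types and names; this is not Kaldi code.
using Matrix = std::vector<std::vector<float>>;

// Returns a fixed 16x4 matrix that maps (r, x, y, z) to the 16 entries
// of Q_mat.  Assumption: the 4x4 block is flattened in row-major order,
// so row i of the mapping produces entry (i / 4, i % 4) of the block.
Matrix QuaternionMapping() {
  // For each block entry: which parameter it copies, and with which sign.
  const int index[16] = {0, 1, 2, 3,  1, 0, 3, 2,  2, 3, 0, 1,  3, 2, 1, 0};
  const float sign[16] = {1.0f, -1.0f, -1.0f, -1.0f,
                          1.0f,  1.0f, -1.0f,  1.0f,
                          1.0f,  1.0f,  1.0f, -1.0f,
                          1.0f, -1.0f,  1.0f,  1.0f};
  Matrix m(16, std::vector<float>(4, 0.0f));
  for (std::size_t i = 0; i < 16; ++i) m[i][index[i]] = sign[i];
  return m;
}

// Computes block = mapping * params.  The mapping may have any number of
// rows (block entries) and columns (parameters per block); the two do not
// need to be related.
std::vector<float> ExpandBlock(const Matrix &mapping,
                               const std::vector<float> &params) {
  std::vector<float> block(mapping.size(), 0.0f);
  for (std::size_t i = 0; i < mapping.size(); ++i) {
    assert(mapping[i].size() == params.size());
    for (std::size_t j = 0; j < params.size(); ++j)
      block[i] += mapping[i][j] * params[j];
  }
  return block;
}
```

Calling `ExpandBlock(QuaternionMapping(), {r, x, y, z})` returns the 16 entries of the quaternion matrix; replacing the fixed mapping with learned values gives the generalization the text describes.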

Please understand that TdnnComponent is basically an affine component that happens to have some advantages in terms of memory efficiency when you are doing splicing over time. You can use it as you would use an AffineComponent, and the block factorization doesn't really interact with the splicing over time (or indeed with the bias term), so it might be easier to imagine that we are sub-classing a LinearComponent.
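To make the "affine component with splicing over time" view concrete, here is a naive sketch. It is not how TdnnComponent is implemented: the function name `SplicedAffine` and the container types are hypothetical, and this version explicitly builds the spliced input vector, which is exactly the memory cost the real component is described as avoiding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not Kaldi code.  'frames' holds one feature vector
// per time step; 'linear' has one row per output dimension and
// (feature dim * time_offsets.size()) columns.
std::vector<float> SplicedAffine(
    const std::vector<std::vector<float>> &frames, int t,
    const std::vector<int> &time_offsets,
    const std::vector<std::vector<float>> &linear,
    const std::vector<float> &bias) {
  // Concatenate the frames at t + offset, in the order of the offsets.
  std::vector<float> spliced;
  for (int offset : time_offsets) {
    int s = t + offset;
    assert(s >= 0 && s < static_cast<int>(frames.size()));
    spliced.insert(spliced.end(), frames[s].begin(), frames[s].end());
  }
  // Ordinary affine transform of the spliced vector: out = linear * x + bias.
  assert(linear.size() == bias.size());
  std::vector<float> out(bias);
  for (std::size_t i = 0; i < linear.size(); ++i) {
    assert(linear[i].size() == spliced.size());
    for (std::size_t j = 0; j < spliced.size(); ++j)
      out[i] += linear[i][j] * spliced[j];
  }
  return out;
}
```

The block factorization only changes how the entries of `linear` are produced; the splicing and the bias are untouched, which is why thinking of a plain linear component is sufficient.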

Below we list the config parameters recognized in InitFromConfig(). Many of these are the same as TdnnComponent.

Parameters inherited from UpdatableComponent, via TdnnComponent (see comment above declaration of UpdatableComponent in nnet-component-itf.h for details):

learning-rate, learning-rate-factor, max-change

Parameters inherited from TdnnComponent:

input-dim The input feature dimension (before splicing); the num-cols of linear_params_ is this value times time_offsets_.size().

output-dim The output feature dimension

time-offsets E.g. time-offsets=-1,0,1 or time-offsets=-3,0,3. The time offsets that we require at the input to produce a given output, comparable to the offsets used in TDNNs. They must be unique (no repeats).

use-bias Defaults to true, but set to false if you want this to be linear rather than affine in its input.

orthonormal-constraint=0.0 You can set this to -1 to enable a 'floating' orthonormal constraint on both parameters reduced_linear_params_ and block_params_. Will help stability a lot. (Values >0 are also possible but not recommended).

Initialization parameters inherited from TdnnComponent:

param-stddev Standard deviation of the linear parameters of the convolution. Defaults to sqrt(1.0 / (input-dim * the number of time-offsets)). TODO: document what is done here.

bias-stddev Standard deviation of bias terms. default=0.0. You should not set this if you set use-bias=false.

The natural-gradient related options use-natural-gradient, rank-out, rank-in, num-samples-history are inherited from TdnnComponent; you won't normally have to set these as the defaults are reasonable.
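As a small worked example of the inherited dimensions, the sketch below computes the number of columns of linear_params_ and the default param-stddev from the formulas stated above. The function names are hypothetical and exist only for illustration; they are not part of the class.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helpers, not Kaldi code.

// Number of columns of linear_params_: input-dim times the number of
// time offsets.
int SplicedInputDim(int input_dim, std::size_t num_time_offsets) {
  assert(input_dim > 0 && num_time_offsets > 0);
  return input_dim * static_cast<int>(num_time_offsets);
}

// Default param-stddev: sqrt(1.0 / (input-dim * number of time offsets)).
double DefaultParamStddev(int input_dim, std::size_t num_time_offsets) {
  return std::sqrt(1.0 / SplicedInputDim(input_dim, num_time_offsets));
}
```

For instance, input-dim=40 with time-offsets=-1,0,1 gives 120 columns, and the default standard deviation is sqrt(1/120).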

The following options are specific to this sub-class:

output-block-dim The number of rows of each block of linear_params_ (must divide output-dim). Each block, which is a matrix, is obtained as a product of block_params_ with a vector of size params-per-block specific to that block.

input-block-dim The number of columns of each block of linear_params_ (must divide input-dim).

params-per-block The number of real parameters per block (should probably be substantially less than input-block-dim * output-block-dim, or this method doesn't really make sense, but we don't enforce that).
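The following sketch shows how a full weight matrix could be assembled from a shared block mapping and a per-block parameter vector, and how the parameter counts compare. It is an illustration under stated assumptions rather than the class's actual storage: `BlockLayout`, `ExpandLinearParams` and the `reduced` argument are hypothetical names, `block_params` is assumed to have (output-block-dim * input-block-dim) rows and params-per-block columns with block entries in row-major order, and `reduced` is assumed to hold one row of params-per-block values per block. Here `input_dim` stands for the total number of columns being factorized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types and names; the memory layout is an assumption.
using Matrix = std::vector<std::vector<float>>;

struct BlockLayout {
  int output_dim, input_dim, output_block_dim, input_block_dim,
      params_per_block;
  int NumOutputBlocks() const { return output_dim / output_block_dim; }
  int NumInputBlocks() const { return input_dim / input_block_dim; }
  // Parameters of an unfactorized weight matrix.
  int FullParams() const { return output_dim * input_dim; }
  // Per-block parameters plus the shared block mapping.
  int FactorizedParams() const {
    return NumOutputBlocks() * NumInputBlocks() * params_per_block +
           output_block_dim * input_block_dim * params_per_block;
  }
};

// Builds the full (output_dim x input_dim) matrix.  Row (ob * NumInputBlocks
// + ib) of 'reduced' is the parameter vector of block (ob, ib).
Matrix ExpandLinearParams(const BlockLayout &l, const Matrix &block_params,
                          const Matrix &reduced) {
  assert(l.output_dim % l.output_block_dim == 0);
  assert(l.input_dim % l.input_block_dim == 0);
  Matrix linear(l.output_dim, std::vector<float>(l.input_dim, 0.0f));
  for (int ob = 0; ob < l.NumOutputBlocks(); ++ob)
    for (int ib = 0; ib < l.NumInputBlocks(); ++ib) {
      const std::vector<float> &p = reduced[ob * l.NumInputBlocks() + ib];
      for (int r = 0; r < l.output_block_dim; ++r)
        for (int c = 0; c < l.input_block_dim; ++c) {
          // Entry (r, c) of the block is a linear function of p.
          const std::vector<float> &row = block_params[r * l.input_block_dim + c];
          float v = 0.0f;
          for (int k = 0; k < l.params_per_block; ++k) v += row[k] * p[k];
          linear[ob * l.output_block_dim + r][ib * l.input_block_dim + c] = v;
        }
    }
  return linear;
}
```

With output-dim=512, input-dim=512, 4x4 blocks and params-per-block=4, this layout stores 65600 values instead of 262144, which illustrates why params-per-block should be well below input-block-dim * output-block-dim.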

Definition at line 727 of file nnet-convolutional-component-temp.h.

Member Function Documentation

◆ BlockParams()

CuMatrixBase<BaseFloat>& BlockParams ( )
inlineprivate

◆ ConvertToIntermediate()

void ConvertToIntermediate ( const CuMatrixBase< BaseFloat > &  linear_params,
CuMatrixBase< BaseFloat > *  intermediate_params 
)
private

◆ ConvertToStandard()

void ConvertToStandard ( const CuMatrixBase< BaseFloat > &  intermediate_params,
CuMatrixBase< BaseFloat > *  linear_params 
)
private

◆ CreateIndexes()

void CreateIndexes ( )
private

◆ InputBlockDim()

int32 InputBlockDim ( )
inlineprivate

Definition at line 737 of file nnet-convolutional-component-temp.h.

737 { return }

◆ NumInputBlocks()

int32 NumInputBlocks ( )
inlineprivate

Definition at line 743 of file nnet-convolutional-component-temp.h.

743 { return reduced_linear_params_.NumCols(); }

◆ NumOutputBlocks()

int32 NumOutputBlocks ( )
inlineprivate

◆ OutputBlockDim()

int32 OutputBlockDim ( )
inlineprivate

Definition at line 742 of file nnet-convolutional-component-temp.h.

742 { }

◆ ParamsPerBlock()

int32 ParamsPerBlock ( )
inlineprivate

Definition at line 739 of file nnet-convolutional-component-temp.h.

739 { return block_params_.NumCols(); }

◆ ReducedLinearParams()

CuMatrixBase<BaseFloat>& ReducedLinearParams ( )
inlineprivate

Member Data Documentation

◆ block_params_

CuMatrix<BaseFloat> block_params_
private

Definition at line 791 of file nnet-convolutional-component-temp.h.

◆ reduced_linear_params_

CuMatrix<BaseFloat> reduced_linear_params_
private

Definition at line 783 of file nnet-convolutional-component-temp.h.

◆ to_intermediate_indexes_

CuArray<int32> to_intermediate_indexes_
private

Definition at line 774 of file nnet-convolutional-component-temp.h.

◆ to_standard_indexes_

CuArray<int32> to_standard_indexes_
private

Definition at line 771 of file nnet-convolutional-component-temp.h.


The documentation for this class was generated from the following file: