BlockFactorizedTdnnComponent is a modified form of TdnnComponent (from which it inherits) that is inspired by quaternion-based neural networks but is more general and trainable: blocks of parameters are linear functions of a smaller number of parameters, and the linear function itself is trainable. More...
#include <nnet-convolutional-component-temp.h>
Private Member Functions | |
CuMatrixBase< BaseFloat > & | ReducedLinearParams () |
CuMatrixBase< BaseFloat > & | BlockParams () |
int32 | InputBlockDim () |
int32 | ParamsPerBlock () |
int32 | OutputBlockDim () |
int32 | NumInputBlocks () |
int32 | NumOutputBlocks () |
void | ConvertToIntermediate (const CuMatrixBase< BaseFloat > &linear_params, CuMatrixBase< BaseFloat > *intermediate_params) |
void | ConvertToStandard (const CuMatrixBase< BaseFloat > &intermediate_params, CuMatrixBase< BaseFloat > *linear_params) |
void | CreateIndexes () |
Private Attributes | |
CuArray< int32 > | to_standard_indexes_ |
CuArray< int32 > | to_intermediate_indexes_ |
CuMatrix< BaseFloat > | reduced_linear_params_ |
CuMatrix< BaseFloat > | block_params_ |
Additional Inherited Members | |
Public Member Functions inherited from TdnnComponent | |
TdnnComponent () | |
TdnnComponent (const TdnnComponent &other) | |
virtual int32 | InputDim () const |
Returns input-dimension of this component. More... | |
virtual int32 | OutputDim () const |
Returns output-dimension of this component. More... | |
virtual std::string | Info () const |
Returns some text-form information about this component, for diagnostics. More... | |
virtual void | InitFromConfig (ConfigLine *cfl) |
Initialize, from a ConfigLine object. More... | |
virtual std::string | Type () const |
Returns a string such as "SigmoidComponent", describing the type of the object. More... | |
virtual int32 | Properties () const |
Return bitmask of the component's properties. More... | |
virtual void * | Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const |
Propagate function. More... | |
virtual void | Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const |
Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More... | |
virtual void | Read (std::istream &is, bool binary) |
Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More... | |
virtual void | Write (std::ostream &os, bool binary) const |
Write component to stream. More... | |
virtual Component * | Copy () const |
Copies component (deep copy). More... | |
virtual void | ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const |
This function only does something interesting for non-simple Components. More... | |
virtual void | GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const |
This function only does something interesting for non-simple Components. More... | |
virtual bool | IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const |
This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More... | |
virtual ComponentPrecomputedIndexes * | PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const |
This function must return NULL for simple Components. More... | |
virtual void | Scale (BaseFloat scale) |
This virtual function, when called on an UpdatableComponent, scales the parameters by "scale". More... | |
virtual void | Add (BaseFloat alpha, const Component &other) |
This virtual function, when called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters. More... | |
virtual void | PerturbParams (BaseFloat stddev) |
This function is to be used in testing. More... | |
virtual BaseFloat | DotProduct (const UpdatableComponent &other) const |
Computes dot-product between parameters of two instances of a Component. More... | |
virtual int32 | NumParameters () const |
The following new virtual function returns the total dimension of the parameters in this class. More... | |
virtual void | Vectorize (VectorBase< BaseFloat > *params) const |
Turns the parameters into vector form. More... | |
virtual void | UnVectorize (const VectorBase< BaseFloat > &params) |
Converts the parameters from vector form. More... | |
virtual void | FreezeNaturalGradient (bool freeze) |
Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More... | |
CuMatrixBase< BaseFloat > & | LinearParams () |
CuVector< BaseFloat > & | BiasParams () |
BaseFloat | OrthonormalConstraint () const |
void | ConsolidateMemory () |
This virtual function relates to memory management, and avoiding fragmentation. More... | |
Public Member Functions inherited from UpdatableComponent | |
UpdatableComponent (const UpdatableComponent &other) | |
UpdatableComponent () | |
virtual | ~UpdatableComponent () |
virtual void | SetUnderlyingLearningRate (BaseFloat lrate) |
Sets the learning rate of gradient descent; gets multiplied by learning_rate_factor_. More... | |
virtual void | SetActualLearningRate (BaseFloat lrate) |
Sets the learning rate directly, bypassing learning_rate_factor_. More... | |
virtual void | SetAsGradient () |
Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More... | |
virtual BaseFloat | LearningRateFactor () |
virtual void | SetLearningRateFactor (BaseFloat lrate_factor) |
void | SetUpdatableConfigs (const UpdatableComponent &other) |
BaseFloat | LearningRate () const |
Gets the learning rate to be used in gradient descent. More... | |
BaseFloat | MaxChange () const |
Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More... | |
void | SetMaxChange (BaseFloat max_change) |
BaseFloat | L2Regularization () const |
Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More... | |
void | SetL2Regularization (BaseFloat a) |
Public Member Functions inherited from Component | |
virtual void | StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo) |
This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More... | |
virtual void | ZeroStats () |
Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More... | |
virtual void | DeleteMemo (void *memo) const |
This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More... | |
Component () | |
virtual | ~Component () |
Static Public Member Functions inherited from Component | |
static Component * | ReadNew (std::istream &is, bool binary) |
Read component from stream (works out its type). Dies on error. More... | |
static Component * | NewComponentOfType (const std::string &type) |
Returns a new Component of the given type e.g. More... | |
Protected Member Functions inherited from UpdatableComponent | |
void | InitLearningRatesFromConfig (ConfigLine *cfl) |
std::string | ReadUpdatableCommon (std::istream &is, bool binary) |
void | WriteUpdatableCommon (std::ostream &is, bool binary) const |
Protected Attributes inherited from UpdatableComponent | |
BaseFloat | learning_rate_ |
learning rate (typically 0.0..0.01) More... | |
BaseFloat | learning_rate_factor_ |
learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More... | |
BaseFloat | l2_regularize_ |
L2 regularization constant. More... | |
bool | is_gradient_ |
True if this component is to be treated as a gradient rather than as parameters. More... | |
BaseFloat | max_change_ |
configuration value for imposing max-change More... | |
BlockFactorizedTdnnComponent is a modified form of TdnnComponent (from which it inherits) that is inspired by quaternion-based neural networks but is more general and trainable: blocks of parameters are linear functions of a smaller number of parameters, and the linear function itself is trainable.
For example, in a quaternion-based network each 4x4 block of the weight matrix has the form
\[ Q_{mat} = \begin{bmatrix} r & -x & -y & -z \\ x & r & -z & y \\ y & z & r & -x \\ z & -y & x & r \end{bmatrix} \]
(where the quaternion is parameterized by r,x,y,z). We can generalize this and say that the block has four parameters (r,x,y,z) and that a learnable 16x4 matrix maps those 4 parameters to the 16 matrix entries. In general there doesn't have to be a correspondence between the number of rows and columns of these blocks and the number of parameters per block, i.e. the 4 rows of the block, the 4 columns of the block and the 4 parameters don't all need to be the same number.
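The mapping from per-block parameters to block entries can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the `Matrix` struct, `ExpandBlock()` and `QuaternionMap()` are hypothetical names that do not appear in Kaldi, the block is assumed to be stored row-major, and the map here is fixed to the quaternion pattern, whereas in the component it would be trainable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not Kaldi's CuMatrix): a dense row-major matrix.
struct Matrix {
  std::size_t rows, cols;
  std::vector<float> data;
  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0f) {}
  float &operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  float operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Expands a per-block parameter vector into a block_rows x block_cols block.
// 'map' has one row per block entry (row-major order, an assumption) and one
// column per parameter, so
//   block(i, j) = sum_p map(i * block_cols + j, p) * params[p].
Matrix ExpandBlock(const Matrix &map, const std::vector<float> &params,
                   std::size_t block_rows, std::size_t block_cols) {
  assert(map.rows == block_rows * block_cols);
  assert(map.cols == params.size());
  Matrix block(block_rows, block_cols);
  for (std::size_t i = 0; i < block_rows; ++i)
    for (std::size_t j = 0; j < block_cols; ++j)
      for (std::size_t p = 0; p < params.size(); ++p)
        block(i, j) += map(i * block_cols + j, p) * params[p];
  return block;
}

// Builds the fixed 16x4 map that reproduces the quaternion pattern above.
// Entry e of the block equals sign[e] * params[index[e]], params = (r,x,y,z).
Matrix QuaternionMap() {
  const std::size_t index[16] = {0, 1, 2, 3,  1, 0, 3, 2,
                                 2, 3, 0, 1,  3, 2, 1, 0};
  const float sign[16] = {1, -1, -1, -1,  1, 1, -1, 1,
                          1, 1, 1, -1,  1, -1, 1, 1};
  Matrix map(16, 4);
  for (std::size_t e = 0; e < 16; ++e) map(e, index[e]) = sign[e];
  return map;
}
```

With (r,x,y,z) = (1,2,3,4), `ExpandBlock(QuaternionMap(), {1, 2, 3, 4}, 4, 4)` gives the rows (1,-2,-3,-4), (2,1,-4,3), (3,4,1,-2) and (4,-3,2,1). Because `ExpandBlock()` takes the block shape and the parameter count separately, the same sketch covers the general case where the three numbers differ.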
Please understand that TdnnComponent is basically an affine component that happens to have some advantages in terms of memory efficiency when you are doing splicing over time. You can use it as you would use an AffineComponent, and the block factorization doesn't really interact with the splicing over time (or indeed with the bias term), so it might be easier to imagine that we are sub-classing a LinearComponent.
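To make the "affine component over spliced frames" view concrete, here is a minimal sketch of one output frame. It is not Kaldi code: `TdnnFrame()` is a hypothetical function, the matrices are plain `std::vector`s, and the column layout (the k-th time offset occupying columns k*input_dim to (k+1)*input_dim - 1 of the linear parameters) is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: computes the output frame at time t as
//   out = bias + sum_k W_k * input[t + offsets[k]],
// where W_k is the k-th column slice of 'linear'.
// 'linear' is row-major with shape output_dim x (input_dim * offsets.size()).
std::vector<float> TdnnFrame(const std::vector<std::vector<float>> &input,
                             const std::vector<int> &offsets,
                             const std::vector<float> &linear,
                             const std::vector<float> &bias,
                             int t) {
  assert(!input.empty());
  const std::size_t input_dim = input[0].size();
  const std::size_t output_dim = bias.size();
  const std::size_t num_cols = input_dim * offsets.size();
  assert(linear.size() == output_dim * num_cols);
  std::vector<float> out(bias);  // start from the bias term
  for (std::size_t k = 0; k < offsets.size(); ++k) {
    const int src = t + offsets[k];  // input frame required by this offset
    assert(src >= 0 && static_cast<std::size_t>(src) < input.size());
    for (std::size_t o = 0; o < output_dim; ++o)
      for (std::size_t d = 0; d < input_dim; ++d)
        out[o] += linear[o * num_cols + k * input_dim + d] * input[src][d];
  }
  return out;
}
```

Seen this way, the block factorization only changes how the entries of `linear` are produced; the splicing loop and the bias are untouched.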
Below we list the config parameters recognized in InitFromConfig(). Many of these are the same as TdnnComponent.
Parameters inherited from UpdatableComponent, via TdnnComponent (see comment above declaration of UpdatableComponent in nnet-component-itf.h for details):
learning-rate, learning-rate-factor, max-change
Parameters inherited from TdnnComponent:
input-dim The input feature dimension (before splicing); the num-cols of linear_params_ is this value times time_offsets_.size().
output-dim The output feature dimension
time-offsets E.g. time-offsets=-1,0,1 or time-offsets=-3,0,3. The time offsets that we require at the input to produce a given output, comparable to the offsets used in TDNNs. They must be unique (no repeats).
use-bias Defaults to true, but set to false if you want this to be linear rather than affine in its input.
orthonormal-constraint=0.0 You can set this to -1 to enable a 'floating' orthonormal constraint on both reduced_linear_params_ and block_params_. This will help stability a lot. (Values >0 are also possible but not recommended).
Initialization parameters inherited from TdnnComponent:
param-stddev Standard deviation of the linear parameters of the convolution. Defaults to sqrt(1.0 / (input-dim * the number of time-offsets)). TODO: document what is done here.
bias-stddev Standard deviation of bias terms. Default=0.0. You should not set this if you set use-bias=false.
The natural-gradient related options use-natural-gradient, rank-out, rank-in, num-samples-history are inherited from TdnnComponent; you won't normally have to set these as the defaults are reasonable.
The following options are specific to this sub-class:
output-block-dim The number of rows of each block of linear_params_ (must divide output-dim). Each block, which is a matrix, is obtained as a product of block_params_ with a vector of size params-per-block specific to that block.
input-block-dim The number of columns of each block of linear_params_ (must divide input-dim).
params-per-block The number of real parameters per block (should probably be substantially less than input-block-dim * output-block-dim, or this method doesn't really make sense, but we don't enforce that).
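The effect of these three options on the parameter count can be worked out with a small accounting sketch. Everything here is hypothetical: the `BlockFactorizedDims` struct and the three functions are not Kaldi names. The sketch assumes one shared map of shape (output-block-dim * input-block-dim) x params-per-block, by analogy with the 16x4 example, plus one params-per-block vector per block; the actual storage layout of reduced_linear_params_ and block_params_ is not stated here. Bias terms are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the config values discussed above.
struct BlockFactorizedDims {
  std::int32_t input_dim, output_dim, num_time_offsets;
  std::int32_t input_block_dim, output_block_dim, params_per_block;
};

// Number of blocks that tile linear_params_, whose shape is
// output_dim x (input_dim * num_time_offsets).
std::int32_t NumBlocks(const BlockFactorizedDims &d) {
  assert(d.input_dim % d.input_block_dim == 0);    // must divide input-dim
  assert(d.output_dim % d.output_block_dim == 0);  // must divide output-dim
  const std::int32_t col_blocks =
      (d.input_dim / d.input_block_dim) * d.num_time_offsets;
  const std::int32_t row_blocks = d.output_dim / d.output_block_dim;
  return row_blocks * col_blocks;
}

// Size of the unfactorized linear parameter matrix.
std::int32_t NumFullLinearParams(const BlockFactorizedDims &d) {
  return d.output_dim * d.input_dim * d.num_time_offsets;
}

// Free parameters under the factorization: one vector per block plus the
// shared map from per-block parameters to block entries (assumed shape).
std::int32_t NumFactorizedParams(const BlockFactorizedDims &d) {
  const std::int32_t per_block = NumBlocks(d) * d.params_per_block;
  const std::int32_t shared_map =
      d.output_block_dim * d.input_block_dim * d.params_per_block;
  return per_block + shared_map;
}
```

For input-dim=512, output-dim=512, three time offsets, 4x4 blocks and params-per-block=4, this gives 49152 blocks, 786432 unfactorized parameters and 196672 factorized ones, i.e. roughly a fourfold reduction. With params-per-block equal to input-block-dim * output-block-dim there would be no reduction at all, which is why the value should be substantially smaller.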
Definition at line 727 of file nnet-convolutional-component-temp.h.