BlockFactorizedTdnnComponent is a modified form of TdnnComponent (from which it inherits) that is inspired by quaternion-based neural networks, but is more general and trainable: the idea is that blocks of parameters are linear functions of a smaller number of parameters, where the linear function itself is trainable. More...

#include <nnet-convolutional-component-temp.h>

Private Member Functions
CuMatrixBase< BaseFloat > &	ReducedLinearParams ()

CuMatrixBase< BaseFloat > &	BlockParams ()

int32	InputBlockDim ()

int32	ParamsPerBlock ()

int32	OutputBlockDim ()

int32	NumInputBlocks ()

int32	NumOutputBlocks ()

void	ConvertToIntermediate (const CuMatrixBase< BaseFloat > &linear_params, CuMatrixBase< BaseFloat > *intermediate_params)

void	ConvertToStandard (const CuMatrixBase< BaseFloat > &intermediate_params, CuMatrixBase< BaseFloat > *linear_params)

void	CreateIndexes ()

Private Attributes
CuArray< int32 >	to_standard_indexes_

CuArray< int32 >	to_intermediate_indexes_

CuMatrix< BaseFloat >	reduced_linear_params_

CuMatrix< BaseFloat >	block_params_

Additional Inherited Members
Public Member Functions inherited from TdnnComponent
	TdnnComponent ()

	TdnnComponent (const TdnnComponent &other)

virtual int32	InputDim () const
	Returns input-dimension of this component. More...

virtual int32	OutputDim () const
	Returns output-dimension of this component. More...

virtual std::string	Info () const
	Returns some text-form information about this component, for diagnostics. More...

virtual void	InitFromConfig (ConfigLine *cfl)
	Initialize, from a ConfigLine object. More...

virtual std::string	Type () const
	Returns a string such as "SigmoidComponent", describing the type of the object. More...

virtual int32	Properties () const
	Return bitmask of the component's properties. More...

virtual void *	Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
	Propagate function. More...

virtual void	Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
	Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...

virtual void	Read (std::istream &is, bool binary)
	Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...

virtual void	Write (std::ostream &os, bool binary) const
	Write component to stream. More...

virtual Component *	Copy () const
	Copies component (deep copy). More...

virtual void	ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
	This function only does something interesting for non-simple Components. More...

virtual void	GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
	This function only does something interesting for non-simple Components. More...

virtual bool	IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
	This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...

virtual ComponentPrecomputedIndexes *	PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
	This function must return NULL for simple Components. More...

virtual void	Scale (BaseFloat scale)
	When called on an UpdatableComponent, this virtual function scales the parameters by "scale". More...

virtual void	Add (BaseFloat alpha, const Component &other)
	When called on an UpdatableComponent, this virtual function adds the parameters of another updatable component, times some constant, to the current parameters. More...

virtual void	PerturbParams (BaseFloat stddev)
	This function is to be used in testing. More...

virtual BaseFloat	DotProduct (const UpdatableComponent &other) const
	Computes dot-product between parameters of two instances of a Component. More...

virtual int32	NumParameters () const
	The following new virtual function returns the total dimension of the parameters in this class. More...

virtual void	Vectorize (VectorBase< BaseFloat > *params) const
	Turns the parameters into vector form. More...

virtual void	UnVectorize (const VectorBase< BaseFloat > &params)
	Converts the parameters from vector form. More...

virtual void	FreezeNaturalGradient (bool freeze)
	Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More...

CuMatrixBase< BaseFloat > &	LinearParams ()

CuVector< BaseFloat > &	BiasParams ()

BaseFloat	OrthonormalConstraint () const

void	ConsolidateMemory ()
	This virtual function relates to memory management, and avoiding fragmentation. More...

Public Member Functions inherited from UpdatableComponent
	UpdatableComponent (const UpdatableComponent &other)

	UpdatableComponent ()

virtual	~UpdatableComponent ()

virtual void	SetUnderlyingLearningRate (BaseFloat lrate)
	Sets the learning rate of gradient descent; it gets multiplied by learning_rate_factor_. More...

virtual void	SetActualLearningRate (BaseFloat lrate)
	Sets the learning rate directly, bypassing learning_rate_factor_. More...

virtual void	SetAsGradient ()
	Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...

virtual BaseFloat	LearningRateFactor ()

virtual void	SetLearningRateFactor (BaseFloat lrate_factor)

void	SetUpdatableConfigs (const UpdatableComponent &other)

BaseFloat	LearningRate () const
	Gets the learning rate to be used in gradient descent. More...

BaseFloat	MaxChange () const
	Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More...

void	SetMaxChange (BaseFloat max_change)

BaseFloat	L2Regularization () const
	Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More...

void	SetL2Regularization (BaseFloat a)

Public Member Functions inherited from Component
virtual void	StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
	This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...

virtual void	ZeroStats ()
	Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...

virtual void	DeleteMemo (void *memo) const
	This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...

	Component ()

virtual	~Component ()

Static Public Member Functions inherited from Component
static Component *	ReadNew (std::istream &is, bool binary)
	Read component from stream (works out its type). Dies on error. More...

static Component *	NewComponentOfType (const std::string &type)
	Returns a new Component of the given type e.g. More...

Protected Member Functions inherited from UpdatableComponent
void	InitLearningRatesFromConfig (ConfigLine *cfl)

std::string	ReadUpdatableCommon (std::istream &is, bool binary)

void	WriteUpdatableCommon (std::ostream &is, bool binary) const

Protected Attributes inherited from UpdatableComponent
BaseFloat	learning_rate_
	learning rate (typically 0.0..0.01) More...

BaseFloat	learning_rate_factor_
	learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More...

BaseFloat	l2_regularize_
	L2 regularization constant. More...

bool	is_gradient_
	True if this component is to be treated as a gradient rather than as parameters. More...

BaseFloat	max_change_
	configuration value for imposing max-change More...

Detailed Description

BlockFactorizedTdnnComponent is a modified form of TdnnComponent (from which it inherits) that is inspired by quaternion-based neural networks, but is more general and trainable: the idea is that blocks of parameters are linear functions of a smaller number of parameters, where the linear function itself is trainable.

For example, a quaternion can be written as the 4x4 real matrix

\[
Q_{mat} = \begin{bmatrix}
r & -x & -y & -z \\
x &  r & -z &  y \\
y &  z &  r & -x \\
z & -y &  x &  r
\end{bmatrix}
\]

where the quaternion is parameterized by r, x, y, z. We can generalize this and say that the block has four parameters (r, x, y, z) and that a learnable 16x4 matrix maps from those 4 parameters to the 16 matrix entries. In general there doesn't have to be a correspondence between the number of rows and columns of these blocks and the number of parameters per block, i.e. the 4 rows of the block, the 4 columns of the block and the 4 parameters don't all need to be the same number.
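
As a minimal sketch of this idea (not the Kaldi implementation), the C++ below expands one block from its parameter vector by multiplying with a map matrix. ExpandBlock and QuaternionMap are hypothetical helper names, the row-major std::vector storage is an assumption made for illustration, and the map is fixed here so that it reproduces Q_{mat}, whereas in the component it would be trainable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Kaldi: builds one (rows x cols) block,
// stored row-major, as map * params.  'map' has rows*cols rows and
// params.size() columns, also stored row-major.
std::vector<float> ExpandBlock(const std::vector<float> &map,
                               const std::vector<float> &params,
                               std::size_t rows, std::size_t cols) {
  const std::size_t num_entries = rows * cols;
  assert(map.size() == num_entries * params.size());
  std::vector<float> block(num_entries, 0.0f);
  for (std::size_t e = 0; e < num_entries; ++e)
    for (std::size_t p = 0; p < params.size(); ++p)
      block[e] += map[e * params.size() + p] * params[p];
  return block;
}

// Hypothetical helper: the fixed 16x4 map that reproduces Q_mat.
// Each of the 16 block entries picks one parameter (0=r, 1=x, 2=y, 3=z)
// with a sign of +1 or -1.
std::vector<float> QuaternionMap() {
  const int index[16] = {0, 1, 2, 3,  1, 0, 3, 2,  2, 3, 0, 1,  3, 2, 1, 0};
  const float sign[16] = {1, -1, -1, -1,  1, 1, -1, 1,
                          1, 1, 1, -1,  1, -1, 1, 1};
  std::vector<float> map(16 * 4, 0.0f);
  for (int e = 0; e < 16; ++e) map[e * 4 + index[e]] = sign[e];
  return map;
}
```

With (r, x, y, z) = (1, 2, 3, 4), calling ExpandBlock(QuaternionMap(), params, 4, 4) gives the rows (1, -2, -3, -4), (2, 1, -4, 3), (3, 4, 1, -2) and (4, -3, 2, 1). Replacing the fixed map with learned values, or changing its shape, gives the more general case described above.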

Note that TdnnComponent is basically an affine component that happens to have some advantages in terms of memory efficiency when you are splicing over time. You can use it as you would use an AffineComponent, and the block factorization doesn't really interact with the splicing over time (or indeed with the bias term), so it might be easier to imagine that we are sub-classing a LinearComponent.

Below we list the config parameters recognized in InitFromConfig(). Many of these are the same as for TdnnComponent.

Parameters inherited from UpdatableComponent, via TdnnComponent (see comment above declaration of UpdatableComponent in nnet-component-itf.h for details):

learning-rate, learning-rate-factor, max-change

Parameters inherited from TdnnComponent:

input-dim The input feature dimension (before splicing); the num-cols of linear_params_ is this value times time_offsets_.size().

output-dim The output feature dimension.

time-offsets E.g. time-offsets=-1,0,1 or time-offsets=-3,0,3. The time offsets that we require at the input to produce a given output, comparable to the offsets used in TDNNs. They must be unique (no repeats).

use-bias Defaults to true, but set to false if you want this to be linear rather than affine in its input.

orthonormal-constraint Defaults to 0.0. You can set this to -1 to enable a 'floating' orthonormal constraint on both reduced_linear_params_ and block_params_; this helps stability a lot. (Values >0 are also possible but not recommended.)

Initialization parameters inherited from TdnnComponent:

param-stddev Standard deviation of the linear parameters of the convolution. Defaults to sqrt(1.0 / (input-dim * the number of time-offsets)). TODO: document what is done here.

bias-stddev Standard deviation of bias terms. Defaults to 0.0. You should not set this if you set use-bias=false.

The natural-gradient related options use-natural-gradient, rank-out, rank-in, num-samples-history are inherited from TdnnComponent; you won't normally have to set these as the defaults are reasonable.
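
The arithmetic behind input-dim, time-offsets and the default param-stddev can be sketched as follows. This is an illustration of the formulas stated above only; SplicedInputDim and DefaultParamStddev are hypothetical names and are not functions of TdnnComponent.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: the number of columns of linear_params_, i.e.
// input-dim times the number of time offsets.
int SplicedInputDim(int input_dim, const std::vector<int> &time_offsets) {
  assert(input_dim > 0 && !time_offsets.empty());
  return input_dim * static_cast<int>(time_offsets.size());
}

// Hypothetical helper: the documented default for param-stddev,
// sqrt(1.0 / (input-dim * the number of time-offsets)).
float DefaultParamStddev(int input_dim, const std::vector<int> &time_offsets) {
  const int spliced_dim = SplicedInputDim(input_dim, time_offsets);
  return std::sqrt(1.0f / static_cast<float>(spliced_dim));
}
```

For example, input-dim=100 with four time offsets gives 400 columns and a default param-stddev of 0.05.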

The following options are specific to this sub-class:

output-block-dim The number of rows of each block of linear_params_ (must divide output-dim). Each block, which is a matrix, is obtained as a product of block_params_ with a vector of size params-per-block specific to that block.

input-block-dim The number of columns of each block of linear_params_ (must divide input-dim).

params-per-block The number of real parameters per block (should probably be substantially less than input-block-dim * output-block-dim, or this method doesn't really make sense, but we don't enforce that).
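
To see why params-per-block should be small, the sketch below counts parameters for a given configuration. It is an illustration under stated assumptions, not the component's code: BlockLayout and ComputeBlockLayout are hypothetical names, the spliced input width (input-dim times the number of time offsets) is assumed to be what gets divided into column blocks, and a single map of size (input-block-dim * output-block-dim) by params-per-block is assumed to be shared by all blocks.

```cpp
#include <cassert>

// Illustrative only; these names are not part of Kaldi.
struct BlockLayout {
  int num_output_blocks;        // output-dim / output-block-dim
  int num_input_blocks;         // spliced input width / input-block-dim
  long long full_params;        // entries of the expanded linear_params_
  long long per_block_params;   // all per-block vectors of size params-per-block
  long long map_params;         // entries of the (assumed shared) block map
};

BlockLayout ComputeBlockLayout(int input_dim, int output_dim,
                               int num_time_offsets,
                               int input_block_dim, int output_block_dim,
                               int params_per_block) {
  // The documented divisibility requirements.
  assert(input_dim % input_block_dim == 0);
  assert(output_dim % output_block_dim == 0);
  const long long spliced_dim =
      static_cast<long long>(input_dim) * num_time_offsets;
  BlockLayout layout;
  layout.num_output_blocks = output_dim / output_block_dim;
  layout.num_input_blocks = static_cast<int>(spliced_dim / input_block_dim);
  layout.full_params = output_dim * spliced_dim;
  layout.per_block_params = static_cast<long long>(layout.num_output_blocks) *
                            layout.num_input_blocks * params_per_block;
  layout.map_params = static_cast<long long>(input_block_dim) *
                      output_block_dim * params_per_block;
  return layout;
}
```

Under these assumptions, input-dim=512, output-dim=512, three time offsets, 4x4 blocks and params-per-block=4 give 786432 entries in the expanded matrix but only 196608 per-block parameters plus 64 map entries. Setting params-per-block to 16 in the same configuration would make the per-block count equal to the expanded count, which is the case the text warns against.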

Definition at line 727 of file nnet-convolutional-component-temp.h.