Public Member Functions | |
ModelCollapser (const CollapseModelConfig &config, Nnet *nnet) | |
void | Collapse () |
Private Member Functions | |
int32 | CollapseComponents (int32 component_index1, int32 component_index2) |
This function tries to collapse two successive components, where the component 'component_index1' appears as the input of 'component_index2'. | |
int32 | SumDescriptorIsCollapsible (const SumDescriptor &sum_desc) |
int32 | DescriptorIsCollapsible (const Descriptor &desc) |
Descriptor | ReplaceNodeInDescriptor (const Descriptor &src, int32 node_to_replace, const Descriptor &expr) |
bool | OptimizeNode (int32 node_index) |
This function modifies the neural network in the case where 'node_index' is a component-input node (its component is in the node at 'node_index + 1'), provided several other conditions also apply. | |
int32 | CollapseComponentsDropout (int32 component_index1, int32 component_index2) |
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'. | |
int32 | CollapseComponentsBatchnorm (int32 component_index1, int32 component_index2) |
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'. | |
int32 | CollapseComponentsAffine (int32 component_index1, int32 component_index2) |
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'. | |
int32 | CollapseComponentsScale (int32 component_index1, int32 component_index2) |
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'. | |
int32 | GetDiagonallyPreModifiedComponentIndex (const CuVectorBase< BaseFloat > &offset, const CuVectorBase< BaseFloat > &scale, const std::string &src_identifier, int32 component_index) |
This function finds, or creates, a component which is like 'component_index' but is combined with a diagonal offset-and-scale transform *before* the component. | |
int32 | GetScaledComponentIndex (int32 component_index, BaseFloat scale) |
Given a component 'component_index', returns a component which will give the same output as the current component gives when its input is scaled by 'scale'. | |
Static Private Member Functions | |
static void | PreMultiplyAffineParameters (const CuVectorBase< BaseFloat > &offset, const CuVectorBase< BaseFloat > &scale, CuVectorBase< BaseFloat > *bias_params, CuMatrixBase< BaseFloat > *linear_params) |
This helper function, used by GetDiagonallyPreModifiedComponentIndex, modifies the linear and bias parameters of an affine transform to capture the effect of preceding that affine transform by a diagonal affine transform with parameters 'offset' and 'scale'. | |
Private Attributes | |
const CollapseModelConfig & | config_ |
Nnet * | nnet_ |
Definition at line 1447 of file nnet-utils.cc.
inline
Definition at line 1449 of file nnet-utils.cc.
inline
Definition at line 1452 of file nnet-utils.cc.
References KALDI_ERR, KALDI_LOG, rnnlm::n, SvdApplier::nnet_, Nnet::NumComponents(), Nnet::NumNodes(), Nnet::RemoveOrphanComponents(), and Nnet::RemoveOrphanNodes().
Referenced by kaldi::nnet3::CollapseModel().
This function tries to collapse two successive components, where the component 'component_index1' appears as the input of 'component_index2'.
If the two components can be collapsed in that way, it returns the index of a combined component.
Note: in addition to the two components simply being chained together, this function supports the case where different time-offsets of the first component are appended together as the input of the second component. So the input-dim of the second component may be a multiple of the output-dim of the first component.
The function returns the component-index of a (newly created or existing) component that combines both of these components, if it's possible to combine them; or it returns -1 if it's not possible.
Definition at line 1493 of file nnet-utils.cc.
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.
This handles the case where 'component_index1' is of type FixedAffineComponent, AffineComponent or NaturalGradientAffineComponent, and 'component_index2' is of type AffineComponent or NaturalGradientAffineComponent.
Returns -1 if this code can't produce a combined component.
Definition at line 1761 of file nnet-utils.cc.
References Nnet::AddComponent(), CuMatrixBase< Real >::AddMatMat(), CuVectorBase< Real >::AddMatVec(), AffineComponent::BiasParams(), FixedAffineComponent::BiasParams(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), rnnlm::i, AffineComponent::Init(), AffineComponent::InputDim(), FixedAffineComponent::InputDim(), KALDI_ASSERT, kaldi::kNoTrans, AffineComponent::LinearParams(), FixedAffineComponent::LinearParams(), SvdApplier::nnet_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), AffineComponent::OutputDim(), FixedAffineComponent::OutputDim(), CuVectorBase< Real >::Range(), CuMatrixBase< Real >::Range(), and AffineComponent::SetParams().
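The simple chained case can be illustrated with a minimal sketch. It assumes plain `std::vector` matrices and hypothetical names (`Affine`, `Apply`, `Compose`) that are not part of Kaldi, and it does not cover the case where several time-offsets of the first component are appended together. Applying `second` after `first` gives W2 (W1 x + b1) + b2, so the combined linear term is W2 W1 and the combined bias is W2 b1 + b2.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration; these are not Kaldi classes.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

struct Affine {
  Mat linear;  // output_dim x input_dim
  Vec bias;    // output_dim
};

// Applies y = linear * x + bias.
Vec Apply(const Affine &a, const Vec &x) {
  assert(a.linear.size() == a.bias.size());
  Vec y(a.bias);
  for (std::size_t r = 0; r < a.linear.size(); ++r) {
    assert(a.linear[r].size() == x.size());
    for (std::size_t c = 0; c < x.size(); ++c) y[r] += a.linear[r][c] * x[c];
  }
  return y;
}

// Returns one affine transform equivalent to applying 'first' and then
// 'second', assuming second's input dim equals first's output dim.
Affine Compose(const Affine &first, const Affine &second) {
  const std::size_t out_dim = second.linear.size();
  const std::size_t mid_dim = first.linear.size();
  const std::size_t in_dim = first.linear.empty() ? 0 : first.linear[0].size();
  assert(first.bias.size() == mid_dim && second.bias.size() == out_dim);
  Affine combined;
  combined.linear.assign(out_dim, Vec(in_dim, 0.0));
  combined.bias = second.bias;  // start from b2, then add W2 * b1
  for (std::size_t r = 0; r < out_dim; ++r) {
    assert(second.linear[r].size() == mid_dim);
    for (std::size_t k = 0; k < mid_dim; ++k) {
      combined.bias[r] += second.linear[r][k] * first.bias[k];
      for (std::size_t c = 0; c < in_dim; ++c)
        combined.linear[r][c] += second.linear[r][k] * first.linear[k][c];
    }
  }
  return combined;
}
```

For any input `x`, `Apply(Compose(first, second), x)` should match `Apply(second, Apply(first, x))` up to floating-point rounding.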
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.
This handles the case where 'component_index1' is of type BatchNormComponent, and where 'component_index2' is of type AffineComponent or NaturalGradientAffineComponent.
Returns -1 if this code can't produce a combined component (normally because the components have the wrong types).
Definition at line 1733 of file nnet-utils.cc.
References Nnet::GetComponent(), Nnet::GetComponentName(), KALDI_ERR, SvdApplier::nnet_, BatchNormComponent::Offset(), and BatchNormComponent::Scale().
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.
This handles the case where 'component_index1' is of type DropoutComponent or GeneralDropoutComponent, and where 'component_index2' is of type AffineComponent, NaturalGradientAffineComponent, LinearComponent, TdnnComponent or TimeHeightConvolutionComponent.
Returns -1 if this code can't produce a combined component (normally because the components have the wrong types).
Definition at line 1693 of file nnet-utils.cc.
References DropoutComponent::DropoutProportion(), Nnet::GetComponent(), and SvdApplier::nnet_.
Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.
This handles the case where 'component_index1' is of type AffineComponent or NaturalGradientAffineComponent, and 'component_index2' is of type FixedScaleComponent, and the output dim of the first is the same as the input dim of the second. This situation is common in output layers. Later if it's needed, we could easily enable the code to support PerElementScaleComponent.
Returns -1 if this code can't produce a combined component.
Definition at line 1860 of file nnet-utils.cc.
References Nnet::AddComponent(), AffineComponent::BiasParams(), AffineComponent::Copy(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), FixedScaleComponent::InputDim(), AffineComponent::LinearParams(), SvdApplier::nnet_, AffineComponent::OutputDim(), FixedScaleComponent::Scales(), and AffineComponent::SetParams().
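As a rough illustration of why an affine component followed by a fixed per-element scale can be merged: s ⊙ (W x + b) equals (diag(s) W) x + (s ⊙ b), so each row of the linear matrix and each bias element is multiplied by the matching scale. The sketch below uses plain `std::vector` storage and a hypothetical helper name, `FoldOutputScale`; it is not the Kaldi implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration; these are not Kaldi classes.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Folds a per-output scale into affine parameters in place, so that the
// updated (linear, bias) compute scales[r] * (linear * x + bias)[r].
void FoldOutputScale(const Vec &scales, Mat *linear, Vec *bias) {
  assert(linear->size() == scales.size());
  assert(bias->size() == scales.size());
  for (std::size_t r = 0; r < scales.size(); ++r) {
    for (double &w : (*linear)[r]) w *= scales[r];  // scale row r of W
    (*bias)[r] *= scales[r];                        // scale bias element r
  }
}
```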
inline, private
Definition at line 1553 of file nnet-utils.cc.
References rnnlm::i, Descriptor::NumParts(), and Descriptor::Part().
inline, private
This function finds, or creates, a component which is like 'component_index' but is combined with a diagonal offset-and-scale transform *before* the component.
(We may later create a function called GetDiagonallyPostModifiedComponentIndex if we need to apply the transform *after* the component.)
This function doesn't work for convolutional components: because of zero-padding, it's not possible to represent an offset on the input via changes in the convolutional parameters. (The scale could be represented, but we don't bother doing that.)
This may require modifying its linear and bias parameters.
[in] | offset | The offset term 'b' in the diagonal transform y = a x + b. |
[in] | scale | The scale term 'a' in the diagonal transform y = a x + b. Must have the same dimension as 'offset'. |
[in] | src_identifier | A string that uniquely identifies 'offset' and 'scale'. In practice it will be the component-index from where 'offset' and 'scale' were taken. |
[in] | component_index | The component to be modified (not in-place, but as a copy). The component described in 'component_index' must be AffineComponent, NaturalGradientAffineComponent, LinearComponent or TdnnComponent, and the dimension of 'offset'/'scale' should divide the component input dimension, otherwise it's an error. |
Definition at line 1934 of file nnet-utils.cc.
References Nnet::AddComponent(), AffineComponent::BiasParams(), TdnnComponent::BiasParams(), SvdApplier::ModifiedComponentInfo::component_index, Component::Copy(), TdnnComponent::Copy(), CuVectorBase< Real >::Dim(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), KALDI_ASSERT, UpdatableComponent::LearningRate(), AffineComponent::LinearParams(), TdnnComponent::LinearParams(), CuVectorBase< Real >::Max(), CuVectorBase< Real >::Min(), SvdApplier::nnet_, TdnnComponent::OutputDim(), LinearComponent::OutputDim(), and LinearComponent::Params().
Given a component 'component_index', returns a component which will give the same output as the current component gives when its input is scaled by 'scale'.
This will generally mean applying the scale to the linear parameters in the component, if it is an affine, linear or convolutional component.
If the component referred to in 'component_index' is not an affine, linear or convolutional component, and therefore cannot be scaled (by this code), then this function returns -1.
Definition at line 2050 of file nnet-utils.cc.
References Nnet::AddComponent(), SvdApplier::ModifiedComponentInfo::component_index, Component::Copy(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), KALDI_ASSERT, SvdApplier::nnet_, AffineComponent::Scale(), TdnnComponent::Scale(), and LinearComponent::Scale().
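For an affine component the idea can be sketched as follows: W (s x) + b equals (s W) x + b, so scaling the input by 's' is equivalent to scaling only the linear parameters and leaving the bias alone. The types and the name `ScaledInputCopy` below are hypothetical and only illustrate the arithmetic; like the function described above, the sketch returns a modified copy rather than changing the source in place.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration; these are not Kaldi classes.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

struct AffineParams {
  Mat linear;  // output_dim x input_dim
  Vec bias;    // output_dim
};

// Returns a copy of 'src' whose output on input x equals the output of 'src'
// on input (scale * x). Only the linear parameters change.
AffineParams ScaledInputCopy(const AffineParams &src, double scale) {
  assert(src.linear.size() == src.bias.size());
  AffineParams copy = src;
  for (Vec &row : copy.linear)
    for (double &w : row) w *= scale;
  return copy;  // the bias is deliberately left unscaled
}
```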
This function modifies the neural network in the case where 'node_index' is a component-input node (its component is in the node at 'node_index + 1'), provided several other conditions also apply.
First, the descriptor in the node at 'node_index' has to have a certain limited structure, e.g. a single node name such as 'foo', or a combination of Append and Offset expressions that refers to only a single node, such as 'Append(Offset(foo, -3), foo, Offset(foo, 3))'.
Also, the components need to be collapsible by the function CollapseComponents(), which will only be possible for certain pairs of component types (like, say, a dropout node preceding an affine or convolutional node); see that function for details.
If this function does anything, it modifies the node at 'node_index + 1' so that it uses a newly created component that combines the two components involved.
It also modifies the node at 'node_index' by replacing its Descriptor with a modified input descriptor, so that if the input-descriptor of node 'foo' was 'bar', the descriptor for our node would now look like: 'Append(Offset(bar, -3), bar, Offset(bar, 3))'. Note that 'bar' itself doesn't have to be just a node-name; it can be a more general expression.
This function returns true if it changed something in the neural net, and false otherwise.
Definition at line 1638 of file nnet-utils.cc.
References NetworkNode::component_index, SvdApplier::ModifiedComponentInfo::component_index, NetworkNode::descriptor, Nnet::GetNode(), kaldi::nnet3::kComponent, kaldi::nnet3::kDescriptor, SvdApplier::nnet_, NetworkNode::node_type, Nnet::NumNodes(), and NetworkNode::u.
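The descriptor rewrite described above (substituting an expression for a node name) can be sketched as a token-level string replacement. This is an illustration only: `ReplaceNodeName` and `IsNameChar` are hypothetical helpers, the set of characters allowed in a node name is an assumption, and the real code works on parsed Descriptor objects rather than raw strings.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true for characters treated as part of a name token in this sketch
// (an assumption; Kaldi's own tokenizer rules may differ).
bool IsNameChar(char ch) {
  return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_' ||
         ch == '-' || ch == '.';
}

// Replaces every whole token equal to 'node_to_replace' in 'descriptor' with
// the text of 'expr'. Tokens that merely contain the name are left untouched.
std::string ReplaceNodeName(const std::string &descriptor,
                            const std::string &node_to_replace,
                            const std::string &expr) {
  assert(!node_to_replace.empty());
  std::string out;
  std::size_t i = 0;
  while (i < descriptor.size()) {
    if (!IsNameChar(descriptor[i])) {
      out += descriptor[i++];  // punctuation and spaces are copied as-is
      continue;
    }
    const std::size_t start = i;
    while (i < descriptor.size() && IsNameChar(descriptor[i])) ++i;
    const std::string token = descriptor.substr(start, i - start);
    out += (token == node_to_replace) ? expr : token;
  }
  return out;
}
```

Matching whole tokens rather than substrings keeps a node called 'foobar' intact when 'foo' is being replaced.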
inline, static, private
This helper function, used by GetDiagonallyPreModifiedComponentIndex, modifies the linear and bias parameters of an affine transform to capture the effect of preceding that affine transform by a diagonal affine transform with parameters 'offset' and 'scale'.
The dimension of 'offset' and 'scale' must be the same and must divide the input dim of the affine transform, i.e. must divide linear_params->NumCols().
Definition at line 2006 of file nnet-utils.cc.
References CuVectorBase< Real >::AddMatVec(), rnnlm::d, CuVectorBase< Real >::Dim(), KALDI_ASSERT, kaldi::kNoTrans, CuMatrixBase< Real >::MulColsVec(), CuMatrixBase< Real >::NumCols(), and CuMatrixBase< Real >::NumRows().
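A minimal sketch of the arithmetic, assuming plain `std::vector` storage and a hypothetical function name (`PreMultiplySketch`): with the diagonal transform x' = a ⊙ x + o applied first, W x' + b becomes (W diag(a)) x + (b + W o). When the dimension d of 'offset' and 'scale' is smaller than the input dim, the sketch assumes the diagonal transform is repeated for each block of d input columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal types for illustration; these are not Kaldi classes.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Modifies (linear, bias) in place so that they absorb a preceding diagonal
// transform x' = scale * x + offset, repeated over blocks of d columns.
void PreMultiplySketch(const Vec &offset, const Vec &scale,
                       Vec *bias, Mat *linear) {
  const std::size_t d = offset.size();
  assert(d > 0 && scale.size() == d);
  assert(bias->size() == linear->size());
  for (std::size_t r = 0; r < linear->size(); ++r) {
    Vec &row = (*linear)[r];
    assert(row.size() % d == 0);  // d must divide the input dim
    for (std::size_t c = 0; c < row.size(); ++c) {
      // The bias update uses the original weight, so do it before scaling.
      (*bias)[r] += row[c] * offset[c % d];
      row[c] *= scale[c % d];
    }
  }
}
```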
inline, private
Definition at line 1572 of file nnet-utils.cc.
References kaldi::nnet3::DescriptorTokenize(), Nnet::GetNodeNames(), KALDI_ASSERT, SvdApplier::nnet_, Descriptor::Parse(), and Descriptor::WriteConfig().
inline, private
Definition at line 1525 of file nnet-utils.cc.
References SimpleForwardingDescriptor::GetNodeDependencies(), OffsetForwardingDescriptor::Src(), and SimpleSumDescriptor::Src().
private
Definition at line 2095 of file nnet-utils.cc.
private
Definition at line 2096 of file nnet-utils.cc.