Collaboration diagram for ModelCollapser:

Public Member Functions
	ModelCollapser (const CollapseModelConfig &config, Nnet *nnet)

void	Collapse ()

Private Member Functions
int32	CollapseComponents (int32 component_index1, int32 component_index2)
	This function tries to collapse two successive components, where the component 'component_index1' appears as the input of 'component_index2'.

int32	SumDescriptorIsCollapsible (const SumDescriptor &sum_desc)

int32	DescriptorIsCollapsible (const Descriptor &desc)

Descriptor	ReplaceNodeInDescriptor (const Descriptor &src, int32 node_to_replace, const Descriptor &expr)

bool	OptimizeNode (int32 node_index)
	This function modifies the neural network in the case where 'node_index' is a component-input node whose component (in the node at 'node_index + 1') can be collapsed with the component that supplies its input, provided several other conditions also apply.

int32	CollapseComponentsDropout (int32 component_index1, int32 component_index2)
	Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

int32	CollapseComponentsBatchnorm (int32 component_index1, int32 component_index2)
	Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

int32	CollapseComponentsAffine (int32 component_index1, int32 component_index2)
	Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

int32	CollapseComponentsScale (int32 component_index1, int32 component_index2)
	Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

int32	GetDiagonallyPreModifiedComponentIndex (const CuVectorBase< BaseFloat > &offset, const CuVectorBase< BaseFloat > &scale, const std::string &src_identifier, int32 component_index)
	This function finds, or creates, a component which is like 'component_index' but is combined with a diagonal offset-and-scale transform before the component.

int32	GetScaledComponentIndex (int32 component_index, BaseFloat scale)
	Given a component 'component_index', returns a component which will give the same output as the current component gives when its input is scaled by 'scale'.

Static Private Member Functions
static void	PreMultiplyAffineParameters (const CuVectorBase< BaseFloat > &offset, const CuVectorBase< BaseFloat > &scale, CuVectorBase< BaseFloat > *bias_params, CuMatrixBase< BaseFloat > *linear_params)
	This helper function, used by GetDiagonallyPreModifiedComponentIndex, modifies the linear and bias parameters of an affine transform to capture the effect of preceding that affine transform by a diagonal affine transform with parameters 'offset' and 'scale'.

Private Attributes
const CollapseModelConfig &	config_

Nnet *	nnet_

Detailed Description

Definition at line 1447 of file nnet-utils.cc.

Constructor & Destructor Documentation

◆ ModelCollapser()

ModelCollapser	(	const CollapseModelConfig &	config,
		Nnet *	nnet
	)

inline

Definition at line 1449 of file nnet-utils.cc.

     : config_(config), nnet_(nnet) { }

Member Function Documentation

◆ Collapse()

void Collapse ( )

inline

Definition at line 1452 of file nnet-utils.cc.

References KALDI_ERR, KALDI_LOG, rnnlm::n, SvdApplier::nnet_, Nnet::NumComponents(), Nnet::NumNodes(), Nnet::RemoveOrphanComponents(), and Nnet::RemoveOrphanNodes().

Referenced by kaldi::nnet3::CollapseModel().

                   {
     bool changed = true;
     int32 num_nodes = nnet_->NumNodes(),
         num_iters = 0;
     int32 num_components1 = nnet_->NumComponents();
     for (; changed; num_iters++) {
       changed = false;
       for (int32 n = 0; n < num_nodes; n++)
         if (OptimizeNode(n))
           changed = true;
       // we shouldn't iterate more than a couple of times.
       if (num_iters >= 10)
         KALDI_ERR << "Something went wrong collapsing model.";
     }
     int32 num_components2 = nnet_->NumComponents();
     nnet_->RemoveOrphanNodes();
     nnet_->RemoveOrphanComponents();
     int32 num_components3 = nnet_->NumComponents();
     if (num_components2 != num_components1 ||
         num_components3 != num_components2)
       KALDI_LOG << "Added " << (num_components2 - num_components1)
                 << " components, removed "
                 << (num_components2 - num_components3);
   }

◆ CollapseComponents()

int32 CollapseComponents	(	int32	component_index1,
		int32	component_index2
	)

inline private

This function tries to collapse two successive components, where the component 'component_index1' appears as the input of 'component_index2'.

If the two components can be collapsed in that way, it returns the index of a combined component.

Note: in addition to the two components simply being chained together, this function supports the case where different time-offsets of the first component are appended together as the input of the second component. So the input-dim of the second component may be a multiple of the output-dim of the first component.

The function returns the component-index of a (newly created or existing) component that combines both of these components, if it's possible to combine them; or it returns -1 if it's not possible.

Definition at line 1493 of file nnet-utils.cc.

                                                    {
     int32 ans;
     if (config_.collapse_dropout &&
         (ans = CollapseComponentsDropout(component_index1,
                                          component_index2)) != -1)
       return ans;
     if (config_.collapse_batchnorm &&
         (ans = CollapseComponentsBatchnorm(component_index1,
                                            component_index2)) != -1)
       return ans;
     if (config_.collapse_affine &&
         (ans = CollapseComponentsAffine(component_index1,
                                         component_index2)) != -1)
       return ans;
     if (config_.collapse_scale &&
         (ans = CollapseComponentsScale(component_index1,
                                        component_index2)) != -1)
       return ans;
     return -1;
   }

◆ CollapseComponentsAffine()

int32 CollapseComponentsAffine	(	int32	component_index1,
		int32	component_index2
	)

inline private

Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

This handles the case where 'component_index1' is of type FixedAffineComponent, AffineComponent or NaturalGradientAffineComponent, and 'component_index2' is of type AffineComponent or NaturalGradientAffineComponent.

Returns -1 if this code can't produce a combined component.

Definition at line 1761 of file nnet-utils.cc.

References Nnet::AddComponent(), CuMatrixBase< Real >::AddMatMat(), CuVectorBase< Real >::AddMatVec(), AffineComponent::BiasParams(), FixedAffineComponent::BiasParams(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), rnnlm::i, AffineComponent::Init(), AffineComponent::InputDim(), FixedAffineComponent::InputDim(), KALDI_ASSERT, kaldi::kNoTrans, AffineComponent::LinearParams(), FixedAffineComponent::LinearParams(), SvdApplier::nnet_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), AffineComponent::OutputDim(), FixedAffineComponent::OutputDim(), CuVectorBase< Real >::Range(), CuMatrixBase< Real >::Range(), and AffineComponent::SetParams().

                                                          {
 
     const FixedAffineComponent *fixed_affine_component1 =
         dynamic_cast<const FixedAffineComponent*>(
             nnet_->GetComponent(component_index1));
     const AffineComponent *affine_component1 =
         dynamic_cast<const AffineComponent*>(
             nnet_->GetComponent(component_index1)),
         *affine_component2 =
         dynamic_cast<const AffineComponent*>(
             nnet_->GetComponent(component_index2));
     if (affine_component2 == NULL ||
         (fixed_affine_component1 == NULL && affine_component1 == NULL))
       return -1;
 
     std::ostringstream new_component_name_os;
     new_component_name_os << nnet_->GetComponentName(component_index1)
                           << "." << nnet_->GetComponentName(component_index2);
     std::string new_component_name = new_component_name_os.str();
     int32 new_component_index = nnet_->GetComponentIndex(new_component_name);
     if (new_component_index >= 0)
       return new_component_index;  // we previously created this.
 
     const CuMatrix<BaseFloat> *linear_params1;
     const CuVector<BaseFloat> *bias_params1;
     if (fixed_affine_component1 != NULL) {
       if (fixed_affine_component1->InputDim() >
           fixed_affine_component1->OutputDim()) {
         // first affine component is dimension-reducing, so combining the two
         // might be inefficient.
         return -1;
       }
       linear_params1 = &(fixed_affine_component1->LinearParams());
       bias_params1 = &(fixed_affine_component1->BiasParams());
     } else {
       if (affine_component1->InputDim() >
           affine_component1->OutputDim()) {
         // first affine component is dimension-reducing, so combining the two
         // might be inefficient.
         return -1;
       }
       linear_params1 = &(affine_component1->LinearParams());
       bias_params1 = &(affine_component1->BiasParams());
     }
 
     int32 input_dim1 = linear_params1->NumCols(),
         output_dim1 = linear_params1->NumRows(),
         input_dim2 = affine_component2->InputDim(),
         output_dim2 = affine_component2->OutputDim();
     KALDI_ASSERT(input_dim2 % output_dim1 == 0);
     // with typical configurations for TDNNs, like Append(-3, 0, 3) [in xconfigs], a.k.a.
     // Append(Offset(foo, -3), foo, Offset(foo, 3)), the first component's output may
     // be smaller than the second component's input.  We construct a single
     // transform with a block-diagonal structure in this case.
     int32 multiple = input_dim2 / output_dim1;
     CuVector<BaseFloat> bias_params1_full(input_dim2);
     CuMatrix<BaseFloat> linear_params1_full(input_dim2,
                                             multiple * input_dim1);
     for (int32 i = 0; i < multiple; i++) {
       bias_params1_full.Range(i * output_dim1,
                               output_dim1).CopyFromVec(*bias_params1);
       linear_params1_full.Range(i * output_dim1, output_dim1,
                                 i * input_dim1, input_dim1).CopyFromMat(
                                     *linear_params1);
     }
     const CuVector<BaseFloat> &bias_params2 = affine_component2->BiasParams();
     const CuMatrix<BaseFloat> &linear_params2 = affine_component2->LinearParams();
 
     int32 new_input_dim = multiple * input_dim1,
         new_output_dim = output_dim2;
     CuMatrix<BaseFloat> new_linear_params(new_output_dim,
                                           new_input_dim);
     CuVector<BaseFloat> new_bias_params(bias_params2);
     new_bias_params.AddMatVec(1.0, linear_params2, kNoTrans,
                               bias_params1_full, 1.0);
     new_linear_params.AddMatMat(1.0, linear_params2, kNoTrans,
                                 linear_params1_full, kNoTrans, 0.0);
 
     AffineComponent *new_component = new AffineComponent();
     new_component->Init(new_input_dim, new_output_dim, 0.0, 0.0);
     new_component->SetParams(new_bias_params, new_linear_params);
     return nnet_->AddComponent(new_component_name, new_component);
   }
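
The parameter arithmetic in the listing above can be checked on a toy example. The following is a minimal sketch that uses plain std::vector containers instead of Kaldi's CuMatrix and CuVector types; the Affine struct and the ComposeAffine function are hypothetical names introduced only for this illustration. It forms W = W2 * W1_full and b = b2 + W2 * b1_full, where W1_full and b1_full are the block-diagonal and tiled versions of the first component's parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Hypothetical stand-in for an affine component computing y = W x + b.
struct Affine {
  Mat W;
  Vec b;
};

// Returns the single affine map equivalent to applying 'first' to each of
// 'multiple' appended input blocks and then applying 'second'.
Affine ComposeAffine(const Affine &first, const Affine &second) {
  assert(!first.W.empty() && !second.W.empty());
  std::size_t out1 = first.W.size(), in1 = first.W[0].size();
  std::size_t out2 = second.W.size(), in2 = second.W[0].size();
  assert(in2 % out1 == 0);
  std::size_t multiple = in2 / out1;

  Affine ans;
  ans.W.assign(out2, Vec(multiple * in1, 0.0f));
  ans.b = second.b;
  for (std::size_t r = 0; r < out2; r++) {
    for (std::size_t m = 0; m < multiple; m++) {    // m-th diagonal block
      for (std::size_t k = 0; k < out1; k++) {
        float w2 = second.W[r][m * out1 + k];
        ans.b[r] += w2 * first.b[k];                // b = b2 + W2 * b1_full
        for (std::size_t c = 0; c < in1; c++)
          ans.W[r][m * in1 + c] += w2 * first.W[k][c];  // W = W2 * W1_full
      }
    }
  }
  return ans;
}
```

With first = (W1 = [2], b1 = [1]) applied to two appended frames and second = (W2 = [1 3], b2 = [0.5]), the sketch yields W = [2 6] and b = [4.5], which matches (2 x0 + 1) + 3 (2 x1 + 1) + 0.5.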

◆ CollapseComponentsBatchnorm()

int32 CollapseComponentsBatchnorm	(	int32	component_index1,
		int32	component_index2
	)

inline private

Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

This handles the case where 'component_index1' is of type BatchNormComponent, and where 'component_index2' is of type AffineComponent or NaturalGradientAffineComponent.

Returns -1 if this code can't produce a combined component (normally because the components have the wrong types).

Definition at line 1733 of file nnet-utils.cc.

References Nnet::GetComponent(), Nnet::GetComponentName(), KALDI_ERR, SvdApplier::nnet_, BatchNormComponent::Offset(), and BatchNormComponent::Scale().

                                                             {
     const BatchNormComponent *batchnorm_component =
         dynamic_cast<const BatchNormComponent*>(
             nnet_->GetComponent(component_index1));
     if (batchnorm_component == NULL)
       return -1;
 
     if (batchnorm_component->Offset().Dim() == 0) {
       KALDI_ERR << "Expected batch-norm components to have test-mode set.";
     }
     std::string batchnorm_component_name = nnet_->GetComponentName(
         component_index1);
     return GetDiagonallyPreModifiedComponentIndex(batchnorm_component->Offset(),
                                                   batchnorm_component->Scale(),
                                                   batchnorm_component_name,
                                                   component_index2);
   }

◆ CollapseComponentsDropout()

int32 CollapseComponentsDropout	(	int32	component_index1,
		int32	component_index2
	)

inline private

Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

This handles the case where 'component_index1' is of type DropoutComponent or GeneralDropoutComponent, and where 'component_index2' is of type AffineComponent, NaturalGradientAffineComponent, LinearComponent, TdnnComponent or TimeHeightConvolutionComponent.

Returns -1 if this code can't produce a combined component (normally because the components have the wrong types).

Definition at line 1693 of file nnet-utils.cc.

References DropoutComponent::DropoutProportion(), Nnet::GetComponent(), and SvdApplier::nnet_.

                                                           {
     const DropoutComponent *dropout_component =
         dynamic_cast<const DropoutComponent*>(
             nnet_->GetComponent(component_index1));
     const GeneralDropoutComponent *general_dropout_component =
         dynamic_cast<const GeneralDropoutComponent*>(
             nnet_->GetComponent(component_index1));
 
     if (dropout_component == NULL && general_dropout_component == NULL)
       return -1;
     BaseFloat scale;  // the scale we have to apply to correct for removing
                       // this dropout component.
     if (dropout_component != NULL) {
       BaseFloat dropout_proportion = dropout_component->DropoutProportion();
       scale = 1.0 / (1.0 - dropout_proportion);
     } else {
       // for GeneralDropoutComponent, it's done in such a way that the expectation
       // is always 1.  (When it's nonzero, we give it a value 1/(1-dropout_proportion).)
       // So no scaling is needed.
       scale = 1.0;
     }
     // note: if the 2nd component is not of a type that we can scale, the
     // following function call will return -1, which is OK.
     return GetScaledComponentIndex(component_index2,
                                    scale);
   }
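
As an illustration of the scale computed above, here is a minimal sketch in plain C++ (no Kaldi types). DropoutRemovalScale mirrors the expression used in the listing for a plain DropoutComponent; Dot and Scaled are hypothetical helpers added only to show that multiplying the linear parameters by the scale gives the same result as multiplying the input by it, which is why the scale can be absorbed into the following component.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same expression as in the listing above: 1 / (1 - dropout_proportion).
float DropoutRemovalScale(float dropout_proportion) {
  assert(dropout_proportion >= 0.0f && dropout_proportion < 1.0f);
  return 1.0f / (1.0f - dropout_proportion);
}

// Hypothetical helper: one output row of a linear layer, i.e. a dot product.
float Dot(const std::vector<float> &w, const std::vector<float> &x) {
  assert(w.size() == x.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < w.size(); i++)
    sum += w[i] * x[i];
  return sum;
}

// Hypothetical helper: returns a copy of 'v' with every element multiplied
// by 'scale'.
std::vector<float> Scaled(std::vector<float> v, float scale) {
  for (float &e : v)
    e *= scale;
  return v;
}
```

For a dropout proportion of 0.5 the scale is 2, and Dot(Scaled(w, 2), x) equals Dot(w, Scaled(x, 2)) for any w and x of equal size.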

◆ CollapseComponentsScale()

int32 CollapseComponentsScale	(	int32	component_index1,
		int32	component_index2
	)

inline private

Tries to produce a component that's equivalent to running the component 'component_index2' with input given by 'component_index1'.

This handles the case where 'component_index1' is of type AffineComponent or NaturalGradientAffineComponent, and 'component_index2' is of type FixedScaleComponent, and the output dim of the first is the same as the input dim of the second. This situation is common in output layers. Later if it's needed, we could easily enable the code to support PerElementScaleComponent.

Returns -1 if this code can't produce a combined component.

Definition at line 1860 of file nnet-utils.cc.

References Nnet::AddComponent(), AffineComponent::BiasParams(), AffineComponent::Copy(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), FixedScaleComponent::InputDim(), AffineComponent::LinearParams(), SvdApplier::nnet_, AffineComponent::OutputDim(), FixedScaleComponent::Scales(), and AffineComponent::SetParams().

                                                         {
 
     const AffineComponent *affine_component1 =
         dynamic_cast<const AffineComponent*>(
             nnet_->GetComponent(component_index1));
     const FixedScaleComponent *fixed_scale_component2 =
         dynamic_cast<const FixedScaleComponent*>(
                     nnet_->GetComponent(component_index2));
     if (affine_component1 == NULL ||
         fixed_scale_component2 == NULL ||
         affine_component1->OutputDim() !=
         fixed_scale_component2->InputDim())
       return -1;
 
     std::ostringstream new_component_name_os;
     new_component_name_os << nnet_->GetComponentName(component_index1)
                           << "." << nnet_->GetComponentName(component_index2);
     std::string new_component_name = new_component_name_os.str();
     int32 new_component_index = nnet_->GetComponentIndex(new_component_name);
     if (new_component_index >= 0)
       return new_component_index;  // we previously created this.
 
     CuMatrix<BaseFloat> linear_params(affine_component1->LinearParams());
     CuVector<BaseFloat> bias_params(affine_component1->BiasParams());
     const CuVector<BaseFloat> &scales = fixed_scale_component2->Scales();
 
     bias_params.MulElements(scales);
     linear_params.MulRowsVec(scales);
 
     AffineComponent *new_affine_component =
         dynamic_cast<AffineComponent*>(affine_component1->Copy());
     new_affine_component->SetParams(bias_params, linear_params);
     return nnet_->AddComponent(new_component_name,
                                new_affine_component);
   }
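
The effect of the MulElements and MulRowsVec calls above can be sketched without Kaldi types. The example below is an illustration under stated assumptions: FoldOutputScale is a hypothetical name, and matrices are plain row-major std::vector containers. Scaling the output of y = W x + b by a per-dimension vector s is the same as scaling row r of W and element r of b by s[r].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Folds a per-output scale into the parameters of y = W x + b, so that the
// modified parameters compute scales[r] * (W x + b)[r] for every row r.
void FoldOutputScale(const Vec &scales, Mat *W, Vec *b) {
  assert(W->size() == scales.size() && b->size() == scales.size());
  for (std::size_t r = 0; r < W->size(); r++) {
    (*b)[r] *= scales[r];      // element-wise scale of the bias
    for (float &w : (*W)[r])
      w *= scales[r];          // scale every entry of row r
  }
}
```

For W = [1 2; 3 4], b = [1 -1] and s = [2 0.5], the result is W = [2 4; 1.5 2] and b = [2 -0.5].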

◆ DescriptorIsCollapsible()

int32 DescriptorIsCollapsible ( const Descriptor & desc )

inline private

Definition at line 1553 of file nnet-utils.cc.

References rnnlm::i, Descriptor::NumParts(), and Descriptor::Part().

                                                         {
     int32 ans = SumDescriptorIsCollapsible(desc.Part(0));
     for (int32 i = 1; i < desc.NumParts(); i++) {
       if (ans != -1) {
         int32 node_index = SumDescriptorIsCollapsible(desc.Part(i));
         if (node_index != ans)
           ans = -1;
       }
     }
     // note: ans is only >= 0 if the answers from all parts of
     // the SumDescriptors were >=0 and identical to each other.
     // Otherwise it will be -1.
     return ans;
   }
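
The agreement rule implemented above can be isolated in a few lines. This is a minimal sketch, assuming the per-part results (what SumDescriptorIsCollapsible would return for each part) are already available as integers; CommonNodeIndex is a hypothetical name used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the node index shared by all parts, or -1 if any part reported -1
// or the parts disagree with each other.
int CommonNodeIndex(const std::vector<int> &part_results) {
  assert(!part_results.empty());
  int ans = part_results[0];
  for (std::size_t i = 1; i < part_results.size(); i++)
    if (part_results[i] != ans)
      return -1;  // a mismatch (including -1 versus a valid index)
  return ans;
}
```

For example, parts that all refer to node 4 give 4, while parts referring to nodes 4 and 5 give -1.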

◆ GetDiagonallyPreModifiedComponentIndex()

int32 GetDiagonallyPreModifiedComponentIndex	(	const CuVectorBase< BaseFloat > &	offset,
		const CuVectorBase< BaseFloat > &	scale,
		const std::string &	src_identifier,
		int32	component_index
	)

inline private

This function finds, or creates, a component which is like 'component_index' but is combined with a diagonal offset-and-scale transform *before* the component.

(We may later create a function called GetDiagonallyPostModifiedComponentIndex if we need to apply the transform *after* the component.)

This function doesn't work for convolutional components, because due to zero-padding, it's not possible to represent an offset/scale on the input filters via changes in the convolutional parameters. [the scale, yes; but we don't bother doing that.]

This may require modifying its linear and bias parameters.

Parameters

[in]	offset	The offset term 'b' in the diagonal transform y = a x + b.
[in]	scale	The scale term 'a' in the diagonal transform y = a x + b. Must have the same dimension as 'offset'.
[in]	src_identifier	A string that uniquely identifies 'offset' and 'scale'. In practice it is the name of the component from which 'offset' and 'scale' were taken.
[in]	component_index	The component to be modified (not in-place, but as a copy). The component described in 'component_index' must be AffineComponent, NaturalGradientAffineComponent, LinearComponent or TdnnComponent, and the dimension of 'offset'/'scale' should divide the component input dimension, otherwise it's an error.

Returns: The component-index of a suitably modified component. If one like this already exists, the existing one will be returned. If the component in 'component_index' was not of a type that can be modified in this way, returns -1.

Definition at line 1934 of file nnet-utils.cc.

References Nnet::AddComponent(), AffineComponent::BiasParams(), TdnnComponent::BiasParams(), SvdApplier::ModifiedComponentInfo::component_index, Component::Copy(), TdnnComponent::Copy(), CuVectorBase< Real >::Dim(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), KALDI_ASSERT, UpdatableComponent::LearningRate(), AffineComponent::LinearParams(), TdnnComponent::LinearParams(), CuVectorBase< Real >::Max(), CuVectorBase< Real >::Min(), SvdApplier::nnet_, TdnnComponent::OutputDim(), LinearComponent::OutputDim(), and LinearComponent::Params().

                              {
     KALDI_ASSERT(offset.Dim() > 0 && offset.Dim() == scale.Dim());
     if (offset.Max() == 0.0 && offset.Min() == 0.0 &&
         scale.Max() == 1.0 && scale.Min() == 1.0)
       return component_index;  // identity transform.
     std::ostringstream new_component_name_os;
     new_component_name_os << src_identifier
                           << "."
                           << nnet_->GetComponentName(component_index);
     std::string new_component_name = new_component_name_os.str();
     int32 new_component_index = nnet_->GetComponentIndex(new_component_name);
     if (new_component_index >= 0)
       return new_component_index;  // we previously created this.
 
     const Component *component = nnet_->GetComponent(component_index);
     const AffineComponent *affine_component =
         dynamic_cast<const AffineComponent*>(component);
     const LinearComponent *linear_component =
         dynamic_cast<const LinearComponent*>(component);
     const TdnnComponent *tdnn_component =
         dynamic_cast<const TdnnComponent*>(component);
 
     Component *new_component = NULL;
     if (affine_component != NULL) {
       new_component = component->Copy();
       AffineComponent *new_affine_component =
           dynamic_cast<AffineComponent*>(new_component);
       PreMultiplyAffineParameters(offset, scale,
                                   &(new_affine_component->BiasParams()),
                                   &(new_affine_component->LinearParams()));
     } else if (linear_component != NULL) {
       CuVector<BaseFloat> bias_params(linear_component->OutputDim());
       AffineComponent *new_affine_component =
           new AffineComponent(linear_component->Params(),
                               bias_params,
                               linear_component->LearningRate());
       PreMultiplyAffineParameters(offset, scale,
                                   &(new_affine_component->BiasParams()),
                                   &(new_affine_component->LinearParams()));
       new_component = new_affine_component;
     } else if (tdnn_component != NULL) {
       new_component = tdnn_component->Copy();
       TdnnComponent *new_tdnn_component =
           dynamic_cast<TdnnComponent*>(new_component);
       if (new_tdnn_component->BiasParams().Dim() == 0) {
         // make sure it has a bias even if it had none before.
         new_tdnn_component->BiasParams().Resize(
             new_tdnn_component->OutputDim());
       }
       PreMultiplyAffineParameters(offset, scale,
                                   &(new_tdnn_component->BiasParams()),
                                   &(new_tdnn_component->LinearParams()));
 
     } else {
       return -1;  // we can't do this: this component isn't of the right type.
     }
     return nnet_->AddComponent(new_component_name, new_component);
   }
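
The "find, or create" behaviour above relies on the combined name acting as a cache key. The following is a minimal sketch of that idea only; ComponentRegistry and FindOrCreateModified are hypothetical stand-ins, not part of the Kaldi API, and the sketch stores names without any parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the name-to-index bookkeeping of a network.
class ComponentRegistry {
 public:
  // Returns the index of 'name', or -1 if no such component exists.
  int GetIndex(const std::string &name) const {
    for (std::size_t i = 0; i < names_.size(); i++)
      if (names_[i] == name) return static_cast<int>(i);
    return -1;
  }
  // Appends a component called 'name' and returns its index.
  int Add(const std::string &name) {
    assert(GetIndex(name) == -1);
    names_.push_back(name);
    return static_cast<int>(names_.size()) - 1;
  }
  int Size() const { return static_cast<int>(names_.size()); }

 private:
  std::vector<std::string> names_;
};

// Returns the index of the component named "<src_identifier>.<component_name>",
// creating it only on the first request.
int FindOrCreateModified(ComponentRegistry *registry,
                         const std::string &src_identifier,
                         const std::string &component_name) {
  std::string new_name = src_identifier + "." + component_name;
  int index = registry->GetIndex(new_name);
  if (index >= 0)
    return index;  // previously created
  return registry->Add(new_name);
}
```

Requesting the same (source, component) pair twice returns the same index and adds only one entry, which is what lets repeated optimization passes reuse a modified component.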

◆ GetScaledComponentIndex()

int32 GetScaledComponentIndex	(	int32	component_index,
		BaseFloat	scale
	)

inline private

Given a component 'component_index', returns a component which will give the same output as the current component gives when its input is scaled by 'scale'.

This will generally mean applying the scale to the linear parameters in the component, if it is an affine, linear, TDNN or convolutional component.

If the component referred to in 'component_index' is not one of these types, and therefore cannot be scaled (by this code), then this function returns -1.

Definition at line 2050 of file nnet-utils.cc.

References Nnet::AddComponent(), SvdApplier::ModifiedComponentInfo::component_index, Component::Copy(), Nnet::GetComponent(), Nnet::GetComponentIndex(), Nnet::GetComponentName(), KALDI_ASSERT, SvdApplier::nnet_, AffineComponent::Scale(), TdnnComponent::Scale(), and LinearComponent::Scale().

                                                  {
     if (scale == 1.0)
       return component_index;
     std::ostringstream os;
     os << nnet_->GetComponentName(component_index)
        << ".scale" << std::setprecision(3) << scale;
     std::string new_component_name = os.str();  // e.g. foo.scale1.25
     int32 ans = nnet_->GetComponentIndex(new_component_name);
     if (ans >= 0)
       return ans;  // one already exists, no need to create it.
     const Component *current_component = nnet_->GetComponent(component_index);
     const AffineComponent *affine_component =
         dynamic_cast<const AffineComponent*>(current_component);
     const TimeHeightConvolutionComponent *conv_component =
         dynamic_cast<const TimeHeightConvolutionComponent*>(current_component);
     const LinearComponent *linear_component =
         dynamic_cast<const LinearComponent*>(current_component);
     const TdnnComponent *tdnn_component =
         dynamic_cast<const TdnnComponent*>(current_component);
 
     if (affine_component == NULL && conv_component == NULL &&
         linear_component == NULL && tdnn_component == NULL) {
       // We can't scale this component (at least, not using this code).
       return -1;
     }
 
     Component *new_component = current_component->Copy();
 
     if (affine_component != NULL) {
       // AffineComponent or NaturalGradientAffineComponent.
       dynamic_cast<AffineComponent*>(new_component)->
           LinearParams().Scale(scale);
     } else if (conv_component != NULL) {
       dynamic_cast<TimeHeightConvolutionComponent*>(new_component)->
           ScaleLinearParams(scale);
     } else if (linear_component != NULL) {
       dynamic_cast<LinearComponent*>(new_component)->Params().Scale(scale);
     } else {
       KALDI_ASSERT(tdnn_component != NULL);
       dynamic_cast<TdnnComponent*>(new_component)->LinearParams().Scale(scale);
     }
     return nnet_->AddComponent(new_component_name, new_component);
   }
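
The name built at the top of the listing above determines whether an existing scaled component is reused. The sketch below reproduces only that string formatting with the standard library; ScaledComponentName is a hypothetical helper, and a float argument is assumed for the scale.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Builds "<component_name>.scale<scale>", printing the scale with three
// significant digits as in the listing above.
std::string ScaledComponentName(const std::string &component_name,
                                float scale) {
  assert(!component_name.empty());
  std::ostringstream os;
  os << component_name << ".scale" << std::setprecision(3) << scale;
  return os.str();
}
```

With this formatting, ("foo", 1.25) gives "foo.scale1.25" and ("foo", 2.0) gives "foo.scale2". One consequence worth keeping in mind is that scales that agree to three significant digits, such as 1.111 and 1.112, produce the same name and would therefore map to the same cached component.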

◆ OptimizeNode()

bool OptimizeNode ( int32 node_index )

inline private

This function modifies the neural network in the case where 'node_index' is a component-input node whose component (in the node at 'node_index + 1') can be collapsed with the component that supplies its input, provided several other conditions also apply.

First, the descriptor in the node at 'node_index' has to have a certain limited structure, e.g.:

the input-descriptor is a component-node name like 'foo', or
the input-descriptor is a combination of Append and/or Offset expressions, like 'Append(Offset(foo, -3), foo, Offset(foo, 3))', referring to only a single node 'foo'.

ALSO the components need to be collapsible by the function CollapseComponents(), which will only be possible for certain pairs of component types (like, say, a dropout node preceding an affine or convolutional node); see that function for details.

This function will (if it does anything) modify the node at 'node_index + 1' to replace its component with a newly created component that combines the two components involved. It will also modify the node at 'node_index' by replacing its Descriptor with a modified input descriptor, so that if the input-descriptor of node 'foo' was 'bar', the descriptor for our node would now look like: 'Append(Offset(bar, -3), bar, Offset(bar, 3))'. Note that 'bar' itself doesn't have to be just a node-name; it can be a more general expression.

This function returns true if it changed something in the neural net, and false otherwise.

Definition at line 1638 of file nnet-utils.cc.

References NetworkNode::component_index, SvdApplier::ModifiedComponentInfo::component_index, NetworkNode::descriptor, Nnet::GetNode(), kaldi::nnet3::kComponent, kaldi::nnet3::kDescriptor, SvdApplier::nnet_, NetworkNode::node_type, Nnet::NumNodes(), and NetworkNode::u.

                                       {
     NetworkNode &descriptor_node = nnet_->GetNode(node_index);
     if (descriptor_node.node_type != kDescriptor ||
         node_index + 1 >= nnet_->NumNodes())
       return false;
     NetworkNode &component_node = nnet_->GetNode(node_index + 1);
     if (component_node.node_type != kComponent)
       return false;
     Descriptor &descriptor = descriptor_node.descriptor;
     int32 component_index = component_node.u.component_index;
 
     int32 input_node_index = DescriptorIsCollapsible(descriptor);
     if (input_node_index == -1)
       return false;  // do nothing, the expression in the Descriptor is too
                      // general for this code to handle.
     const NetworkNode &input_node = nnet_->GetNode(input_node_index);
     if (input_node.node_type != kComponent)
       return false;
     int32 input_component_index = input_node.u.component_index;
     int32 combined_component_index = CollapseComponents(input_component_index,
                                                         component_index);
     if (combined_component_index == -1)
       return false;  // these components were not of types that can be
                      // collapsed.
     component_node.u.component_index = combined_component_index;
 
     // 'input_descriptor_node' is the input descriptor of the component
     // that's the input to the node in "node_index".  (e.g. the component for
     // the node "foo" in our example above).
     const NetworkNode &input_descriptor_node = nnet_->GetNode(input_node_index - 1);
     const Descriptor &input_descriptor = input_descriptor_node.descriptor;
 
     // The next statement replaces the descriptor in the network node with one
     // in which the component 'input_component_index' has been replaced with its
     // input, thus bypassing the component in 'input_component_index'.
     // We'll later remove that component and its node from the network, if
     // needed by RemoveOrphanNodes() and RemoveOrphanComponents().
     descriptor = ReplaceNodeInDescriptor(descriptor,
                                          input_node_index,
                                          input_descriptor);
     return true;
   }
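
The descriptor rewrite described for this function ('foo' replaced by its own input 'bar') can be illustrated at the string level. The sketch below is not the library's method, which prints and re-parses Descriptor objects; it is a simplified token replacement on plain strings, and ReplaceNodeName is a hypothetical helper. It assumes node names consist of letters, digits, '_', '-' and '.'.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns 'descriptor' with every whole-token occurrence
// of 'node_name' replaced by the text 'expr'.  Tokens are maximal runs of
// letters, digits, '_', '-' and '.'.
std::string ReplaceNodeName(const std::string &descriptor,
                            const std::string &node_name,
                            const std::string &expr) {
  assert(!node_name.empty());
  auto is_name_char = [](char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 ||
           c == '_' || c == '-' || c == '.';
  };
  std::string out, token;
  for (std::size_t i = 0; i <= descriptor.size(); i++) {
    if (i < descriptor.size() && is_name_char(descriptor[i])) {
      token += descriptor[i];  // still inside a name-like token
      continue;
    }
    // A token (possibly empty) just ended: emit it or its replacement.
    if (token == node_name)
      out += expr;
    else
      out += token;
    token.clear();
    if (i < descriptor.size())
      out += descriptor[i];  // copy the delimiter unchanged
  }
  return out;
}
```

Applied to 'Append(Offset(foo, -3), foo, Offset(foo, 3))' with 'foo' replaced by 'bar', the sketch returns 'Append(Offset(bar, -3), bar, Offset(bar, 3))'. The replacement text may itself be an expression, and names that merely contain 'foo' (such as 'foo2') are left alone.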

◆ PreMultiplyAffineParameters()

static void PreMultiplyAffineParameters	(	const CuVectorBase< BaseFloat > &	offset,
		const CuVectorBase< BaseFloat > &	scale,
		CuVectorBase< BaseFloat > *	bias_params,
		CuMatrixBase< BaseFloat > *	linear_params
	)

inline static private

This helper function, used by GetDiagonallyPreModifiedComponentIndex, modifies the linear and bias parameters of an affine transform to capture the effect of preceding that affine transform by a diagonal affine transform with parameters 'offset' and 'scale'.

The dimension of 'offset' and 'scale' must be the same and must divide the input dim of the affine transform, i.e. must divide linear_params->NumCols().

Definition at line 2006 of file nnet-utils.cc.

References CuVectorBase< Real >::AddMatVec(), rnnlm::d, CuVectorBase< Real >::Dim(), KALDI_ASSERT, kaldi::kNoTrans, CuMatrixBase< Real >::MulColsVec(), CuMatrixBase< Real >::NumCols(), and CuMatrixBase< Real >::NumRows().

                                               {
     int32 input_dim = linear_params->NumCols(),
         transform_dim = offset.Dim();
     KALDI_ASSERT(bias_params->Dim() == linear_params->NumRows() &&
                  offset.Dim() == scale.Dim() &&
                  input_dim % transform_dim == 0);
     // we may have to repeat 'offset' and 'scale' several times.
     // 'full_offset' and 'full_scale' may be repeated versions of
     // 'offset' and 'scale' in case input_dim > transform_dim.
     CuVector<BaseFloat> full_offset(input_dim),
         full_scale(input_dim);
     for (int32 d = 0; d < input_dim; d += transform_dim) {
       full_offset.Range(d, transform_dim).CopyFromVec(offset);
       full_scale.Range(d, transform_dim).CopyFromVec(scale);
     }
 
     // Imagine the affine component does y = a x + b, and by applying
     // the pre-transform we are replacing x with s x + o
     // (s for scale and o for offset), so we have:
     //  y = a s x + (b + a o).
     // do: b += a o.
     bias_params->AddMatVec(1.0, *linear_params, kNoTrans, full_offset, 1.0);
     // do: a = a * s.
     linear_params->MulColsVec(full_scale);
 
 
   }
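
The folding step above can be checked numerically with a small standalone sketch. It does not use the Kaldi classes: `Vec`, `Mat`, `PreMultiplyAffineSketch` and `ApplyAffine` are hypothetical stand-ins built on `std::vector`, introduced here only to illustrate the identity y = a (s x + o) + b = (a s) x + (b + a o), including the tiling of 'offset' and 'scale' when the input dim is a multiple of their dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the Kaldi CUDA vector/matrix types.
using Vec = std::vector<float>;
using Mat = std::vector<Vec>;  // row-major: Mat[row][col]

// Folds the diagonal pre-transform x -> s*x + o into the affine transform
// y = A x + b, so that afterwards y = (A s) x + (b + A o).
// 'offset' and 'scale' are tiled when they are shorter than the input dim.
void PreMultiplyAffineSketch(const Vec &offset, const Vec &scale,
                             Vec *bias, Mat *linear) {
  assert(!linear->empty() && bias->size() == linear->size());
  const std::size_t input_dim = (*linear)[0].size();
  const std::size_t transform_dim = offset.size();
  assert(transform_dim > 0 && scale.size() == transform_dim &&
         input_dim % transform_dim == 0);
  for (std::size_t r = 0; r < linear->size(); ++r) {
    for (std::size_t c = 0; c < input_dim; ++c) {
      const std::size_t k = c % transform_dim;    // index into the tiled vectors
      (*bias)[r] += (*linear)[r][c] * offset[k];  // b += A o (uses the old A)
      (*linear)[r][c] *= scale[k];                // A = A * diag(s)
    }
  }
}

// Applies y = A x + b; used only to check the folded parameters.
Vec ApplyAffine(const Mat &linear, const Vec &bias, const Vec &x) {
  Vec y(bias);
  for (std::size_t r = 0; r < linear.size(); ++r)
    for (std::size_t c = 0; c < x.size(); ++c)
      y[r] += linear[r][c] * x[c];
  return y;
}
```

As in the original, the bias is updated from the unscaled linear parameters before they are multiplied by the scale; doing the two updates in the opposite order would give b + a s o instead of b + a o.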

◆ ReplaceNodeInDescriptor()

Descriptor ReplaceNodeInDescriptor	(	const Descriptor &	src,
		int32	node_to_replace,
		const Descriptor &	expr
	)

inline private

Definition at line 1572 of file nnet-utils.cc.

References kaldi::nnet3::DescriptorTokenize(), Nnet::GetNodeNames(), KALDI_ASSERT, ModelCollapser::nnet_, Descriptor::Parse(), and Descriptor::WriteConfig().

                                                              {
     // The way we replace it is at the textual level: we create a "fake" vector
     // of node-names where the printed form of 'expr' appears as the
     // node name in node_names[node_to_replace]; we print the descriptor
     // in 'src' using that faked node-names vector; and we parse it again
     // using the real node-names vector.
     std::vector<std::string> node_names = nnet_->GetNodeNames();
     std::ostringstream expr_os;
     expr.WriteConfig(expr_os, node_names);
     node_names[node_to_replace] = expr_os.str();
     std::ostringstream src_replaced_os;
     src.WriteConfig(src_replaced_os, node_names);
     std::vector<std::string> tokens;
     // now, in the example, src_replaced_os.str() would equal
     //  Append(Offset(Offset(bar, -1), -1), Offset(Offset(bar, -1), 1)).
     bool b = DescriptorTokenize(src_replaced_os.str(),
                                   &tokens);
     KALDI_ASSERT(b);
     // 'tokens' might now contain something like [ "Append", "(", "Offset", ..., ")" ].
     tokens.push_back("end of input");
     const std::string *next_token = &(tokens[0]);
     Descriptor ans;
     // parse using the un-modified node names.
     ans.Parse(nnet_->GetNodeNames(), &next_token);
     KALDI_ASSERT(*next_token == "end of input");
     // Note: normalization of expressions in Descriptors, such as conversion of
     // Offset(Offset(bar, -1), -1) to Offset(bar, -2), takes place inside the
     // Descriptor parsing code.
     return ans;
   }
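
The "print with a faked name table" trick can be illustrated without the Kaldi `Descriptor` class. The sketch below is a toy model, not the library's API: `Piece`, `ToyDescriptor`, `WriteToy` and `ReplaceNodeTextually` are hypothetical names, and a descriptor is reduced to a sequence of literal text and node references. It covers only the substitution step; the real function additionally tokenizes and re-parses the resulting string, which is where normalization happens.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, much-simplified descriptor: a sequence of pieces, each
// either literal text or a reference to a node by index.
struct Piece {
  std::string text;  // used when node_index < 0
  int node_index;    // >= 0 means "print the name of this node"
};
using ToyDescriptor = std::vector<Piece>;

// Prints the descriptor, looking node names up in 'node_names'.
std::string WriteToy(const ToyDescriptor &desc,
                     const std::vector<std::string> &node_names) {
  std::ostringstream os;
  for (const Piece &p : desc)
    os << (p.node_index >= 0 ? node_names[p.node_index] : p.text);
  return os.str();
}

// Substitutes 'expr' for every reference to 'node_to_replace' in 'src' by
// printing 'src' with a faked name table. 'node_names' is taken by value so
// the caller's real table is left untouched.
std::string ReplaceNodeTextually(const ToyDescriptor &src, int node_to_replace,
                                 const ToyDescriptor &expr,
                                 std::vector<std::string> node_names) {
  node_names[node_to_replace] = WriteToy(expr, node_names);
  return WriteToy(src, node_names);
}
```

With node names {"bar", "foo"}, a source descriptor that prints as Append(Offset(foo, -1), Offset(foo, 1)), and an expression that prints as Offset(bar, -1), replacing node 1 yields the string quoted in the comment in the listing above. The node name "foo" is an assumption inferred from that comment.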

◆ SumDescriptorIsCollapsible()

int32 SumDescriptorIsCollapsible ( const SumDescriptor & sum_desc )

inline private

Definition at line 1525 of file nnet-utils.cc.

References SimpleForwardingDescriptor::GetNodeDependencies(), OffsetForwardingDescriptor::Src(), and SimpleSumDescriptor::Src().

                                                                   {
     // I don't much like having to use dynamic_cast here.
     const SimpleSumDescriptor *ss = dynamic_cast<const SimpleSumDescriptor*>(
         &sum_desc);
     if (ss == NULL) return -1;
     const ForwardingDescriptor *fd = &(ss->Src());
     const OffsetForwardingDescriptor *od =
         dynamic_cast<const OffsetForwardingDescriptor*>(fd);
     if (od != NULL)
       fd = &(od->Src());
     const SimpleForwardingDescriptor *sd =
         dynamic_cast<const SimpleForwardingDescriptor*>(fd);
     if (sd == NULL) return -1;
     else {
       // the following is a rather roundabout way to get the node-index from a
       // SimpleForwardingDescriptor, but it works (it avoids adding other stuff
       // to the interface).
       std::vector<int32> v;
       sd->GetNodeDependencies(&v);
       int32 node_index = v[0];
       return node_index;
     }
   }
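
The cast chain above (optionally strip one offset wrapper, then require a simple forwarding descriptor) can be sketched in isolation. The class hierarchy below is hypothetical and only mirrors the shape of the real descriptor classes; `Forwarding`, `SimpleForwarding`, `OffsetForwarding`, `OtherForwarding` and `CollapsibleNodeIndex` are names made up for this example, and the toy class stores the node index directly rather than recovering it through GetNodeDependencies().

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal hierarchy mirroring the shape of the real classes.
struct Forwarding {
  virtual ~Forwarding() = default;
};
struct SimpleForwarding : Forwarding {
  explicit SimpleForwarding(int n) : node_index(n) {}
  int node_index;
};
struct OffsetForwarding : Forwarding {
  explicit OffsetForwarding(std::unique_ptr<Forwarding> s)
      : src(std::move(s)) {}
  std::unique_ptr<Forwarding> src;
};
struct OtherForwarding : Forwarding {};  // stands for any other kind

// Returns the node index if 'fd_in' is a simple descriptor, or a simple
// descriptor wrapped in exactly one offset; returns -1 otherwise.
int CollapsibleNodeIndex(const Forwarding &fd_in) {
  const Forwarding *fd = &fd_in;
  if (const auto *od = dynamic_cast<const OffsetForwarding*>(fd))
    fd = od->src.get();  // look through a single offset wrapper
  const auto *sd = dynamic_cast<const SimpleForwarding*>(fd);
  return sd != nullptr ? sd->node_index : -1;
}
```

Only one level of offset is unwrapped, so in this sketch an offset of an offset is reported as not collapsible, matching the control flow of the listing above.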

Member Data Documentation

◆ config_

const CollapseModelConfig& config_

private

Definition at line 2095 of file nnet-utils.cc.

◆ nnet_

Nnet* nnet_

private

Definition at line 2096 of file nnet-utils.cc.

The documentation for this class was generated from the following file:

nnet3/nnet-utils.cc
