ConvolutionalComponent implements convolution over a single axis (i.e. the frequency axis when it is the first component in the NN).

#include <nnet-convolutional-component.h>


Public Member Functions
	ConvolutionalComponent (int32 dim_in, int32 dim_out)

	~ConvolutionalComponent ()

Component *	Copy () const
	Copy component (deep copy).

ComponentType	GetType () const
	Get type identification of the component.

void	InitData (std::istream &is)
	Initialize the content of the component from the 'line' of the prototype.

void	ReadData (std::istream &is, bool binary)
	Reads the component content.

void	WriteData (std::ostream &os, bool binary) const
	Writes the component content.

int32	NumParams () const
	Number of trainable parameters.

void	GetGradient (VectorBase< BaseFloat > *gradient) const
	Get gradient reshaped as a vector.

void	GetParams (VectorBase< BaseFloat > *params) const
	Get the trainable parameters reshaped as a vector.

void	SetParams (const VectorBase< BaseFloat > &params)
	Set the trainable parameters from their representation reshaped as a vector.

std::string	Info () const
	Print some additional info (after <ComponentName> and the dims).

std::string	InfoGradient () const
	Print some additional info about gradient (after <...> and dims).

void	PropagateFnc (const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out)
	Forward pass transformation (to be implemented by descending class...)

void	ReverseIndexes (const std::vector< int32 > &forward_indexes, std::vector< std::vector< int32 > > *backward_indexes)

void	RearrangeIndexes (const std::vector< std::vector< int32 > > &in, std::vector< std::vector< int32 > > *out)

void	BackpropagateFnc (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrixBase< BaseFloat > *in_diff)
	Backward pass transformation (to be implemented by descending class...)

void	Update (const CuMatrixBase< BaseFloat > &input, const CuMatrixBase< BaseFloat > &diff)
	Compute gradient and update parameters.

Public Member Functions inherited from UpdatableComponent
	UpdatableComponent (int32 input_dim, int32 output_dim)

virtual	~UpdatableComponent ()

bool	IsUpdatable () const
	Check if the component contains trainable parameters.

virtual void	SetTrainOptions (const NnetTrainOptions &opts)
	Set the training options to the component.

const NnetTrainOptions &	GetTrainOptions () const
	Get the training options from the component.

virtual void	SetLearnRateCoef (BaseFloat val)
	Set the learn-rate coefficient.

virtual void	SetBiasLearnRateCoef (BaseFloat val)
	Set the learn-rate coefficient for bias.

Public Member Functions inherited from Component
	Component (int32 input_dim, int32 output_dim)
	Generic interface of a component.

virtual	~Component ()

virtual bool	IsMultistream () const
	Check if component has 'Recurrent' interface (trainable and recurrent).

int32	InputDim () const
	Get the dimension of the input.

int32	OutputDim () const
	Get the dimension of the output.

void	Propagate (const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out)
	Perform forward-pass propagation 'in' -> 'out'.

void	Backpropagate (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrix< BaseFloat > *in_diff)
	Perform backward-pass propagation 'out_diff' -> 'in_diff'.

void	Write (std::ostream &os, bool binary) const
	Write the component to a stream.

Private Attributes
int32	patch_dim_
	number of consecutive inputs, 1st dim of patch

int32	patch_step_
	step of the convolution (i.e. shift between 2 patches)

int32	patch_stride_
	shift for 2nd dim of a patch (i.e. frame length before splicing)

CuMatrix< BaseFloat >	filters_
	row = vectorized rectangular filter

CuVector< BaseFloat >	bias_
	bias for each filter

CuMatrix< BaseFloat >	filters_grad_
	gradient of filters

CuVector< BaseFloat >	bias_grad_
	gradient of biases

BaseFloat	max_norm_
	limit the L2 norm of a neuron's weights to a positive value

CuMatrix< BaseFloat >	vectorized_feature_patches_
	Buffer of reshaped inputs: 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames. Map of input features: std::vector-dim = patch-position.

std::vector< int32 >	column_map_

CuMatrix< BaseFloat >	feature_patch_diffs_
	Buffer for backpropagation: derivatives in the domain of 'vectorized_feature_patches_', 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames.

Additional Inherited Members
Public Types inherited from Component
enum	ComponentType { kUnknown = 0x0, kUpdatableComponent = 0x0100, kAffineTransform, kLinearTransform, kConvolutionalComponent, kLstmProjected, kBlstmProjected, kRecurrentComponent, kActivationFunction = 0x0200, kSoftmax, kHiddenSoftmax, kBlockSoftmax, kSigmoid, kTanh, kParametricRelu, kDropout, kLengthNormComponent, kTranform = 0x0400, kRbm, kSplice, kCopy, kTranspose, kBlockLinearity, kAddShift, kRescale, kKlHmm = 0x0800, kSentenceAveragingComponent, kSimpleSentenceAveragingComponent, kAveragePoolingComponent, kMaxPoolingComponent, kFramePoolingComponent, kParallelComponent, kMultiBasisComponent }
	Component type identification mechanism.

Static Public Member Functions inherited from Component
static const char *	TypeToMarker (ComponentType t)
	Converts component type to marker.

static ComponentType	MarkerToType (const std::string &s)
	Converts marker to component type (case insensitive).

static Component *	Init (const std::string &conf_line)
	Initialize component from a line in config file.

static Component *	Read (std::istream &is, bool binary)
	Read the component from a stream (static method).

Static Public Attributes inherited from Component
static const struct key_value	kMarkerMap []
	The table with pairs of Component types and markers (defined in nnet-component.cc).

Protected Attributes inherited from UpdatableComponent
NnetTrainOptions	opts_
	Option-class with training hyper-parameters.

BaseFloat	learn_rate_coef_
	Scalar applied to learning rate for weight matrices (to be used in ::Update method).

BaseFloat	bias_learn_rate_coef_
	Scalar applied to learning rate for bias (to be used in ::Update method).

Protected Attributes inherited from Component
int32	input_dim_
	Dimension of the input of the Component.

int32	output_dim_
	Dimension of the output of the Component.

Detailed Description

ConvolutionalComponent implements convolution over a single axis (i.e. the frequency axis when it is the first component in the NN). We do not convolve along the temporal axis, which simplifies the implementation (and was not helpful for Tara).

We assume the input features are spliced, i.e. each frame is in fact a set of stacked frames, from which we can form patches that span several frequency bands and the whole time axis.

The convolution is done over the whole axis with the same filters, i.e. we do not use separate filters for different 'regions' of the frequency axis.

To keep the implementation fast, the filters are represented in vectorized form: each rectangular filter corresponds to one row of a matrix that stores all the filters. The features are then reshaped into a set of matrices, where one matrix corresponds to a single patch position at which all the filters are applied.

The type of convolution is controlled by the hyperparameters:
patch_dim_ ... size of the patch along the frequency axis
patch_step_ ... size of the shift in the convolution
patch_stride_ ... shift for the 2nd dim of a patch (i.e. frame length before splicing)

Because the convolution uses the same weights repeatedly, the final gradient is the sum of all position-specific gradients (summing was found to work better than averaging).
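As a rough illustration of how the hyperparameters interact, the sketch below derives the remaining sizes (number of spliced frames, number of patch positions, filter dimension, number of filters) with the same arithmetic as the sanity checks in InitData() and ReadData(). The struct ConvDims and the function DeriveConvDims are hypothetical names, not part of Kaldi, and the code uses plain C++ types instead of Kaldi ones.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper (not part of Kaldi): sizes derived from the
// hyperparameters and the input/output dimensions of the component.
struct ConvDims {
  int32_t num_splice;   // number of stacked frames in one input vector
  int32_t num_patches;  // number of filter positions on the frequency axis
  int32_t filter_dim;   // number of columns of the filter matrix
  int32_t num_filters;  // number of rows of the filter matrix
};

inline ConvDims DeriveConvDims(int32_t input_dim, int32_t output_dim,
                               int32_t patch_dim, int32_t patch_step,
                               int32_t patch_stride) {
  if (patch_dim <= 0 || patch_step <= 0 || patch_stride < patch_dim)
    throw std::invalid_argument("bad patch hyperparameters");
  // The input must consist of whole spliced frames.
  if (input_dim % patch_stride != 0)
    throw std::invalid_argument("input_dim is not a multiple of patch_stride");
  // The patches must cover the frequency axis without a remainder.
  if ((patch_stride - patch_dim) % patch_step != 0)
    throw std::invalid_argument("patches do not tile the frequency axis");
  ConvDims d;
  d.num_splice = input_dim / patch_stride;
  d.num_patches = 1 + (patch_stride - patch_dim) / patch_step;
  d.filter_dim = d.num_splice * patch_dim;
  // The output stacks num_filters activations for every patch position.
  if (output_dim % d.num_patches != 0)
    throw std::invalid_argument("output_dim is not a multiple of num_patches");
  d.num_filters = output_dim / d.num_patches;
  return d;
}
```

For example, assuming 40 frequency bands spliced over 11 frames (input_dim = 440, patch_stride = 40) with patch_dim = 8, patch_step = 1 and output_dim = 4224, this yields 11 splices, 33 patch positions, a filter dimension of 88 and 128 filters.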

Definition at line 66 of file nnet-convolutional-component.h.

Constructor & Destructor Documentation

◆ ConvolutionalComponent()

ConvolutionalComponent	(	int32	dim_in,
		int32	dim_out
	)

inline

Definition at line 68 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::Copy().

   ConvolutionalComponent(int32 dim_in, int32 dim_out):
     UpdatableComponent(dim_in, dim_out),
     patch_dim_(0),
     patch_step_(0),
     patch_stride_(0),
     max_norm_(0.0)
   { }

◆ ~ConvolutionalComponent()

~ConvolutionalComponent ( )

inline

Definition at line 76 of file nnet-convolutional-component.h.

   ~ConvolutionalComponent() { }

Member Function Documentation

◆ BackpropagateFnc()

void BackpropagateFnc	(	const CuMatrixBase< BaseFloat > &	in,
		const CuMatrixBase< BaseFloat > &	out,
		const CuMatrixBase< BaseFloat > &	out_diff,
		CuMatrixBase< BaseFloat > *	in_diff
	)

inline virtual

Backward pass transformation (to be implemented by descending class...)

Implements Component.

Definition at line 368 of file nnet-convolutional-component.h.

References CuMatrixBase< Real >::AddCols(), CuMatrixBase< Real >::AddMatMat(), CuMatrixBase< Real >::ColRange(), ConvolutionalComponent::column_map_, ConvolutionalComponent::feature_patch_diffs_, ConvolutionalComponent::filters_, kaldi::kNoTrans, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, ConvolutionalComponent::RearrangeIndexes(), and ConvolutionalComponent::ReverseIndexes().

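   void BackpropagateFnc(const CuMatrixBase<BaseFloat> &in,
                         const CuMatrixBase<BaseFloat> &out,
                         const CuMatrixBase<BaseFloat> &out_diff,
                         CuMatrixBase<BaseFloat> *in_diff) {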
     // useful dims
     int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
     int32 num_filters = filters_.NumRows();
     int32 filter_dim = filters_.NumCols();
 
     // backpropagate to vector of matrices
     // (corresponding to position of a filter)
     for (int32 p = 0; p < num_patches; p++) {
       CuSubMatrix<BaseFloat> patch_diff(feature_patch_diffs_.ColRange(
                                         p * filter_dim, filter_dim));
       CuSubMatrix<BaseFloat> out_diff_patch(out_diff.ColRange(
                                             p * num_filters, num_filters));
       patch_diff.AddMatMat(1.0, out_diff_patch, kNoTrans,
                            filters_, kNoTrans, 0.0);
     }
 
     // sum the derivatives into in_diff, we will compensate #summands
     std::vector<std::vector<int32> > reversed_column_map;
     ReverseIndexes(column_map_, &reversed_column_map);
     std::vector<std::vector<int32> > rearranged_column_map;
     RearrangeIndexes(reversed_column_map, &rearranged_column_map);
     for (int32 p = 0; p < rearranged_column_map.size(); p++) {
       CuArray<int32> cu_cols(rearranged_column_map[p]);
       in_diff->AddCols(feature_patch_diffs_, cu_cols);
     }
   }
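The last step of the listing above sums the patch-domain derivatives back onto the input columns they were copied from. The sketch below computes the same sum for a single frame with a direct scatter-add on std::vector. This is a simpler substitute for the ReverseIndexes() / RearrangeIndexes() / AddCols() sequence, not the Kaldi implementation, and it assumes the input derivative starts at zero. ScatterAddPatchDiffs is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Kaldi): for one frame, add every
// patch-domain derivative onto the input column it was copied from.
//   patch_diffs[j] : derivative w.r.t. the j-th selected column
//   column_map[j]  : index of the input column that j was copied from
std::vector<float> ScatterAddPatchDiffs(const std::vector<float> &patch_diffs,
                                        const std::vector<int32_t> &column_map,
                                        int32_t input_dim) {
  if (patch_diffs.size() != column_map.size())
    throw std::invalid_argument("patch_diffs and column_map differ in size");
  std::vector<float> in_diff(input_dim, 0.0f);  // assumed to start at zero
  for (std::size_t j = 0; j < column_map.size(); j++) {
    const int32_t i = column_map[j];
    if (i < 0 || i >= input_dim)
      throw std::out_of_range("column_map entry outside the input");
    // Overlapping patches make several j map to the same i, hence the sum.
    in_diff[i] += patch_diffs[j];
  }
  return in_diff;
}
```

Input columns that belong to several overlapping patches receive the sum of all their patch-domain derivatives; columns covered by a single patch receive exactly one term.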

◆ Copy()

Component* Copy ( ) const

inline virtual

Copy component (deep copy).

Implements Component.

Definition at line 79 of file nnet-convolutional-component.h.

References ConvolutionalComponent::ConvolutionalComponent().

   Component* Copy() const { return new ConvolutionalComponent(*this); }


◆ GetGradient()

void GetGradient ( VectorBase< BaseFloat > * gradient ) const

inline virtual

Get gradient reshaped as a vector.

Implements UpdatableComponent.

Definition at line 221 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, VectorBase< Real >::Dim(), ConvolutionalComponent::filters_, KALDI_ASSERT, ConvolutionalComponent::NumParams(), and VectorBase< Real >::Range().

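   void GetGradient(VectorBase<BaseFloat>* gradient) const {
     // NOTE: this listing copies the parameters (filters_, bias_),
     // not the gradient buffers (filters_grad_, bias_grad_).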
     KALDI_ASSERT(gradient->Dim() == NumParams());
     int32 filters_num_elem = filters_.NumRows() * filters_.NumCols();
     gradient->Range(0, filters_num_elem).CopyRowsFromMat(filters_);
     gradient->Range(filters_num_elem, bias_.Dim()).CopyFromVec(bias_);
   }

◆ GetParams()

void GetParams ( VectorBase< BaseFloat > * params ) const

inline virtual

Get the trainable parameters reshaped as a vector.

Implements UpdatableComponent.

Definition at line 228 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, VectorBase< Real >::Dim(), ConvolutionalComponent::filters_, KALDI_ASSERT, ConvolutionalComponent::NumParams(), and VectorBase< Real >::Range().

   void GetParams(VectorBase<BaseFloat>* params) const {
     KALDI_ASSERT(params->Dim() == NumParams());
     int32 filters_num_elem = filters_.NumRows() * filters_.NumCols();
     params->Range(0, filters_num_elem).CopyRowsFromMat(filters_);
     params->Range(filters_num_elem, bias_.Dim()).CopyFromVec(bias_);
   }

◆ GetType()

ComponentType GetType ( ) const

inline virtual

Get type identification of the component.

Implements Component.

Definition at line 80 of file nnet-convolutional-component.h.

References Component::kConvolutionalComponent.

   ComponentType GetType() const { return kConvolutionalComponent; }


◆ Info()

std::string Info ( ) const

inline virtual

Print some additional info (after <ComponentName> and the dims).

Reimplemented from Component.

Definition at line 242 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, kaldi::nnet1::MomentStatistics(), and kaldi::nnet1::ToString().

   std::string Info() const {
     return std::string("\n  filters") + MomentStatistics(filters_) +
       ", lr-coef " + ToString(learn_rate_coef_) +
       ", max-norm " + ToString(max_norm_) +
       "\n  bias" + MomentStatistics(bias_) +
       ", lr-coef " + ToString(bias_learn_rate_coef_);
   }

◆ InfoGradient()

std::string InfoGradient ( ) const

inline virtual

Print some additional info about gradient (after <...> and dims).

Reimplemented from Component.

Definition at line 250 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_grad_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_grad_, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, kaldi::nnet1::MomentStatistics(), and kaldi::nnet1::ToString().

   std::string InfoGradient() const {
     return std::string("\n  filters_grad") + MomentStatistics(filters_grad_) +
       ", lr-coef " + ToString(learn_rate_coef_) +
       ", max-norm " + ToString(max_norm_) +
       "\n  bias_grad" + MomentStatistics(bias_grad_) +
       ", lr-coef " + ToString(bias_learn_rate_coef_);
   }

◆ InitData()

void InitData ( std::istream & is )

inline virtual

Initialize the content of the component from the 'line' of the prototype.

Implements UpdatableComponent.

Definition at line 82 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_, Component::input_dim_, KALDI_ASSERT, KALDI_ERR, KALDI_LOG, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, Component::output_dim_, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, kaldi::nnet1::RandGauss(), kaldi::nnet1::RandUniform(), kaldi::ReadBasicType(), and kaldi::ReadToken().

   void InitData(std::istream &is) {
     // define options
     BaseFloat bias_mean = -2.0, bias_range = 2.0, param_stddev = 0.1;
     // parse config
     std::string token;
     while (is >> std::ws, !is.eof()) {
       ReadToken(is, false, &token);
       if (token == "<ParamStddev>") ReadBasicType(is, false, &param_stddev);
       else if (token == "<BiasMean>")    ReadBasicType(is, false, &bias_mean);
       else if (token == "<BiasRange>")   ReadBasicType(is, false, &bias_range);
       else if (token == "<PatchDim>")    ReadBasicType(is, false, &patch_dim_);
       else if (token == "<PatchStep>")   ReadBasicType(is, false, &patch_step_);
       else if (token == "<PatchStride>") ReadBasicType(is, false, &patch_stride_);
       else if (token == "<LearnRateCoef>") ReadBasicType(is, false, &learn_rate_coef_);
       else if (token == "<BiasLearnRateCoef>") ReadBasicType(is, false, &bias_learn_rate_coef_);
       else if (token == "<MaxNorm>") ReadBasicType(is, false, &max_norm_);
       else KALDI_ERR << "Unknown token " << token << ", a typo in config?"
                      << " (ParamStddev|BiasMean|BiasRange|PatchDim|PatchStep|PatchStride)";
     }
 
     //
     // Sanity checks:
     //
     // splice (input are spliced frames):
     KALDI_ASSERT(input_dim_ % patch_stride_ == 0);
     int32 num_splice = input_dim_ / patch_stride_;
     KALDI_LOG << "num_splice " << num_splice;
     // number of patches:
     KALDI_ASSERT((patch_stride_ - patch_dim_) % patch_step_ == 0);
     int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
     KALDI_LOG << "num_patches " << num_patches;
     // filter dim:
     int32 filter_dim = num_splice * patch_dim_;
     KALDI_LOG << "filter_dim " << filter_dim;
     // num filters:
     KALDI_ASSERT(output_dim_ % num_patches == 0);
     int32 num_filters = output_dim_ / num_patches;
     KALDI_LOG << "num_filters " << num_filters;
     //
 
     //
     // Initialize trainable parameters,
     //
     // Gaussian with given std_dev (mean = 0),
     filters_.Resize(num_filters, filter_dim);
     RandGauss(0.0, param_stddev, &filters_);
     // Uniform,
     bias_.Resize(num_filters);
     RandUniform(bias_mean, bias_range, &bias_);
   }

◆ NumParams()

int32 NumParams ( ) const

inline virtual

Number of trainable parameters.

Implements UpdatableComponent.

Definition at line 217 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, and ConvolutionalComponent::filters_.

Referenced by ConvolutionalComponent::GetGradient(), ConvolutionalComponent::GetParams(), and ConvolutionalComponent::SetParams().

   int32 NumParams() const {
     return filters_.NumRows()*filters_.NumCols() + bias_.Dim();
   }

◆ PropagateFnc()

void PropagateFnc	(	const CuMatrixBase< BaseFloat > &	in,
		CuMatrixBase< BaseFloat > *	out
	)

inline virtual

Forward pass transformation (to be implemented by descending class...)

Implements Component.

Definition at line 258 of file nnet-convolutional-component.h.

   void PropagateFnc(const CuMatrixBase<BaseFloat> &in,
                     CuMatrixBase<BaseFloat> *out) {
     // useful dims
     int32 num_splice = input_dim_ / patch_stride_;
     int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
     int32 num_filters = filters_.NumRows();
     int32 num_frames = in.NumRows();
     int32 filter_dim = filters_.NumCols();
 
     // we will need the buffers
     if (vectorized_feature_patches_.NumRows() != num_frames) {
       vectorized_feature_patches_.Resize(num_frames, filter_dim * num_patches, kUndefined);
       feature_patch_diffs_.Resize(num_frames, filter_dim * num_patches, kSetZero);
     }
 
     /* Prepare feature patches, the layout is:
      * |----------|----------|----------|---------| (in = spliced frames)
      *   xxx        xxx        xxx        xxx       (x = selected elements)
      *
      *   xxx : patch dim
      *    xxx
      *   ^---: patch step
      * |----------| : patch stride
      *
      *   xxx-xxx-xxx-xxx : filter dim
      *
      */
     // build-up a column selection map:
     int32 index = 0;
     column_map_.resize(filter_dim * num_patches);
     for (int32 p = 0; p < num_patches; p++) {
       for (int32 s = 0; s < num_splice; s++) {
         for (int32 d = 0; d < patch_dim_; d++) {
           column_map_[index] = p * patch_step_ + s * patch_stride_ + d;
           index++;
         }
       }
     }
     // select the columns
     CuArray<int32> cu_column_map(column_map_);
     vectorized_feature_patches_.CopyCols(in, cu_column_map);
 
     // compute filter activations
     for (int32 p = 0; p < num_patches; p++) {
       CuSubMatrix<BaseFloat> tgt(out->ColRange(p * num_filters, num_filters));
       CuSubMatrix<BaseFloat> patch(vectorized_feature_patches_.ColRange(
                                    p * filter_dim, filter_dim));
       tgt.AddVecToRows(1.0, bias_, 0.0);  // add bias
       // apply all filters
       tgt.AddMatMat(1.0, patch, kNoTrans, filters_, kTrans, 1.0);
     }
   }
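To make the patch layout concrete, the following sketch rebuilds the column-selection map with the same triple loop as the listing above and applies the filters to a single frame on the CPU. It uses std::vector in place of the Kaldi matrix classes. BuildColumnMap and ForwardOneFrame are hypothetical names introduced for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Kaldi): same index arithmetic as the
// column-selection loop in PropagateFnc().
std::vector<int32_t> BuildColumnMap(int32_t num_patches, int32_t num_splice,
                                    int32_t patch_dim, int32_t patch_step,
                                    int32_t patch_stride) {
  std::vector<int32_t> column_map;
  for (int32_t p = 0; p < num_patches; p++)      // patch position
    for (int32_t s = 0; s < num_splice; s++)     // spliced frame
      for (int32_t d = 0; d < patch_dim; d++)    // band inside the patch
        column_map.push_back(p * patch_step + s * patch_stride + d);
  return column_map;
}

// Hypothetical helper: forward pass for one frame,
//   out[p * num_filters + f] = bias[f] + dot(filters[f], patch p of 'in').
std::vector<float> ForwardOneFrame(
    const std::vector<float> &in, const std::vector<int32_t> &column_map,
    const std::vector<std::vector<float> > &filters,
    const std::vector<float> &bias) {
  const std::size_t num_filters = filters.size();
  const std::size_t filter_dim = num_filters > 0 ? filters[0].size() : 0;
  const std::size_t num_patches =
      filter_dim > 0 ? column_map.size() / filter_dim : 0;
  std::vector<float> out(num_patches * num_filters, 0.0f);
  for (std::size_t p = 0; p < num_patches; p++) {
    for (std::size_t f = 0; f < num_filters; f++) {
      float acc = bias[f];
      for (std::size_t k = 0; k < filter_dim; k++)
        acc += filters[f][k] * in[column_map[p * filter_dim + k]];
      out[p * num_filters + f] = acc;
    }
  }
  return out;
}
```

With patch_dim = 2, patch_step = 1, patch_stride = 3 and two spliced frames, the map is 0 1 3 4 for the first patch position and 1 2 4 5 for the second, so neighbouring patches share the columns 1 and 4.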

◆ ReadData()

void ReadData	(	std::istream &	is,
		bool	binary
	)

inline virtual

Reads the component content.

Reimplemented from Component.

Definition at line 133 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, kaldi::ExpectToken(), ConvolutionalComponent::filters_, Component::input_dim_, KALDI_ASSERT, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, Component::output_dim_, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, kaldi::PeekToken(), and kaldi::ReadBasicType().

   void ReadData(std::istream &is, bool binary) {
     // convolution hyperparameters,
     ExpectToken(is, binary, "<PatchDim>");
     ReadBasicType(is, binary, &patch_dim_);
     ExpectToken(is, binary, "<PatchStep>");
     ReadBasicType(is, binary, &patch_step_);
     ExpectToken(is, binary, "<PatchStride>");
     ReadBasicType(is, binary, &patch_stride_);
 
     // variable-length list of parameters,
     bool end_loop = false;
     while (!end_loop) {
       int first_char = PeekToken(is, binary);
       switch (first_char) {
         case 'L': ExpectToken(is, binary, "<LearnRateCoef>");
           ReadBasicType(is, binary, &learn_rate_coef_);
           break;
         case 'B': ExpectToken(is, binary, "<BiasLearnRateCoef>");
           ReadBasicType(is, binary, &bias_learn_rate_coef_);
           break;
         case 'M': ExpectToken(is, binary, "<MaxNorm>");
           ReadBasicType(is, binary, &max_norm_);
           break;
         case '!': ExpectToken(is, binary, "<!EndOfComponent>");
           // intentional fall-through: the end marker also ends the loop
         default: end_loop = true;
       }
     }
 
     // trainable parameters
     ExpectToken(is, binary, "<Filters>");
     filters_.Read(is, binary);
     ExpectToken(is, binary, "<Bias>");
     bias_.Read(is, binary);
 
     //
     // Sanity checks:
     //
     // splice (input are spliced frames):
     KALDI_ASSERT(input_dim_ % patch_stride_ == 0);
     int32 num_splice = input_dim_ / patch_stride_;
     // number of patches:
     KALDI_ASSERT((patch_stride_ - patch_dim_) % patch_step_ == 0);
     int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
     // filter dim:
     int32 filter_dim = num_splice * patch_dim_;
     // num filters:
     KALDI_ASSERT(output_dim_ % num_patches == 0);
     int32 num_filters = output_dim_ / num_patches;
     // check parameter dims:
     KALDI_ASSERT(num_filters == filters_.NumRows());
     KALDI_ASSERT(num_filters == bias_.Dim());
     KALDI_ASSERT(filter_dim == filters_.NumCols());
     //
   }

◆ RearrangeIndexes()

void RearrangeIndexes	(	const std::vector< std::vector< int32 > > &	in,
		std::vector< std::vector< int32 > > *	out
	)

inline

Definition at line 351 of file nnet-convolutional-component.h.

References rnnlm::i, and rnnlm::j.

Referenced by ConvolutionalComponent::BackpropagateFnc().

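   void RearrangeIndexes(const std::vector<std::vector<int32> > &in,
                         std::vector<std::vector<int32> > *out) {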
     int32 D = in.size();
     int32 L = 0;
     for (int32 i = 0; i < D; i++)
       if (in[i].size() > L)
         L = in[i].size();
     out->resize(L);
     for (int32 i = 0; i < L; i++)
       (*out)[i].resize(D, -1);
     for (int32 i = 0; i < D; i++) {
       for (int32 j = 0; j < in[i].size(); j++) {
         (*out)[j][i] = in[i][j];
       }
     }
   }
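The listing above transposes a ragged list of index lists into dense rows padded with -1. A standalone sketch of the same transformation on standard containers follows. RearrangeIndexesSketch is a hypothetical name, and returning the result by value is a choice made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Kaldi): turns D ragged lists into L dense
// vectors of length D, where L is the length of the longest list. Slot
// out[j][i] holds the j-th entry of in[i], or -1 if in[i] is shorter.
std::vector<std::vector<int32_t> > RearrangeIndexesSketch(
    const std::vector<std::vector<int32_t> > &in) {
  std::size_t longest = 0;
  for (const std::vector<int32_t> &list : in)
    longest = std::max(longest, list.size());
  std::vector<std::vector<int32_t> > out(
      longest, std::vector<int32_t>(in.size(), -1));
  for (std::size_t i = 0; i < in.size(); i++)
    for (std::size_t j = 0; j < in[i].size(); j++)
      out[j][i] = in[i][j];
  return out;
}
```

Each output row then has one slot per input column, which matches how BackpropagateFnc() passes one row at a time to AddCols().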

◆ ReverseIndexes()

void ReverseIndexes	(	const std::vector< int32 > &	forward_indexes,
		std::vector< std::vector< int32 > > *	backward_indexes
	)

inline

Definition at line 323 of file nnet-convolutional-component.h.

References rnnlm::i, Component::input_dim_, rnnlm::j, and KALDI_ASSERT.

Referenced by ConvolutionalComponent::BackpropagateFnc().

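   void ReverseIndexes(const std::vector<int32> &forward_indexes,
                       std::vector<std::vector<int32> > *backward_indexes) {
     int32 i;
     int32 size = forward_indexes.size();
     backward_indexes->resize(input_dim_);
     int32 reserve_size = 2 + forward_indexes.size() / input_dim_;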
     std::vector<std::vector<int32> >::iterator iter = backward_indexes->begin(),
       end = backward_indexes->end();
     for (; iter != end; ++iter)
       iter->reserve(reserve_size);
     for (int32 j = 0; j < size; j++) {
       i = forward_indexes[j];
       KALDI_ASSERT(i < input_dim_);
       (*backward_indexes)[i].push_back(j);
     }
   }
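The listing above inverts the forward column map: for every input column it collects the positions in the patch buffer that were copied from it. A minimal standalone sketch follows. ReverseIndexesSketch is a hypothetical name, and input_dim is passed as an argument here because the sketch has no class member to read it from.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Kaldi): backward[i] lists every position j
// with forward_indexes[j] == i, in increasing order of j.
std::vector<std::vector<int32_t> > ReverseIndexesSketch(
    const std::vector<int32_t> &forward_indexes, int32_t input_dim) {
  std::vector<std::vector<int32_t> > backward(input_dim);
  for (std::size_t j = 0; j < forward_indexes.size(); j++) {
    const int32_t i = forward_indexes[j];
    if (i < 0 || i >= input_dim)
      throw std::out_of_range("forward index outside the input");
    backward[i].push_back(static_cast<int32_t>(j));
  }
  return backward;
}
```

For the forward map 0 1 3 4 1 2 4 5 over six input columns, columns 1 and 4 each get two positions, because they belong to two overlapping patches.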

◆ SetParams()

void SetParams ( const VectorBase< BaseFloat > & params )

inline virtual

Set the trainable parameters from their representation reshaped as a vector.

Implements UpdatableComponent.

Definition at line 235 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, VectorBase< Real >::Dim(), ConvolutionalComponent::filters_, KALDI_ASSERT, ConvolutionalComponent::NumParams(), and VectorBase< Real >::Range().

   void SetParams(const VectorBase<BaseFloat>& params) {
     KALDI_ASSERT(params.Dim() == NumParams());
     int32 filters_num_elem = filters_.NumRows() * filters_.NumCols();
     filters_.CopyRowsFromVec(params.Range(0, filters_num_elem));
     bias_.CopyFromVec(params.Range(filters_num_elem, bias_.Dim()));
   }
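GetParams() and SetParams() use one flat layout: all filter rows one after another, followed by the bias. The sketch below shows that layout as a round trip on standard containers. FlattenParams and UnflattenParams are hypothetical names, not Kaldi functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Kaldi): filter rows first, then the bias.
std::vector<float> FlattenParams(
    const std::vector<std::vector<float> > &filters,
    const std::vector<float> &bias) {
  std::vector<float> params;
  for (const std::vector<float> &row : filters)
    params.insert(params.end(), row.begin(), row.end());
  params.insert(params.end(), bias.begin(), bias.end());
  return params;
}

// Hypothetical helper: inverse of FlattenParams. The shapes of 'filters' and
// 'bias' are kept and only their values are overwritten.
void UnflattenParams(const std::vector<float> &params,
                     std::vector<std::vector<float> > *filters,
                     std::vector<float> *bias) {
  std::size_t expected = bias->size();
  for (const std::vector<float> &row : *filters) expected += row.size();
  if (params.size() != expected)
    throw std::invalid_argument("params has the wrong number of elements");
  std::size_t offset = 0;
  for (std::vector<float> &row : *filters) {
    std::copy(params.begin() + offset, params.begin() + offset + row.size(),
              row.begin());
    offset += row.size();
  }
  std::copy(params.begin() + offset, params.end(), bias->begin());
}
```

The size check plays the role of the KALDI_ASSERT on NumParams() in the listings above.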

◆ Update()

void Update	(	const CuMatrixBase< BaseFloat > &	input,
		const CuMatrixBase< BaseFloat > &	diff
	)

inline virtual

Compute gradient and update parameters.

Implements UpdatableComponent.

Definition at line 400 of file nnet-convolutional-component.h.

   void Update(const CuMatrixBase<BaseFloat> &input,
               const CuMatrixBase<BaseFloat> &diff) {
     // useful dims
     int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
     int32 num_filters = filters_.NumRows();
     int32 filter_dim = filters_.NumCols();
 
     // we use following hyperparameters from the option class
     const BaseFloat lr = opts_.learn_rate;
 
     //
     // calculate the gradient
     //
     filters_grad_.Resize(num_filters, filter_dim, kSetZero);  // reset
     bias_grad_.Resize(num_filters, kSetZero);  // reset
     // use all the patches
     for (int32 p = 0; p < num_patches; p++) {  // sum
       CuSubMatrix<BaseFloat> diff_patch(diff.ColRange(p * num_filters,
                                                       num_filters));
       CuSubMatrix<BaseFloat> patch(vectorized_feature_patches_.ColRange(
                                    p * filter_dim, filter_dim));
       filters_grad_.AddMatMat(1.0, diff_patch, kTrans, patch, kNoTrans, 1.0);
       bias_grad_.AddRowSumMat(1.0, diff_patch, 1.0);
     }
 
     //
     // update
     //
     filters_.AddMat(-lr*learn_rate_coef_, filters_grad_);
     bias_.AddVec(-lr*bias_learn_rate_coef_, bias_grad_);
     //
 
     // max-norm
     if (max_norm_ > 0.0) {
       CuMatrix<BaseFloat> lin_sqr(filters_);
       lin_sqr.MulElements(filters_);
       CuVector<BaseFloat> l2(filters_.NumRows());
       l2.AddColSumMat(1.0, lin_sqr, 0.0);
       l2.ApplyPow(0.5);  // we have per-neuron L2 norms
       CuVector<BaseFloat> scl(l2);
       scl.Scale(1.0/max_norm_);
       scl.ApplyFloor(1.0);
       scl.InvertElements();
       filters_.MulRowsVec(scl);  // shrink to sphere!
     }
   }
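The max-norm step at the end of the listing rescales every filter row whose L2 norm exceeds max_norm_ so that its norm becomes max_norm_, and leaves shorter rows unchanged. The sketch below reproduces that rule on std::vector. ApplyMaxNorm is a hypothetical name, and the per-row loop replaces the vectorized Kaldi calls.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of Kaldi): per-row max-norm constraint.
// A row with L2 norm above max_norm is scaled down onto the sphere of radius
// max_norm. A non-positive max_norm disables the constraint.
void ApplyMaxNorm(std::vector<std::vector<float> > *filters, float max_norm) {
  if (max_norm <= 0.0f) return;
  for (std::vector<float> &row : *filters) {
    float sum_sq = 0.0f;
    for (float w : row) sum_sq += w * w;
    const float l2 = std::sqrt(sum_sq);
    // Same steps as the listing: scale by 1/max_norm, floor at 1, invert.
    const float scale = std::max(l2 / max_norm, 1.0f);
    const float inv = 1.0f / scale;
    for (float &w : row) w *= inv;
  }
}
```

With max_norm = 1, a row (3, 4) with norm 5 is scaled to about (0.6, 0.8), while a row (0.3, 0.4) with norm 0.5 stays as it is.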

◆ WriteData()

void WriteData	(	std::ostream &	os,
		bool	binary
	)		const

inline virtual

Writes the component content.

Reimplemented from Component.

Definition at line 188 of file nnet-convolutional-component.h.

References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, kaldi::WriteBasicType(), and kaldi::WriteToken().

   void WriteData(std::ostream &os, bool binary) const {
     // convolution hyperparameters
     WriteToken(os, binary, "<PatchDim>");
     WriteBasicType(os, binary, patch_dim_);
     WriteToken(os, binary, "<PatchStep>");
     WriteBasicType(os, binary, patch_step_);
     WriteToken(os, binary, "<PatchStride>");
     WriteBasicType(os, binary, patch_stride_);
     if (!binary) os << "\n";
 
     // re-scale learn rate
     WriteToken(os, binary, "<LearnRateCoef>");
     WriteBasicType(os, binary, learn_rate_coef_);
     WriteToken(os, binary, "<BiasLearnRateCoef>");
     WriteBasicType(os, binary, bias_learn_rate_coef_);
     // max-norm regularization
     WriteToken(os, binary, "<MaxNorm>");
     WriteBasicType(os, binary, max_norm_);
     if (!binary) os << "\n";
 
     // trainable parameters
     WriteToken(os, binary, "<Filters>");
     if (!binary) os << "\n";
     filters_.Write(os, binary);
     WriteToken(os, binary, "<Bias>");
     if (!binary) os << "\n";
     bias_.Write(os, binary);
   }

Member Data Documentation

◆ bias_

CuVector<BaseFloat> bias_

private

bias for each filter

Definition at line 455 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::GetGradient(), ConvolutionalComponent::GetParams(), ConvolutionalComponent::Info(), ConvolutionalComponent::InitData(), ConvolutionalComponent::NumParams(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::SetParams(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().

◆ bias_grad_

CuVector<BaseFloat> bias_grad_

private

gradient of biases

Definition at line 458 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::InfoGradient(), and ConvolutionalComponent::Update().

◆ column_map_

std::vector<int32> column_map_

private

Definition at line 469 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::BackpropagateFnc(), and ConvolutionalComponent::PropagateFnc().

◆ feature_patch_diffs_

CuMatrix<BaseFloat> feature_patch_diffs_

private

Buffer for backpropagation: derivatives in the domain of 'vectorized_feature_patches_', 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames.

Definition at line 476 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::BackpropagateFnc(), and ConvolutionalComponent::PropagateFnc().

◆ filters_

CuMatrix<BaseFloat> filters_

private

row = vectorized rectangular filter

Definition at line 454 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::GetGradient(), ConvolutionalComponent::GetParams(), ConvolutionalComponent::Info(), ConvolutionalComponent::InitData(), ConvolutionalComponent::NumParams(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::SetParams(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().

◆ filters_grad_

CuMatrix<BaseFloat> filters_grad_

private

gradient of filters

Definition at line 457 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::InfoGradient(), and ConvolutionalComponent::Update().

◆ max_norm_

BaseFloat max_norm_

private

limit the L2 norm of a neuron's weights to a positive value

Definition at line 460 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::Info(), ConvolutionalComponent::InfoGradient(), ConvolutionalComponent::InitData(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().

◆ patch_dim_

int32 patch_dim_

private

number of consecutive inputs, 1st dim of patch

Definition at line 448 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::InitData(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().

◆ patch_step_

int32 patch_step_

private

step of the convolution (i.e. shift between 2 patches)

Definition at line 448 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::InitData(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().

◆ patch_stride_

int32 patch_stride_

private

shift for 2nd dim of a patch (i.e. frame length before splicing)

Definition at line 448 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::InitData(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().

◆ vectorized_feature_patches_

CuMatrix<BaseFloat> vectorized_feature_patches_

private

Buffer of reshaped inputs: 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames. Map of input features: std::vector-dim = patch-position.

Definition at line 468 of file nnet-convolutional-component.h.

Referenced by ConvolutionalComponent::PropagateFnc(), and ConvolutionalComponent::Update().

The documentation for this class was generated from the following file:

nnet/nnet-convolutional-component.h
