Convolutional1dComponent implements convolution over the frequency axis.

#include <nnet-component.h>


Public Member Functions
	Convolutional1dComponent ()

	Convolutional1dComponent (const Convolutional1dComponent &component)

	Convolutional1dComponent (const CuMatrixBase< BaseFloat > &filter_params, const CuVectorBase< BaseFloat > &bias_params, BaseFloat learning_rate)

int32	InputDim () const
	Get size of input vectors.

int32	OutputDim () const
	Get size of output vectors.

void	Init (BaseFloat learning_rate, int32 input_dim, int32 output_dim, int32 patch_dim, int32 patch_step, int32 patch_stride, BaseFloat param_stddev, BaseFloat bias_stddev, bool appended_conv)

void	Init (BaseFloat learning_rate, int32 patch_dim, int32 patch_step, int32 patch_stride, std::string matrix_filename, bool appended_conv)

void	Resize (int32 input_dim, int32 output_dim)

std::string	Info () const

void	InitFromString (std::string args)
	Initialize, typically from a line of a config file.

std::string	Type () const

bool	BackpropNeedsInput () const

bool	BackpropNeedsOutput () const

void	Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
	Perform forward pass propagation Input->Output.

void	Scale (BaseFloat scale)
	This new virtual function scales the parameters by this amount.

virtual void	Add (BaseFloat alpha, const UpdatableComponent &other)
	This new virtual function adds the parameters of another updatable component, times some constant, to the current parameters.

virtual void	Backprop (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, Component *to_update_in, CuMatrix< BaseFloat > *in_deriv) const
	Perform backward pass propagation of the derivative, and also either update the model (if to_update == this) or update another model or compute the model derivative (otherwise).

void	SetZero (bool treat_as_gradient)
	Set parameters to zero, and if treat_as_gradient is true, we'll be treating this as a gradient so set the learning rate to 1 and make any other changes necessary (there's a variable we have to set for the MixtureProbComponent).

void	Read (std::istream &is, bool binary)

void	Write (std::ostream &os, bool binary) const
	Write component to stream.

virtual BaseFloat	DotProduct (const UpdatableComponent &other) const
	Here, "other" is a component of the same specific type.

Component *	Copy () const
	Copy component (deep copy).

void	PerturbParams (BaseFloat stddev)
	We introduce a new virtual function that only applies to class UpdatableComponent.

void	SetParams (const VectorBase< BaseFloat > &bias, const MatrixBase< BaseFloat > &filter)

const CuVector< BaseFloat > &	BiasParams ()

const CuMatrix< BaseFloat > &	LinearParams ()

int32	GetParameterDim () const
	The following new virtual function returns the total dimension of the parameters in this class.

void	Update (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)

Public Member Functions inherited from UpdatableComponent
	UpdatableComponent (const UpdatableComponent &other)

void	Init (BaseFloat learning_rate)

	UpdatableComponent (BaseFloat learning_rate)

	UpdatableComponent ()

virtual	~UpdatableComponent ()

void	SetLearningRate (BaseFloat lrate)
	Sets the learning rate of gradient descent.

BaseFloat	LearningRate () const
	Gets the learning rate of gradient descent.

virtual void	Vectorize (VectorBase< BaseFloat > *params) const
	Turns the parameters into vector form.

virtual void	UnVectorize (const VectorBase< BaseFloat > &params)
	Converts the parameters from vector form.

Public Member Functions inherited from Component
	Component ()

virtual int32	Index () const
	Returns the index in the sequence of layers in the neural net; intended only to be used in debugging information.

virtual void	SetIndex (int32 index)

virtual std::vector< int32 >	Context () const
	Return a vector describing the temporal context this component requires for each frame of output, as a sorted list.

void	Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out) const
	A non-virtual propagate function that first resizes output if necessary.

virtual	~Component ()

Private Member Functions
const Convolutional1dComponent &	operator= (const Convolutional1dComponent &other)

Static Private Member Functions
static void	ReverseIndexes (const std::vector< int32 > &forward_indexes, int32 input_dim, std::vector< std::vector< int32 > > *backward_indexes)

static void	RearrangeIndexes (const std::vector< std::vector< int32 > > &in, std::vector< std::vector< int32 > > *out)

Private Attributes
int32	patch_dim_

int32	patch_step_

int32	patch_stride_

CuMatrix< BaseFloat >	filter_params_

CuVector< BaseFloat >	bias_params_

bool	appended_conv_

bool	is_gradient_

Additional Inherited Members
Static Public Member Functions inherited from Component
static Component *	ReadNew (std::istream &is, bool binary)
	Read component from stream.

static Component *	NewFromString (const std::string &initializer_line)
	Initialize the Component from one line that will contain first the type.

static Component *	NewComponentOfType (const std::string &type)
	Return a new Component of the given type.

Protected Attributes inherited from UpdatableComponent
BaseFloat	learning_rate_
	learning rate (0.0..0.01)

Detailed Description

Convolutional1dComponent implements convolution over the frequency axis.

We assume the input features are spliced, i.e. each frame is in fact a set of stacked frames, in which we can form patches that span several frequency bands and the whole time axis. A patch is an instance of a filter applied to a group of frequency bands and the whole time axis. Shifting the filter generates the patches.

The convolution is done over the whole axis with the same filter coefficients, i.e. we don't use separate filters for different 'regions' of the frequency axis. Because the convolution uses the same weights repeatedly, the final gradient is a sum of all position-specific gradients (the sum was found to work better than averaging).

To keep the implementation fast, the filters are stored in vectorized form: each rectangular filter corresponds to one row of a matrix that holds all the filters. The features are then reshaped into a set of matrices, one per patch position, to which all the filters are applied.

The type of convolution is controlled by hyperparameters:
patch_dim_ ... frequency axis size of the patch
patch_step_ ... size of shift in the convolution
patch_stride_ ... shift for 2nd dim of a patch (i.e. frame length before splicing)

For instance, for a convolutional component after raw input, if the input is a 36-dim fbank feature with deltas of order 2, spliced using +/- 5 frames of context, the convolutional component treats the input as a 36 x 33 image (11 spliced frames x 3 streams: static, delta and delta-delta). patch_stride_ should then be set to 36. If patch_step_ and patch_dim_ are set to 1 and 7, the Convolutional1dComponent creates a 2D filter of 7 x 33, so that the convolution is actually done only along the frequency axis. Specifically, the convolutional output along the frequency axis is (36 - 7) / 1 + 1 = 30, and the convolutional output along the temporal axis is 33 - 33 + 1 = 1, resulting in an output image of 30 x 1, which is called a feature map in ConvNet terminology. If the output-dim is then set to 3840, the initialization code infers that there should be 3840 / 30 = 128 distinct filters, which create 128 feature maps of 30 x 1 for one frame of input. The feature maps are vectorized as a 3840-dim row vector in the output matrix of this component. For details on propagation in Convolutional1dComponent, check the function definition.
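The arithmetic in this example can be checked with a short standalone sketch. It does not depend on Kaldi: the struct `ConvDims` and the function `ComputeConvDims` are hypothetical names introduced here for illustration, and the formulas are the ones used by Init() and Info() in this class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Kaldi): sizes derived from the
// hyperparameters, using the same formulas as Init() and Info().
struct ConvDims {
  int32_t num_splice;   // number of stacked frames (temporal extent)
  int32_t filter_dim;   // number of columns of the filter matrix
  int32_t num_patches;  // filter positions along the frequency axis
  int32_t num_filters;  // number of rows of the filter matrix
};

ConvDims ComputeConvDims(int32_t input_dim, int32_t output_dim,
                         int32_t patch_dim, int32_t patch_step,
                         int32_t patch_stride) {
  // Same consistency checks as the KALDI_ASSERT calls in Init().
  assert(input_dim % patch_stride == 0);
  assert((patch_stride - patch_dim) % patch_step == 0);
  ConvDims d;
  d.num_splice = input_dim / patch_stride;
  d.filter_dim = d.num_splice * patch_dim;
  d.num_patches = 1 + (patch_stride - patch_dim) / patch_step;
  assert(output_dim % d.num_patches == 0);
  d.num_filters = output_dim / d.num_patches;
  return d;
}
```

Called as `ComputeConvDims(36 * 33, 3840, 7, 1, 36)`, this reproduces the numbers of the example: 33 spliced frames, a filter of 7 x 33 = 231 values, 30 patch positions and 128 filters.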

Definition at line 1718 of file nnet-component.h.

Constructor & Destructor Documentation

◆ Convolutional1dComponent() [1/3]

Convolutional1dComponent ( )

Definition at line 3630 of file nnet-component.cc.

Referenced by Convolutional1dComponent::Copy().

                                                   :
     UpdatableComponent(),
     patch_dim_(0), patch_step_(0), patch_stride_(0),
     appended_conv_(false), is_gradient_(false) {}

◆ Convolutional1dComponent() [2/3]

Convolutional1dComponent ( const Convolutional1dComponent & component )

Definition at line 3635 of file nnet-component.cc.

                                                                                            :
     UpdatableComponent(component),
     filter_params_(component.filter_params_),
     bias_params_(component.bias_params_),
     appended_conv_(component.appended_conv_),
     is_gradient_(component.is_gradient_) {}

◆ Convolutional1dComponent() [3/3]

Convolutional1dComponent	(	const CuMatrixBase< BaseFloat > &	filter_params,
		const CuVectorBase< BaseFloat > &	bias_params,
		BaseFloat	learning_rate
	)

Definition at line 3642 of file nnet-component.cc.

References Convolutional1dComponent::appended_conv_, CuVectorBase< Real >::Dim(), Convolutional1dComponent::is_gradient_, KALDI_ASSERT, and CuMatrixBase< Real >::NumRows().

                                                                            :
     UpdatableComponent(learning_rate),
     filter_params_(filter_params),
     bias_params_(bias_params) {
   KALDI_ASSERT(filter_params.NumRows() == bias_params.Dim() &&
                bias_params.Dim() != 0);
   appended_conv_ = false;
   is_gradient_ = false;
 }

Member Function Documentation

◆ Add()

void Add	(	BaseFloat	alpha,
		const UpdatableComponent &	other
	)

virtual

This new virtual function adds the parameters of another updatable component, times some constant, to the current parameters.

Implements UpdatableComponent.

Definition at line 3900 of file nnet-component.cc.

References Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, and KALDI_ASSERT.

                                                                                       {
   const Convolutional1dComponent *other =
       dynamic_cast<const Convolutional1dComponent*>(&other_in);
   KALDI_ASSERT(other != NULL);
   filter_params_.AddMat(alpha, other->filter_params_);
   bias_params_.AddVec(alpha, other->bias_params_);
 }

◆ Backprop()

void Backprop	(	const ChunkInfo &	in_info,
		const ChunkInfo &	out_info,
		const CuMatrixBase< BaseFloat > &	in_value,
		const CuMatrixBase< BaseFloat > &	out_value,
		const CuMatrixBase< BaseFloat > &	out_deriv,
		Component *	to_update,
		CuMatrix< BaseFloat > *	in_deriv
	)		const

virtual

Perform backward pass propagation of the derivative, and also either update the model (if to_update == this) or update another model or compute the model derivative (otherwise).

Note: in_value and out_value are the values of the input and output of the component, and these may be dummy variables if respectively BackpropNeedsInput() or BackpropNeedsOutput() return false for that component (not all components need these).

num_chunks lets us treat the input matrix as contiguous-in-time chunks of equal size; it only matters if splicing is involved.

Buffer for backpropagation: derivatives in the domain of 'patches_'; 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames.

Implements Component.

Definition at line 3966 of file nnet-component.cc.

                                                                              {
   in_deriv->Resize(out_deriv.NumRows(), InputDim());
   Convolutional1dComponent *to_update = dynamic_cast<Convolutional1dComponent*>(to_update_in);
   int32 num_splice = InputDim() / patch_stride_;
   int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
   int32 num_filters = filter_params_.NumRows();
   int32 num_frames = out_deriv.NumRows();
   int32 filter_dim = filter_params_.NumCols();
 
   CuMatrix<BaseFloat> patches_deriv(num_frames, filter_dim * num_patches, kSetZero);
 
   //
   // backpropagate to vector of matrices
   // (corresponding to position of a filter)
   //
   std::vector<CuSubMatrix<BaseFloat>* > patch_deriv_batch, out_deriv_batch,
       filter_params_batch;
 
   CuSubMatrix<BaseFloat>* filter_params_elem = new CuSubMatrix<BaseFloat>(
       filter_params_, 0, filter_params_.NumRows(), 0, filter_params_.NumCols());
 
   // form batch in vector container
   for (int32 p = 0; p < num_patches; p++) {
     // form batch in vector container. for filter_params_batch, all elements
     // point to the same copy filter_params_elem
     patch_deriv_batch.push_back(new CuSubMatrix<BaseFloat>(patches_deriv.ColRange(
         p * filter_dim, filter_dim)));
     out_deriv_batch.push_back(new CuSubMatrix<BaseFloat>(out_deriv.ColRange(
         p * num_filters, num_filters)));
     filter_params_batch.push_back(filter_params_elem);
   }
   AddMatMatBatched<BaseFloat>(1.0, patch_deriv_batch, out_deriv_batch, kNoTrans,
                               filter_params_batch, kNoTrans, 0.0);
 
   // release memory
   delete filter_params_elem;
   for (int32 p = 0; p < num_patches; p++) {
     delete patch_deriv_batch[p];
     delete out_deriv_batch[p];
   }
 
   // sum the derivatives into in_deriv
   std::vector<int32> column_map(filter_dim * num_patches);
   for (int32 patch = 0, index = 0; patch < num_patches; patch++) {
     int32 fstride = patch * patch_step_;
     for (int32 splice = 0; splice < num_splice; splice++) {
       int32 cstride = splice * patch_stride_;
       for (int32 d = 0; d < patch_dim_; d++, index++) {
         if (appended_conv_)
           column_map[index] = (fstride + d) * num_splice + splice;
         else
           column_map[index] = fstride + cstride + d;
       }
     }
   }
   std::vector<std::vector<int32> > reversed_column_map;
   ReverseIndexes(column_map, InputDim(), &reversed_column_map);
   std::vector<std::vector<int32> > rearranged_column_map;
   RearrangeIndexes(reversed_column_map, &rearranged_column_map);
   for (int32 p = 0; p < rearranged_column_map.size(); p++) {
     CuArray<int32> cu_cols(rearranged_column_map[p]);
     in_deriv->AddCols(patches_deriv, cu_cols);
   }
 
   if (to_update != NULL) {
     // Next update the model (must do this 2nd so the derivatives we propagate
     // are accurate, in case this == to_update_in.)
     to_update->Update(in_value, out_deriv);
   }
 }
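The step "sum the derivatives into in_deriv" can be illustrated without Kaldi. The sketch below is an assumption about what ReverseIndexes() and RearrangeIndexes() achieve, inferred from their names and from how their output is passed to AddCols(); the ReverseIndexes() source is not shown on this page. The function names `ReverseMap`, `Rearrange` and `AccumulateRow` are hypothetical, and a single row of floats stands in for the CUDA matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each input column, list the patch columns that were copied from it.
std::vector<std::vector<int32_t>> ReverseMap(
    const std::vector<int32_t> &forward, int32_t input_dim) {
  std::vector<std::vector<int32_t>> backward(input_dim);
  for (std::size_t i = 0; i < forward.size(); i++) {
    assert(forward[i] >= 0 && forward[i] < input_dim);
    backward[forward[i]].push_back(static_cast<int32_t>(i));
  }
  return backward;
}

// Transpose the ragged lists into rectangular "passes", padded with -1.
std::vector<std::vector<int32_t>> Rearrange(
    const std::vector<std::vector<int32_t>> &in) {
  std::size_t longest = 0;
  for (const auto &v : in)
    if (v.size() > longest) longest = v.size();
  std::vector<std::vector<int32_t>> out(
      longest, std::vector<int32_t>(in.size(), -1));
  for (std::size_t i = 0; i < in.size(); i++)
    for (std::size_t j = 0; j < in[i].size(); j++)
      out[j][i] = in[i][j];
  return out;
}

// One row of the accumulation: each pass adds at most one patch column
// to every input column; entries equal to -1 are skipped.
std::vector<float> AccumulateRow(
    const std::vector<float> &patches_deriv,
    const std::vector<std::vector<int32_t>> &passes, int32_t input_dim) {
  std::vector<float> in_deriv(input_dim, 0.0f);
  for (const auto &pass : passes)
    for (int32_t c = 0; c < input_dim; c++)
      if (pass[c] >= 0) in_deriv[c] += patches_deriv[pass[c]];
  return in_deriv;
}
```

Splitting the scatter-add into passes means that no two additions within one pass target the same input column, which is presumably why the real code issues one AddCols() call per pass.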

◆ BackpropNeedsInput()

bool BackpropNeedsInput ( ) const

inlinevirtual

Reimplemented from Component.

Definition at line 1743 of file nnet-component.h.

{ return true; }

◆ BackpropNeedsOutput()

bool BackpropNeedsOutput ( ) const

inlinevirtual

Reimplemented from Component.

Definition at line 1744 of file nnet-component.h.

References kaldi::cu::Copy(), kaldi::nnet3::DotProduct(), kaldi::nnet3::PerturbParams(), and Component::Propagate().

{ return false; }

◆ BiasParams()

const CuVector<BaseFloat>& BiasParams ( )

inline

Definition at line 1767 of file nnet-component.h.

{ return bias_params_; }


◆ Copy()

Component * Copy ( ) const

virtual

Copy component (deep copy).

Implements Component.

Definition at line 4127 of file nnet-component.cc.

References Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, Convolutional1dComponent::Convolutional1dComponent(), Convolutional1dComponent::filter_params_, Convolutional1dComponent::is_gradient_, UpdatableComponent::learning_rate_, Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, and Convolutional1dComponent::patch_stride_.

                                                 {
   Convolutional1dComponent *ans = new Convolutional1dComponent();
   ans->learning_rate_ = learning_rate_;
   ans->patch_dim_ = patch_dim_;
   ans->patch_step_ = patch_step_;
   ans->patch_stride_ = patch_stride_;
   ans->filter_params_ = filter_params_;
   ans->bias_params_ = bias_params_;
   ans->appended_conv_ = appended_conv_;
   ans->is_gradient_ = is_gradient_;
   return ans;
 }

◆ DotProduct()

BaseFloat DotProduct ( const UpdatableComponent & other ) const

virtual

Here, "other" is a component of the same specific type.

This function computes the dot product in parameters, and is computed while automatically adjusting learning rates; typically, one of the two will actually contain the gradient.

Implements UpdatableComponent.

Definition at line 4120 of file nnet-component.cc.

References Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, kaldi::kTrans, kaldi::TraceMatMat(), and kaldi::VecVec().

                                                                                        {
   const Convolutional1dComponent *other =
       dynamic_cast<const Convolutional1dComponent*>(&other_in);
   return TraceMatMat(filter_params_, other->filter_params_, kTrans)
          + VecVec(bias_params_, other->bias_params_);
 }
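TraceMatMat(A, B, kTrans) is the trace of A times the transpose of B, which equals the sum of the element-wise products of A and B. The following sketch spells out the same dot product on plain `std::vector` containers; `DotProductSketch` and the `Matrix` alias are hypothetical names, not Kaldi API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Dot product of two parameter sets, treating filters and biases
// as one long vector: sum_ij A_ij * B_ij + sum_i a_i * b_i.
float DotProductSketch(const Matrix &filter_a, const std::vector<float> &bias_a,
                       const Matrix &filter_b, const std::vector<float> &bias_b) {
  assert(filter_a.size() == filter_b.size());
  assert(bias_a.size() == bias_b.size());
  float sum = 0.0f;
  for (std::size_t r = 0; r < filter_a.size(); r++) {
    assert(filter_a[r].size() == filter_b[r].size());
    for (std::size_t c = 0; c < filter_a[r].size(); c++)
      sum += filter_a[r][c] * filter_b[r][c];
  }
  for (std::size_t i = 0; i < bias_a.size(); i++)
    sum += bias_a[i] * bias_b[i];
  return sum;
}
```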

◆ GetParameterDim()

int32 GetParameterDim ( ) const

virtual

The following new virtual function returns the total dimension of the parameters in this class.

E.g. used for L-BFGS update.

Reimplemented from UpdatableComponent.

Definition at line 4157 of file nnet-component.cc.

References Convolutional1dComponent::filter_params_.

                                                       {
   return (filter_params_.NumCols() + 1) * filter_params_.NumRows();
 }

◆ Info()

std::string Info ( ) const

virtual

Reimplemented from UpdatableComponent.

Definition at line 3732 of file nnet-component.cc.

References Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, Convolutional1dComponent::InputDim(), kaldi::kTrans, UpdatableComponent::LearningRate(), Convolutional1dComponent::OutputDim(), Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, Convolutional1dComponent::patch_stride_, kaldi::TraceMatMat(), Convolutional1dComponent::Type(), and kaldi::VecVec().

                                                {
   std::stringstream stream;
   BaseFloat filter_params_size = static_cast<BaseFloat>(filter_params_.NumRows())
                                  * static_cast<BaseFloat>(filter_params_.NumCols());
   BaseFloat filter_stddev =
             std::sqrt(TraceMatMat(filter_params_, filter_params_, kTrans) /
                       filter_params_size),
             bias_stddev = std::sqrt(VecVec(bias_params_, bias_params_) /
                                     bias_params_.Dim());
 
   int32 num_splice = InputDim() / patch_stride_;
   int32 filter_dim = num_splice * patch_dim_;
   int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
   int32 num_filters = OutputDim() / num_patches;
 
   stream << Type() << ", input-dim=" << InputDim()
          << ", output-dim=" << OutputDim()
          << ", num-splice=" << num_splice
          << ", num-patches=" << num_patches
          << ", num-filters=" << num_filters
          << ", filter-dim=" << filter_dim
          << ", filter-params-stddev=" << filter_stddev
          << ", bias-params-stddev=" << bias_stddev
          << ", appended-conv=" << appended_conv_
          << ", learning-rate=" << LearningRate();
   return stream.str();
 }

◆ Init() [1/2]

void Init	(	BaseFloat	learning_rate,
		int32	input_dim,
		int32	output_dim,
		int32	patch_dim,
		int32	patch_step,
		int32	patch_stride,
		BaseFloat	param_stddev,
		BaseFloat	bias_stddev,
		bool	appended_conv
	)

Definition at line 3669 of file nnet-component.cc.

References Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, UpdatableComponent::Init(), KALDI_ASSERT, Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, and Convolutional1dComponent::patch_stride_.

Referenced by MaxpoolingComponent::InitFromString(), Convolutional1dComponent::InitFromString(), and kaldi::nnet2::UnitTestConvolutional1dComponent().

                                                                                {
   UpdatableComponent::Init(learning_rate);
   patch_dim_ = patch_dim;
   patch_step_ = patch_step;
   patch_stride_ = patch_stride;
   appended_conv_ = appended_conv;
   int32 num_splice = input_dim / patch_stride;
   int32 filter_dim = num_splice * patch_dim;
   int32 num_patches = 1 + (patch_stride - patch_dim) / patch_step;
   int32 num_filters = output_dim / num_patches;
   KALDI_ASSERT(input_dim % patch_stride == 0);
   KALDI_ASSERT((patch_stride - patch_dim) % patch_step == 0);
   KALDI_ASSERT(output_dim % num_patches == 0);
 
   filter_params_.Resize(num_filters, filter_dim);
   bias_params_.Resize(num_filters);
   KALDI_ASSERT(param_stddev >= 0.0 && bias_stddev >= 0.0);
   filter_params_.SetRandn();
   filter_params_.Scale(param_stddev);
   bias_params_.SetRandn();
   bias_params_.Scale(bias_stddev);
 }

◆ Init() [2/2]

void Init	(	BaseFloat	learning_rate,
		int32	patch_dim,
		int32	patch_step,
		int32	patch_stride,
		std::string	matrix_filename,
		bool	appended_conv
	)

Definition at line 3697 of file nnet-component.cc.

References Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, UpdatableComponent::Init(), KALDI_ASSERT, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, Convolutional1dComponent::patch_stride_, CuMatrixBase< Real >::Range(), and kaldi::ReadKaldiObject().

                                                         {
   UpdatableComponent::Init(learning_rate);
   patch_dim_ = patch_dim;
   patch_step_ = patch_step;
   patch_stride_ = patch_stride;
   appended_conv_ = appended_conv;
   CuMatrix<BaseFloat> mat;
   ReadKaldiObject(matrix_filename, &mat);
   KALDI_ASSERT(mat.NumCols() >= 2);
   int32 filter_dim = mat.NumCols() - 1, num_filters = mat.NumRows();
   filter_params_.Resize(num_filters, filter_dim);
   bias_params_.Resize(num_filters);
   filter_params_.CopyFromMat(mat.Range(0, num_filters, 0, filter_dim));
   bias_params_.CopyColFromMat(mat, filter_dim);
 }
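As the code above shows, the matrix read from matrix_filename holds one filter per row, with the bias stored in the last column. A minimal sketch of that split on plain containers follows; `SplitParams`, `FilterAndBias` and the `Matrix` alias are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

struct FilterAndBias {
  Matrix filter;            // num_filters x filter_dim
  std::vector<float> bias;  // num_filters
};

// Split a (num_filters x (filter_dim + 1)) matrix into filters and biases.
FilterAndBias SplitParams(const Matrix &mat) {
  assert(!mat.empty() && mat[0].size() >= 2);  // mirrors NumCols() >= 2
  const std::size_t filter_dim = mat[0].size() - 1;
  FilterAndBias out;
  for (const auto &row : mat) {
    assert(row.size() == filter_dim + 1);
    out.filter.emplace_back(row.begin(), row.begin() + filter_dim);
    out.bias.push_back(row[filter_dim]);  // last column is the bias
  }
  return out;
}
```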

◆ InitFromString()

void InitFromString ( std::string args )

virtual

Initialize, typically from a line of a config file.

The "args" will contain any parameters that need to be passed to the Component, e.g. dimensions.

Implements Component.

Definition at line 3761 of file nnet-component.cc.

References Convolutional1dComponent::Init(), Convolutional1dComponent::InputDim(), KALDI_ASSERT, KALDI_ERR, UpdatableComponent::learning_rate_, Convolutional1dComponent::OutputDim(), and kaldi::nnet2::ParseFromString().

Referenced by kaldi::nnet2::UnitTestConvolutional1dComponent().

                                                             {
   std::string orig_args(args);
   bool ok = true, appended_conv = false;
   BaseFloat learning_rate = learning_rate_;
   std::string matrix_filename;
   int32 input_dim = -1, output_dim = -1;
   int32 patch_dim = -1, patch_step = -1, patch_stride = -1;
   ParseFromString("learning-rate", &args, &learning_rate);
   ParseFromString("appended-conv", &args, &appended_conv);
   ok = ok && ParseFromString("patch-dim", &args, &patch_dim);
   ok = ok && ParseFromString("patch-step", &args, &patch_step);
   ok = ok && ParseFromString("patch-stride", &args, &patch_stride);
   if (ParseFromString("matrix", &args, &matrix_filename)) {
     // initialize from predefined parameter matrix
     Init(learning_rate, patch_dim, patch_step, patch_stride,
          matrix_filename, appended_conv);
     if (ParseFromString("input-dim", &args, &input_dim))
       KALDI_ASSERT(input_dim == InputDim() &&
                "input-dim mismatch vs. matrix.");
     if (ParseFromString("output-dim", &args, &output_dim))
             KALDI_ASSERT(output_dim == OutputDim() &&
                      "output-dim mismatch vs. matrix.");
   } else {
     // initialize from configuration
     ok = ok && ParseFromString("input-dim", &args, &input_dim);
     ok = ok && ParseFromString("output-dim", &args, &output_dim);
     BaseFloat param_stddev = 1.0 / std::sqrt(input_dim), bias_stddev = 1.0;
     ParseFromString("param-stddev", &args, &param_stddev);
     ParseFromString("bias-stddev", &args, &bias_stddev);
     Init(learning_rate, input_dim, output_dim, patch_dim,
          patch_step, patch_stride, param_stddev, bias_stddev, appended_conv);
   }
   if (!args.empty())
     KALDI_ERR << "Could not process these elements in initializer: " << args;
   if (!ok)
     KALDI_ERR << "Bad initializer " << orig_args;
 }
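The final check on `args` works because each successful parse consumes its `name=value` token, so anything left over is an unrecognized option. The sketch below is a simplified, integer-only stand-in for that behavior; `ParseIntFromString` is a hypothetical function and not the Kaldi ParseFromString() implementation.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in: look for "name=value" in *args, store the value,
// and remove the matching token so that leftovers can be detected later.
bool ParseIntFromString(const std::string &name, std::string *args,
                        int *value) {
  std::istringstream iss(*args);
  const std::string prefix = name + "=";
  std::string token, remaining;
  bool found = false;
  while (iss >> token) {
    if (!found && token.compare(0, prefix.size(), prefix) == 0) {
      *value = std::stoi(token.substr(prefix.size()));
      found = true;
    } else {
      if (!remaining.empty()) remaining += " ";
      remaining += token;
    }
  }
  *args = remaining;  // unparsed tokens only
  return found;
}
```

With an initializer such as `input-dim=1188 output-dim=3840 patch-dim=7 patch-step=1 patch-stride=36`, parsing every known key leaves `args` empty; a misspelled key would remain in `args` and trigger the "Could not process these elements" error.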

◆ InputDim()

int32 InputDim ( ) const

virtual

Get size of input vectors.

Implements Component.

Definition at line 3655 of file nnet-component.cc.

References Convolutional1dComponent::filter_params_, Convolutional1dComponent::patch_dim_, and Convolutional1dComponent::patch_stride_.

Referenced by Convolutional1dComponent::Backprop(), Convolutional1dComponent::Info(), Convolutional1dComponent::InitFromString(), Convolutional1dComponent::Propagate(), and Convolutional1dComponent::Update().

                                                {
   int32 filter_dim = filter_params_.NumCols();
   int32 num_splice = filter_dim / patch_dim_;
   return patch_stride_ * num_splice;
 }

◆ LinearParams()

const CuMatrix<BaseFloat>& LinearParams ( )

inline

Definition at line 1768 of file nnet-component.h.

{ return filter_params_; }


◆ operator=()

const Convolutional1dComponent& operator= ( const Convolutional1dComponent & other )

private

◆ OutputDim()

int32 OutputDim ( ) const

virtual

Get size of output vectors.

Implements Component.

Definition at line 3662 of file nnet-component.cc.

References Convolutional1dComponent::filter_params_, Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, and Convolutional1dComponent::patch_stride_.

Referenced by Convolutional1dComponent::Info(), and Convolutional1dComponent::InitFromString().

                                                 {
   int32 num_filters = filter_params_.NumRows();
   int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
   return num_patches * num_filters;
 }
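InputDim() and OutputDim() are not stored; both are recomputed from the shape of the filter matrix and the three patch hyperparameters. A standalone sketch of the two formulas, with the hypothetical names `IoDims` and `DimsFromFilterShape`:

```cpp
#include <cassert>
#include <cstdint>

struct IoDims {
  int32_t input_dim;
  int32_t output_dim;
};

// Recover the input and output dimensions from the filter matrix shape
// (num_filters rows, filter_dim columns), as InputDim()/OutputDim() do.
IoDims DimsFromFilterShape(int32_t num_filters, int32_t filter_dim,
                           int32_t patch_dim, int32_t patch_step,
                           int32_t patch_stride) {
  assert(patch_dim > 0 && patch_step > 0);
  assert(filter_dim % patch_dim == 0);
  const int32_t num_splice = filter_dim / patch_dim;
  const int32_t num_patches = 1 + (patch_stride - patch_dim) / patch_step;
  return IoDims{patch_stride * num_splice, num_patches * num_filters};
}
```

For a 128 x 231 filter matrix with patch_dim = 7, patch_step = 1 and patch_stride = 36, this gives an input dimension of 1188 and an output dimension of 3840.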

◆ PerturbParams()

void PerturbParams ( BaseFloat stddev )

virtual

We introduce a new virtual function that only applies to class UpdatableComponent.

This is used in testing.

Implements UpdatableComponent.

Definition at line 4140 of file nnet-component.cc.

References Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, CuVectorBase< Real >::SetRandn(), and CuMatrixBase< Real >::SetRandn().

                                                              {
   CuMatrix<BaseFloat> temp_filter_params(filter_params_);
   temp_filter_params.SetRandn();
   filter_params_.AddMat(stddev, temp_filter_params);
 
   CuVector<BaseFloat> temp_bias_params(bias_params_);
   temp_bias_params.SetRandn();
   bias_params_.AddVec(stddev, temp_bias_params);
 }
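The perturbation adds independent zero-mean Gaussian noise, scaled by stddev, to every parameter. A minimal sketch using the standard library in place of SetRandn(); the function name `PerturbSketch` and the explicit random engine argument are assumptions made for this illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Add stddev * N(0, 1) noise to each parameter, in place.
void PerturbSketch(std::vector<float> *params, float stddev,
                   std::mt19937 *rng) {
  assert(params != nullptr && rng != nullptr);
  std::normal_distribution<float> gauss(0.0f, 1.0f);
  for (float &p : *params)
    p += stddev * gauss(*rng);
}
```

With stddev set to 0 the parameters are left unchanged, which is a convenient sanity check when this kind of routine is used in tests.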

◆ Propagate()

void Propagate	(	const ChunkInfo &	in_info,
		const ChunkInfo &	out_info,
		const CuMatrixBase< BaseFloat > &	in,
		CuMatrixBase< BaseFloat > *	out
	)		const

virtual

Perform forward pass propagation Input->Output.

Each row is one frame or training example. Interpreted as "num_chunks" equally sized chunks of frames; this only matters for layers that do things like context splicing. Typically this variable will either be 1 (when we're processing a single contiguous chunk of data) or will be the same as in.NumFrames(), but other values are possible if some layers do splicing.

Buffer of reshaped inputs: 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames.

Implements Component.

Definition at line 3819 of file nnet-component.cc.

References CuMatrixBase< Real >::AddVecToRows(), Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, ChunkInfo::CheckSize(), CuMatrixBase< Real >::ColRange(), CuMatrixBase< Real >::CopyCols(), rnnlm::d, Convolutional1dComponent::filter_params_, Convolutional1dComponent::InputDim(), KALDI_ASSERT, kaldi::kNoTrans, kaldi::kTrans, kaldi::kUndefined, ChunkInfo::NumChunks(), CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, and Convolutional1dComponent::patch_stride_.

                                                                              {
   in_info.CheckSize(in);
   out_info.CheckSize(*out);
   KALDI_ASSERT(in_info.NumChunks() == out_info.NumChunks());
 
   // dims
   int32 num_splice = InputDim() / patch_stride_;
   int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
   int32 num_filters = filter_params_.NumRows();
   int32 num_frames = in.NumRows();
   int32 filter_dim = filter_params_.NumCols();
 
   CuMatrix<BaseFloat> patches(num_frames, filter_dim * num_patches, kUndefined);
   // column_map is indexed by the column-index of "patches",
   // and the value is the corresponding column-index of "in".
   std::vector<int32> column_map(filter_dim * num_patches);
 
   // build-up a column selection map
   for (int32 patch = 0, index = 0; patch < num_patches; patch++) {
     int32 fstride = patch * patch_step_;
     for (int32 splice = 0; splice < num_splice; splice++) {
       int32 cstride = splice * patch_stride_;
       for (int32 d = 0; d < patch_dim_; d++, index++) {
         if (appended_conv_)
           column_map[index] = (fstride + d) * num_splice + splice;
         else
           column_map[index] = fstride + cstride + d;
       }
     }
   }
   CuArray<int32> cu_cols(column_map);
   patches.CopyCols(in, cu_cols);
 
   //
   // compute filter activations
   //
 
   std::vector<CuSubMatrix<BaseFloat>* > tgt_batch, patch_batch, filter_params_batch;
 
   CuSubMatrix<BaseFloat>* filter_params_elem = new CuSubMatrix<BaseFloat>(
       filter_params_, 0, filter_params_.NumRows(), 0, filter_params_.NumCols());
 
   // form batch in vector container
   for (int32 p = 0; p < num_patches; p++) {
     // form batch in vector container. for filter_params_batch, all elements
     // point to the same copy filter_params_elem
     tgt_batch.push_back(new CuSubMatrix<BaseFloat>(out->ColRange(p * num_filters,
                                                                  num_filters)));
     patch_batch.push_back(new CuSubMatrix<BaseFloat>(
         patches.ColRange(p * filter_dim, filter_dim)));
     filter_params_batch.push_back(filter_params_elem);
 
     tgt_batch[p]->AddVecToRows(1.0, bias_params_, 0.0); // add bias
   }
 
   // apply all filters
   AddMatMatBatched<BaseFloat>(1.0, tgt_batch, patch_batch, kNoTrans,
                               filter_params_batch, kTrans, 1.0);
 
   // release memory
   delete filter_params_elem;
   for (int32 p = 0; p < num_patches; p++) {
     delete tgt_batch[p];
     delete patch_batch[p];
   }
 }
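The column-selection loop is easier to follow on a small case. The sketch below repeats the same index computation on a plain `std::vector`; `BuildColumnMap` is a hypothetical standalone function, not part of Kaldi.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// For every column of "patches", compute the column of the input it is
// copied from. Same loop structure as in Propagate() and Backprop().
std::vector<int32_t> BuildColumnMap(int32_t input_dim, int32_t patch_dim,
                                    int32_t patch_step, int32_t patch_stride,
                                    bool appended_conv) {
  assert(input_dim % patch_stride == 0);
  const int32_t num_splice = input_dim / patch_stride;
  const int32_t num_patches = 1 + (patch_stride - patch_dim) / patch_step;
  std::vector<int32_t> column_map;
  column_map.reserve(num_patches * num_splice * patch_dim);
  for (int32_t patch = 0; patch < num_patches; patch++) {
    const int32_t fstride = patch * patch_step;  // shift along frequency
    for (int32_t splice = 0; splice < num_splice; splice++) {
      const int32_t cstride = splice * patch_stride;  // start of the frame
      for (int32_t d = 0; d < patch_dim; d++) {
        if (appended_conv)
          column_map.push_back((fstride + d) * num_splice + splice);
        else
          column_map.push_back(fstride + cstride + d);
      }
    }
  }
  return column_map;
}
```

For input_dim = 8, patch_dim = 2, patch_step = 1 and patch_stride = 4 (two spliced frames of four bands each), the non-appended map is 0 1 4 5, 1 2 5 6, 2 3 6 7: each group of four columns is one patch position, taking two adjacent bands from each of the two frames. With appended_conv set, the input is assumed to be laid out band-major, and the map becomes 0 2 1 3, 2 4 3 5, 4 6 5 7.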

◆ Read()

void Read	(	std::istream &	is,
		bool	binary
	)

virtual

Implements Component.

Definition at line 4059 of file nnet-component.cc.

References Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, kaldi::nnet2::ExpectOneOrTwoTokens(), kaldi::ExpectToken(), Convolutional1dComponent::filter_params_, Convolutional1dComponent::is_gradient_, KALDI_ASSERT, UpdatableComponent::learning_rate_, Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, Convolutional1dComponent::patch_stride_, kaldi::ReadBasicType(), kaldi::ReadToken(), and Convolutional1dComponent::Type().

                                                                {
   std::ostringstream ostr_beg, ostr_end;
   ostr_beg << "<" << Type() << ">"; // e.g. "<Convolutional1dComponent>"
   ostr_end << "</" << Type() << ">"; // e.g. "</Convolutional1dComponent>"
   // might not see the "<Convolutional1dComponent>" part because
   // of how ReadNew() works.
   ExpectOneOrTwoTokens(is, binary, ostr_beg.str(), "<LearningRate>");
   ReadBasicType(is, binary, &learning_rate_);
   ExpectToken(is, binary, "<PatchDim>");
   ReadBasicType(is, binary, &patch_dim_);
   ExpectToken(is, binary, "<PatchStep>");
   ReadBasicType(is, binary, &patch_step_);
   ExpectToken(is, binary, "<PatchStride>");
   ReadBasicType(is, binary, &patch_stride_);
   // back-compatibility
   std::string tok;
   ReadToken(is, binary, &tok);
   if (tok == "<AppendedConv>") {
     ReadBasicType(is, binary, &appended_conv_);
     ExpectToken(is, binary, "<FilterParams>");
   } else {
     appended_conv_ = false;
     KALDI_ASSERT(tok == "<FilterParams>");
   }
   filter_params_.Read(is, binary);
   ExpectToken(is, binary, "<BiasParams>");
   bias_params_.Read(is, binary);
   ReadToken(is, binary, &tok);
   if (tok == "<IsGradient>") {
     ReadBasicType(is, binary, &is_gradient_);
     ExpectToken(is, binary, ostr_end.str());
   } else {
     is_gradient_ = false;
     KALDI_ASSERT(tok == ostr_end.str());
   }
 }
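
The back-compatibility branch above reads one token and then decides whether an optional field is present. A minimal sketch of that pattern over a whitespace-delimited text stream follows. It does not use Kaldi's I/O functions: `ReadOptionalAppendedConv` is a hypothetical helper, and encoding the boolean as `T` or `F` is an assumption made for this example.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Kaldi function). Consumes an optional
// "<AppendedConv> <value>" pair, then requires "<FilterParams>".
// Returns the flag, defaulting to false when the pair is absent.
bool ReadOptionalAppendedConv(std::istream &is) {
  std::string tok;
  if (!(is >> tok)) throw std::runtime_error("unexpected end of input");
  bool appended_conv = false;  // default for files without the field
  if (tok == "<AppendedConv>") {
    std::string value;
    if (!(is >> value)) throw std::runtime_error("missing flag value");
    appended_conv = (value == "T");  // assumed text encoding of a bool
    if (!(is >> tok)) throw std::runtime_error("unexpected end of input");
  }
  if (tok != "<FilterParams>")
    throw std::runtime_error("expected <FilterParams>, got " + tok);
  return appended_conv;
}
```

The same read-then-branch shape handles the optional `<IsGradient>` field before the closing tag.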

◆ RearrangeIndexes()

void RearrangeIndexes	(	const std::vector< std::vector< int32 > > &	in,
		std::vector< std::vector< int32 > > *	out
	)

static private

Definition at line 3948 of file nnet-component.cc.

References rnnlm::i, and rnnlm::j.

Referenced by Convolutional1dComponent::Backprop().

                                                                                    {
   int32 D = in.size();
   int32 L = 0;
   for (int32 i = 0; i < D; i++)
     if (in[i].size() > L)
       L = in[i].size();
   out->resize(L);
   for (int32 i = 0; i < L; i++)
     (*out)[i].resize(D, -1);
   for (int32 i = 0; i < D; i++) {
     for (int32 j = 0; j < in[i].size(); j++) {
       (*out)[j][i] = in[i][j];
     }
   }
 }
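
In effect the function transposes a ragged list of index lists and pads the gaps with -1. The standalone sketch below reproduces that logic outside the class so it can be tried on a small input. `RearrangeIndexesSketch` is a hypothetical free function that returns the result by value instead of writing through an output pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using int32 = std::int32_t;  // stand-in for Kaldi's int32 typedef

// Transposes a ragged D x (variable) table into an L x D table, where L is
// the longest input row. Missing entries are filled with -1.
std::vector<std::vector<int32>> RearrangeIndexesSketch(
    const std::vector<std::vector<int32>> &in) {
  const std::size_t D = in.size();
  std::size_t L = 0;
  for (const auto &row : in)
    if (row.size() > L) L = row.size();
  std::vector<std::vector<int32>> out(L, std::vector<int32>(D, -1));
  for (std::size_t i = 0; i < D; i++)
    for (std::size_t j = 0; j < in[i].size(); j++)
      out[j][i] = in[i][j];
  return out;
}
```

For the input `{{0, 3}, {1}, {2, 4, 5}}` this yields the rows `{0, 1, 2}`, `{3, -1, 4}` and `{-1, -1, 5}`: each output row has one slot per input list.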

◆ Resize()

void Resize	(	int32	input_dim,
		int32	output_dim
	)

Definition at line 3718 of file nnet-component.cc.

References Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, KALDI_ASSERT, Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, and Convolutional1dComponent::patch_stride_.

                                                                        {
   KALDI_ASSERT(input_dim > 0 && output_dim > 0);
   int32 num_splice = input_dim / patch_stride_;
   int32 filter_dim = num_splice * patch_dim_;
   int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
   int32 num_filters = output_dim / num_patches;
   KALDI_ASSERT(input_dim % patch_stride_ == 0);
   KALDI_ASSERT((patch_stride_ - patch_dim_) % patch_step_ == 0);
   KALDI_ASSERT(output_dim % num_patches == 0);
   filter_params_.Resize(num_filters, filter_dim);
   bias_params_.Resize(num_filters);
 }
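
The dimension arithmetic in Resize() can be checked in isolation. The sketch below repeats the same formulas and assertions with plain integers. `ConvDims` and `ComputeConvDims` are hypothetical names introduced for this example, and the comments describing each quantity are an interpretation of the code rather than wording from the Kaldi source.

```cpp
#include <cassert>
#include <cstdint>

using int32 = std::int32_t;  // stand-in for Kaldi's int32 typedef

// Hypothetical plain struct holding the derived sizes (not part of Kaldi).
struct ConvDims {
  int32 num_splice;   // blocks of patch_stride values in one input vector
  int32 filter_dim;   // columns of the filter matrix
  int32 num_patches;  // filter positions within one block
  int32 num_filters;  // rows of the filter matrix, length of the bias
};

ConvDims ComputeConvDims(int32 input_dim, int32 output_dim, int32 patch_dim,
                         int32 patch_step, int32 patch_stride) {
  assert(input_dim > 0 && output_dim > 0);
  assert(input_dim % patch_stride == 0);
  assert((patch_stride - patch_dim) % patch_step == 0);
  ConvDims d;
  d.num_splice = input_dim / patch_stride;
  d.filter_dim = d.num_splice * patch_dim;
  d.num_patches = 1 + (patch_stride - patch_dim) / patch_step;
  assert(output_dim % d.num_patches == 0);
  d.num_filters = output_dim / d.num_patches;
  return d;
}
```

With the illustrative values `input_dim = 440`, `output_dim = 4224`, `patch_dim = 8`, `patch_step = 1` and `patch_stride = 40`, the result is `num_splice = 11`, `filter_dim = 88`, `num_patches = 33` and `num_filters = 128`, so the filter matrix would be resized to 128 x 88.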

◆ ReverseIndexes()

void ReverseIndexes	(	const std::vector< int32 > &	forward_indexes,
		int32	input_dim,
		std::vector< std::vector< int32 > > *	backward_indexes
	)

static private

Definition at line 3920 of file nnet-component.cc.

References rnnlm::i, rnnlm::j, and KALDI_ASSERT.

Referenced by Convolutional1dComponent::Backprop().

                                                                                               {
   int32 i, size = forward_indexes.size();
   int32 reserve_size = 2 + size / input_dim;
   backward_indexes->resize(input_dim);
   std::vector<std::vector<int32> >::iterator iter = backward_indexes->begin(),
     end = backward_indexes->end();
   for (; iter != end; ++iter)
     iter->reserve(reserve_size);
   for (int32 j = 0; j < forward_indexes.size(); j++) {
     i = forward_indexes[j];
     KALDI_ASSERT(i < input_dim);
     (*backward_indexes)[i].push_back(j);
   }
 }
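
The function inverts a many-to-one column map: for every input column it collects the positions in `forward_indexes` that refer to it. A standalone sketch of the same idea follows. `ReverseIndexesSketch` is a hypothetical free function that returns the result by value, omits the `reserve` calls, and adds a lower-bound check that the original assertion does not include.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using int32 = std::int32_t;  // stand-in for Kaldi's int32 typedef

// backward[i] lists every position j with forward_indexes[j] == i.
std::vector<std::vector<int32>> ReverseIndexesSketch(
    const std::vector<int32> &forward_indexes, int32 input_dim) {
  std::vector<std::vector<int32>> backward(input_dim);
  for (std::size_t j = 0; j < forward_indexes.size(); j++) {
    const int32 i = forward_indexes[j];
    assert(i >= 0 && i < input_dim);  // the i >= 0 check is added here
    backward[i].push_back(static_cast<int32>(j));
  }
  return backward;
}
```

For `forward_indexes = {0, 1, 1, 2}` and `input_dim = 4`, the result is `{{0}, {1, 2}, {3}, {}}`. Input columns that are never copied end up with an empty list.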

◆ Scale()

void Scale ( BaseFloat scale )

virtual

This new virtual function scales the parameters by this amount.

Implements UpdatableComponent.

Definition at line 3894 of file nnet-component.cc.

References Convolutional1dComponent::bias_params_, and Convolutional1dComponent::filter_params_.

                                                     {
   filter_params_.Scale(scale);
   bias_params_.Scale(scale);
 }

◆ SetParams()

void SetParams	(	const VectorBase< BaseFloat > &	bias,
		const MatrixBase< BaseFloat > &	filter
	)

Definition at line 4150 of file nnet-component.cc.

References Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, and KALDI_ASSERT.

                                                                               {
   bias_params_ = bias;
   filter_params_ = filter;
   KALDI_ASSERT(bias_params_.Dim() == filter_params_.NumRows());
 }

◆ SetZero()

void SetZero ( bool treat_as_gradient )

virtual

Sets the parameters to zero. If treat_as_gradient is true, the component is treated as a gradient: the learning rate is set to 1 and is_gradient_ is set to true.

Implements UpdatableComponent.

Definition at line 4048 of file nnet-component.cc.

References Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, Convolutional1dComponent::is_gradient_, and UpdatableComponent::SetLearningRate().

                                                              {
   if (treat_as_gradient) {
     SetLearningRate(1.0);
   }
   filter_params_.SetZero();
   bias_params_.SetZero();
   if (treat_as_gradient) {
     is_gradient_ = true;
   }
 }

◆ Type()

std::string Type ( ) const

inline virtual

Implements Component.

Definition at line 1742 of file nnet-component.h.

Referenced by MaxpoolingComponent::Info(), Convolutional1dComponent::Info(), MaxpoolingComponent::InitFromString(), Convolutional1dComponent::Read(), and Convolutional1dComponent::Write().

{ return "Convolutional1dComponent"; }

◆ Update()

void Update	(	const CuMatrixBase< BaseFloat > &	in_value,
		const CuMatrixBase< BaseFloat > &	out_deriv
	)

Buffer of reshaped inputs: 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames.

Definition at line 4162 of file nnet-component.cc.

References CuMatrixBase< Real >::AddMatBlocks(), CuVectorBase< Real >::AddRowSumMat(), Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, CuMatrixBase< Real >::ColRange(), CuMatrixBase< Real >::CopyCols(), rnnlm::d, Convolutional1dComponent::filter_params_, Convolutional1dComponent::InputDim(), kaldi::kNoTrans, kaldi::kSetZero, kaldi::kTrans, kaldi::kUndefined, UpdatableComponent::learning_rate_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, Convolutional1dComponent::patch_stride_, CuVector< Real >::Resize(), and CuMatrix< Real >::Resize().

Referenced by Convolutional1dComponent::Backprop().

                                                                                 {
   // useful dims
   int32 num_patches = 1 + (patch_stride_ - patch_dim_) / patch_step_;
   int32 num_filters = filter_params_.NumRows();
   int32 filter_dim = filter_params_.NumCols();
   int32 num_frames = in_value.NumRows();
   int32 num_splice = InputDim() / patch_stride_;
   CuMatrix<BaseFloat> filters_grad;
   CuVector<BaseFloat> bias_grad;
 
   CuMatrix<BaseFloat> patches(num_frames, filter_dim * num_patches, kUndefined);
   std::vector<int32> column_map(filter_dim * num_patches);
   for (int32 patch = 0, index = 0; patch < num_patches; patch++) {
     int32 fstride = patch * patch_step_;
     for (int32 splice = 0; splice < num_splice; splice++) {
       int32 cstride = splice * patch_stride_;
       for (int32 d = 0; d < patch_dim_; d++, index++) {
         if (appended_conv_)
           column_map[index] = (fstride + d) * num_splice + splice;
         else
           column_map[index] = fstride + cstride + d;
       }
     }
   }
   CuArray<int32> cu_cols(column_map);
   patches.CopyCols(in_value, cu_cols);
 
   //
   // calculate the gradient
   //
   filters_grad.Resize(num_filters, filter_dim, kSetZero); // reset
   bias_grad.Resize(num_filters, kSetZero); // reset
 
   //
   // use all the patches
   //
 
   // create a single large matrix holding the smaller matrices
   // from the vector container filters_grad_batch along the rows
   CuMatrix<BaseFloat> filters_grad_blocks_batch(
       num_patches * filters_grad.NumRows(), filters_grad.NumCols());
 
   std::vector<CuSubMatrix<BaseFloat>* > filters_grad_batch, diff_patch_batch,
       patch_batch;
   for (int32 p = 0; p < num_patches; p++) {
     // form batch in vector container
     filters_grad_batch.push_back(new CuSubMatrix<BaseFloat>(
         filters_grad_blocks_batch.RowRange(
             p * filters_grad.NumRows(),
             filters_grad.NumRows())));
     diff_patch_batch.push_back(new CuSubMatrix<BaseFloat>(out_deriv.ColRange(
         p * num_filters, num_filters)));
     patch_batch.push_back(new CuSubMatrix<BaseFloat>(patches.ColRange(
         p * filter_dim, filter_dim)));
   }
 
   AddMatMatBatched<BaseFloat>(1.0, filters_grad_batch, diff_patch_batch,
                               kTrans, patch_batch, kNoTrans, 1.0);
 
   // add the row blocks together to filters_grad
   filters_grad.AddMatBlocks(1.0, filters_grad_blocks_batch);
 
   // create a matrix holding the col blocks sum of out_deriv
   CuMatrix<BaseFloat> out_deriv_col_blocks_sum(out_deriv.NumRows(), num_filters);
 
   // add the col blocks together to out_deriv_col_blocks_sum
   out_deriv_col_blocks_sum.AddMatBlocks(1.0, out_deriv);
 
   bias_grad.AddRowSumMat(1.0, out_deriv_col_blocks_sum, 1.0);
 
   // release memory
   for (int32 p = 0; p < num_patches; p++) {
     delete filters_grad_batch[p];
     delete diff_patch_batch[p];
     delete patch_batch[p];
   }
 
   //
   // update
   //
   filter_params_.AddMat(learning_rate_, filters_grad);
   bias_params_.AddVec(learning_rate_, bias_grad);
 }
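
The `column_map` loop decides which input column feeds each column of `patches`. The sketch below isolates that loop so the two layouts selected by `appended_conv_` can be compared on a small case. `BuildColumnMapSketch` is a hypothetical free function, and it assumes `filter_dim == num_splice * patch_dim`, which is the size Resize() gives the filter matrix.

```cpp
#include <cstdint>
#include <vector>

using int32 = std::int32_t;  // stand-in for Kaldi's int32 typedef

// Returns, for each column of the patches buffer, the source column in the
// input matrix. Assumes filter_dim == num_splice * patch_dim.
std::vector<int32> BuildColumnMapSketch(int32 input_dim, int32 patch_dim,
                                        int32 patch_step, int32 patch_stride,
                                        bool appended_conv) {
  const int32 num_splice = input_dim / patch_stride;
  const int32 num_patches = 1 + (patch_stride - patch_dim) / patch_step;
  const int32 filter_dim = num_splice * patch_dim;
  std::vector<int32> column_map(filter_dim * num_patches);
  int32 index = 0;
  for (int32 patch = 0; patch < num_patches; patch++) {
    const int32 fstride = patch * patch_step;  // offset of this patch
    for (int32 splice = 0; splice < num_splice; splice++) {
      const int32 cstride = splice * patch_stride;  // offset of this block
      for (int32 d = 0; d < patch_dim; d++, index++) {
        if (appended_conv)
          column_map[index] = (fstride + d) * num_splice + splice;
        else
          column_map[index] = fstride + cstride + d;
      }
    }
  }
  return column_map;
}
```

With `input_dim = 8`, `patch_dim = 2`, `patch_step = 1` and `patch_stride = 4`, the default layout gives `{0, 1, 4, 5, 1, 2, 5, 6, 2, 3, 6, 7}` and the appended layout gives `{0, 2, 1, 3, 2, 4, 3, 5, 4, 6, 5, 7}`. Neighbouring patches overlap in both cases because `patch_step` is smaller than `patch_dim`.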

◆ Write()

void Write	(	std::ostream &	os,
		bool	binary
	)		const

virtual

Write component to stream.

Implements Component.

Definition at line 4096 of file nnet-component.cc.

References Convolutional1dComponent::appended_conv_, Convolutional1dComponent::bias_params_, Convolutional1dComponent::filter_params_, Convolutional1dComponent::is_gradient_, UpdatableComponent::learning_rate_, Convolutional1dComponent::patch_dim_, Convolutional1dComponent::patch_step_, Convolutional1dComponent::patch_stride_, Convolutional1dComponent::Type(), kaldi::WriteBasicType(), and kaldi::WriteToken().

                                                                       {
   std::ostringstream ostr_beg, ostr_end;
   ostr_beg << "<" << Type() << ">"; // e.g. "<Convolutional1dComponent>"
   ostr_end << "</" << Type() << ">"; // e.g. "</Convolutional1dComponent>"
   WriteToken(os, binary, ostr_beg.str());
   WriteToken(os, binary, "<LearningRate>");
   WriteBasicType(os, binary, learning_rate_);
   WriteToken(os, binary, "<PatchDim>");
   WriteBasicType(os, binary, patch_dim_);
   WriteToken(os, binary, "<PatchStep>");
   WriteBasicType(os, binary, patch_step_);
   WriteToken(os, binary, "<PatchStride>");
   WriteBasicType(os, binary, patch_stride_);
   WriteToken(os, binary, "<AppendedConv>");
   WriteBasicType(os, binary, appended_conv_);
   WriteToken(os, binary, "<FilterParams>");
   filter_params_.Write(os, binary);
   WriteToken(os, binary, "<BiasParams>");
   bias_params_.Write(os, binary);
   WriteToken(os, binary, "<IsGradient>");
   WriteBasicType(os, binary, is_gradient_);
   WriteToken(os, binary, ostr_end.str());
 }

Member Data Documentation

◆ appended_conv_

bool appended_conv_

private

Definition at line 1789 of file nnet-component.h.

Referenced by Convolutional1dComponent::Backprop(), Convolutional1dComponent::Convolutional1dComponent(), Convolutional1dComponent::Copy(), Convolutional1dComponent::Info(), Convolutional1dComponent::Init(), Convolutional1dComponent::Propagate(), Convolutional1dComponent::Read(), Convolutional1dComponent::Update(), and Convolutional1dComponent::Write().

◆ bias_params_

CuVector<BaseFloat> bias_params_

private

Definition at line 1786 of file nnet-component.h.

◆ filter_params_

CuMatrix<BaseFloat> filter_params_

private

◆ is_gradient_

bool is_gradient_

private

Definition at line 1790 of file nnet-component.h.

Referenced by Convolutional1dComponent::Convolutional1dComponent(), Convolutional1dComponent::Copy(), Convolutional1dComponent::Read(), Convolutional1dComponent::SetZero(), and Convolutional1dComponent::Write().

◆ patch_dim_

int32 patch_dim_

private

Definition at line 1774 of file nnet-component.h.

Referenced by Convolutional1dComponent::Backprop(), Convolutional1dComponent::Copy(), Convolutional1dComponent::Info(), Convolutional1dComponent::Init(), Convolutional1dComponent::InputDim(), Convolutional1dComponent::OutputDim(), Convolutional1dComponent::Propagate(), Convolutional1dComponent::Read(), Convolutional1dComponent::Resize(), Convolutional1dComponent::Update(), and Convolutional1dComponent::Write().

◆ patch_step_

int32 patch_step_

private

Definition at line 1775 of file nnet-component.h.

Referenced by Convolutional1dComponent::Backprop(), Convolutional1dComponent::Copy(), Convolutional1dComponent::Info(), Convolutional1dComponent::Init(), Convolutional1dComponent::OutputDim(), Convolutional1dComponent::Propagate(), Convolutional1dComponent::Read(), Convolutional1dComponent::Resize(), Convolutional1dComponent::Update(), and Convolutional1dComponent::Write().

◆ patch_stride_

int32 patch_stride_

private

Definition at line 1776 of file nnet-component.h.

Referenced by Convolutional1dComponent::Backprop(), Convolutional1dComponent::Copy(), Convolutional1dComponent::Info(), Convolutional1dComponent::Init(), Convolutional1dComponent::InputDim(), Convolutional1dComponent::OutputDim(), Convolutional1dComponent::Propagate(), Convolutional1dComponent::Read(), Convolutional1dComponent::Resize(), Convolutional1dComponent::Update(), and Convolutional1dComponent::Write().

The documentation for this class was generated from the following files:

nnet2/nnet-component.h
nnet2/nnet-component.cc
