ConvolutionComponent Class Reference

WARNING, this component is deprecated in favor of TimeHeightConvolutionComponent, and will be deleted. More...

#include <nnet-combined-component.h>

Public Types

enum  TensorVectorizationType { kYzx = 0, kZyx = 1 }
 

Public Member Functions

 ConvolutionComponent ()
 
 ConvolutionComponent (const ConvolutionComponent &component)
 
 ConvolutionComponent (const CuMatrixBase< BaseFloat > &filter_params, const CuVectorBase< BaseFloat > &bias_params, int32 input_x_dim, int32 input_y_dim, int32 input_z_dim, int32 filt_x_dim, int32 filt_y_dim, int32 filt_x_step, int32 filt_y_step, TensorVectorizationType input_vectorization, BaseFloat learning_rate)
 
virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update_in, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
void Update (const std::string &debug_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv, const std::vector< CuSubMatrix< BaseFloat > *> &out_deriv_batch)
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual void Scale (BaseFloat scale)
 This virtual function, when called on an UpdatableComponent, scales the parameters by "scale". More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 This virtual function, when called on an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void PerturbParams (BaseFloat stddev)
 This function is to be used in testing. More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Computes dot-product between parameters of two instances of a Component. More...
 
virtual int32 NumParameters () const
 Returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
void SetParams (const VectorBase< BaseFloat > &bias, const MatrixBase< BaseFloat > &filter)
 
const CuVector< BaseFloat > & BiasParams () const
 
const CuMatrix< BaseFloat > & LinearParams () const
 
void Init (int32 input_x_dim, int32 input_y_dim, int32 input_z_dim, int32 filt_x_dim, int32 filt_y_dim, int32 filt_x_step, int32 filt_y_step, int32 num_filters, TensorVectorizationType input_vectorization, BaseFloat param_stddev, BaseFloat bias_stddev)
 
void Init (int32 input_x_dim, int32 input_y_dim, int32 input_z_dim, int32 filt_x_dim, int32 filt_y_dim, int32 filt_x_step, int32 filt_y_step, TensorVectorizationType input_vectorization, std::string matrix_filename)
 
void Resize (int32 input_dim, int32 output_dim)
 
void Update (const std::string &debug_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
virtual void SetUnderlyingLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent; it gets multiplied by learning_rate_factor_. More...
 
virtual void SetActualLearningRate (BaseFloat lrate)
 Sets the learning rate directly, bypassing learning_rate_factor_. More...
 
virtual void SetAsGradient ()
 Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...
 
virtual BaseFloat LearningRateFactor ()
 
virtual void SetLearningRateFactor (BaseFloat lrate_factor)
 
void SetUpdatableConfigs (const UpdatableComponent &other)
 
virtual void FreezeNaturalGradient (bool freeze)
 Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More...
 
BaseFloat LearningRate () const
 Gets the learning rate to be used in gradient descent. More...
 
BaseFloat MaxChange () const
 Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More...
 
void SetMaxChange (BaseFloat max_change)
 
BaseFloat L2Regularization () const
 Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More...
 
void SetL2Regularization (BaseFloat a)
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation. More...
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

void InputToInputPatches (const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *patches) const
 
void InderivPatchesToInderiv (const CuMatrix< BaseFloat > &in_deriv_patches, CuMatrixBase< BaseFloat > *in_deriv) const
 
const ConvolutionComponent & operator= (const ConvolutionComponent &other)
 

Private Attributes

int32 input_x_dim_
 
int32 input_y_dim_
 
int32 input_z_dim_
 
int32 filt_x_dim_
 
int32 filt_y_dim_
 
int32 filt_x_step_
 
int32 filt_y_step_
 
TensorVectorizationType input_vectorization_
 
CuMatrix< BaseFloat > filter_params_
 
CuVector< BaseFloat > bias_params_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from UpdatableComponent
void InitLearningRatesFromConfig (ConfigLine *cfl)
 
std::string ReadUpdatableCommon (std::istream &is, bool binary)
 
void WriteUpdatableCommon (std::ostream &is, bool binary) const
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (typically 0.0..0.01) More...
 
BaseFloat learning_rate_factor_
 learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More...
 
BaseFloat l2_regularize_
 L2 regularization constant. More...
 
bool is_gradient_
 True if this component is to be treated as a gradient rather than as parameters. More...
 
BaseFloat max_change_
 configuration value for imposing max-change More...
 

Detailed Description

WARNING, this component is deprecated in favor of TimeHeightConvolutionComponent, and will be deleted.

ConvolutionComponent implements 2D convolution. It applies 3D filters to 3D inputs, but the filters shift over only two dimensions (x and y), because each filter has the same size as the input along the third dimension (z).

Input: a matrix in which each row is a vectorized 3D tensor. The 3D tensor has dimensions
x: (e.g. time)
y: (e.g. frequency)
z: (e.g. channels like features/delta/delta-delta)

The component supports input vectorizations of type zyx and yzx. The default vectorization type is zyx. For input vectorization of type zyx, the input is vectorized by spanning axes z, y and x of the tensor in that order, so z varies fastest. Given a 3D tensor A with sizes (2, 2, 2) along the three dimensions, the zyx-vectorized input looks like A(0,0,0) A(0,0,1) A(0,1,0) A(0,1,1) A(1,0,0) A(1,0,1) A(1,1,0) A(1,1,1)
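
The two orderings can be sketched as flat-index functions. This is a minimal illustration that matches the (2, 2, 2) example above; the helper names ZyxIndex and YzxIndex are hypothetical, and the yzx formula assumes that y varies fastest, then z, then x.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only (not the library's functions).

// Flat index of element (x, y, z) in a zyx-vectorized tensor:
// z varies fastest, then y, then x.
inline int32_t ZyxIndex(int32_t x, int32_t y, int32_t z,
                        int32_t x_dim, int32_t y_dim, int32_t z_dim) {
  assert(x >= 0 && x < x_dim && y >= 0 && y < y_dim && z >= 0 && z < z_dim);
  return (y_dim * z_dim) * x + z_dim * y + z;
}

// Flat index of element (x, y, z) in a yzx-vectorized tensor
// (assumed layout: y varies fastest, then z, then x).
inline int32_t YzxIndex(int32_t x, int32_t y, int32_t z,
                        int32_t x_dim, int32_t y_dim, int32_t z_dim) {
  assert(x >= 0 && x < x_dim && y >= 0 && y < y_dim && z >= 0 && z < z_dim);
  return (y_dim * z_dim) * x + y_dim * z + y;
}
```

With sizes (2, 2, 2), ZyxIndex places A(0,0,1) at position 1 and A(0,1,0) at position 2, which agrees with the listing in the text; under the assumed yzx layout those two positions are swapped.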

Output : The output is also a 3D tensor vectorized in the zyx format. The channel axis (z) in the output corresponds to the output of different filters. The first channel corresponds to the first filter i.e., first row of the filter_params_ matrix.

Note: The component has to support yzx input vectorization because binaries such as add-deltas generate yz-vectorized output. These input vectors are concatenated across time steps using the Append descriptor to form a yzx-vectorized 3D tensor input, e.g. Append(Offset(input, -1), input, Offset(input, 1))

For information on the hyperparameters and parameters of this component see the variable declarations.

Propagation:

The convolution operation consists of dot products between the filter tensor and an input tensor patch, for various shifts of the filter tensor along the x and y axes of the input tensor. (Note: there is no shift along the z-axis, as the filter and input tensor have the same size along this axis.)

For a particular shift (i,j) of the filter tensor along input tensor dimensions x and y, the elements of the input tensor which overlap with the filter form the input tensor patch. This patch is vectorized in zyx format. All the patches corresponding to various samples in the mini-batch are stacked into a matrix, where each row corresponds to one patch. Let this matrix be represented by X_{i,j}. The dot products with various filters are computed simultaneously by computing the matrix product with the filter_params_ matrix (W): Y_{i,j} = X_{i,j}*W^T. Each row of W corresponds to one filter 3D tensor vectorized in zyx format.

All the matrix products corresponding to various shifts (i,j) of the filter tensor are computed simultaneously using the AddMatMatBatched call of CuMatrixBase class.
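
The per-shift product Y_{i,j} = X_{i,j}*W^T can be sketched with plain nested vectors. This is only an illustration of the arithmetic for a single shift; the Matrix alias and the function name PatchTimesFilters are hypothetical, and the component itself performs the products for all shifts as one batched call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // row-major, illustrative only

// Computes Y = X * W^T for one shift (i,j).
// X: num_rows x filter_dim (one vectorized patch per row).
// W: num_filters x filter_dim (one vectorized filter per row).
// Returns Y: num_rows x num_filters.
Matrix PatchTimesFilters(const Matrix &X, const Matrix &W) {
  Matrix Y(X.size(), std::vector<float>(W.size(), 0.0f));
  for (std::size_t r = 0; r < X.size(); r++) {
    for (std::size_t f = 0; f < W.size(); f++) {
      assert(X[r].size() == W[f].size());
      // Dot product between patch r and filter f.
      for (std::size_t k = 0; k < W[f].size(); k++)
        Y[r][f] += X[r][k] * W[f][k];
    }
  }
  return Y;
}
```

Each output row holds one value per filter, so column f of Y corresponds to row f of W.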

Backpropagation:

Backpropagation to compute the input derivative (\nabla X_{i,j}) consists of a series of matrix products: \nabla X_{i,j} = \nabla Y_{i,j}*W, where \nabla Y_{i,j} is the output derivative for a particular shift of the filter.

Once again these matrix products are computed simultaneously.

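Update:

The weight gradient is computed as \nabla W = \sum_{i,j} (\nabla Y_{i,j}^T * X_{i,j}), i.e. the transpose of \sum_{i,j} (X_{i,j}^T * \nabla Y_{i,j}), so that \nabla W has the same shape as W (one filter per row).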
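
The two gradient computations can be sketched in the same style. This assumes the reading that the input derivative for one shift is the output derivative times W, and that the weight gradient sums, over all shifts, the transposed output derivative times the patch matrix; the Matrix alias and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // row-major, illustrative only

// dX = dY * W for one shift (i,j).
// dY: num_rows x num_filters, W: num_filters x filter_dim.
// Returns dX: num_rows x filter_dim.
Matrix InputDerivPatch(const Matrix &dY, const Matrix &W) {
  assert(!W.empty());
  const std::size_t filter_dim = W[0].size();
  Matrix dX(dY.size(), std::vector<float>(filter_dim, 0.0f));
  for (std::size_t r = 0; r < dY.size(); r++) {
    assert(dY[r].size() == W.size());
    for (std::size_t f = 0; f < W.size(); f++)
      for (std::size_t k = 0; k < filter_dim; k++)
        dX[r][k] += dY[r][f] * W[f][k];
  }
  return dX;
}

// dW = sum over shifts s of dY[s]^T * X[s].
// X[s]: num_rows x filter_dim, dY[s]: num_rows x num_filters.
// Returns dW: num_filters x filter_dim (same shape as W).
Matrix FilterGrad(const std::vector<Matrix> &X, const std::vector<Matrix> &dY,
                  std::size_t num_filters, std::size_t filter_dim) {
  assert(X.size() == dY.size());
  Matrix dW(num_filters, std::vector<float>(filter_dim, 0.0f));
  for (std::size_t s = 0; s < X.size(); s++) {       // loop over shifts
    assert(X[s].size() == dY[s].size());
    for (std::size_t r = 0; r < X[s].size(); r++)    // loop over rows
      for (std::size_t f = 0; f < num_filters; f++)
        for (std::size_t k = 0; k < filter_dim; k++)
          dW[f][k] += dY[s][r][f] * X[s][r][k];
  }
  return dW;
}
```

Returning the weight gradient with one filter per row keeps it directly addable to the filter parameter matrix.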

Definition at line 114 of file nnet-combined-component.h.

Member Enumeration Documentation

◆ TensorVectorizationType

Constructor & Destructor Documentation

◆ ConvolutionComponent() [1/3]

◆ ConvolutionComponent() [2/3]

Definition at line 41 of file nnet-combined-component.cc.

42  :
43  UpdatableComponent(component),
44  input_x_dim_(component.input_x_dim_),
45  input_y_dim_(component.input_y_dim_),
46  input_z_dim_(component.input_z_dim_),
47  filt_x_dim_(component.filt_x_dim_),
48  filt_y_dim_(component.filt_y_dim_),
49  filt_x_step_(component.filt_x_step_),
50  filt_y_step_(component.filt_y_step_),
51  input_vectorization_(component.input_vectorization_),
52  filter_params_(component.filter_params_),
53  bias_params_(component.bias_params_) { }

◆ ConvolutionComponent() [3/3]

ConvolutionComponent ( const CuMatrixBase< BaseFloat > &  filter_params,
const CuVectorBase< BaseFloat > &  bias_params,
int32  input_x_dim,
int32  input_y_dim,
int32  input_z_dim,
int32  filt_x_dim,
int32  filt_y_dim,
int32  filt_x_step,
int32  filt_y_step,
TensorVectorizationType  input_vectorization,
BaseFloat  learning_rate 
)

Definition at line 55 of file nnet-combined-component.cc.

References CuVectorBase< Real >::Dim(), UpdatableComponent::is_gradient_, KALDI_ASSERT, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), and UpdatableComponent::SetUnderlyingLearningRate().

62  :
63  input_x_dim_(input_x_dim),
64  input_y_dim_(input_y_dim),
65  input_z_dim_(input_z_dim),
66  filt_x_dim_(filt_x_dim),
67  filt_y_dim_(filt_y_dim),
68  filt_x_step_(filt_x_step),
69  filt_y_step_(filt_y_step),
70  input_vectorization_(input_vectorization),
71  filter_params_(filter_params),
72  bias_params_(bias_params){
73  KALDI_ASSERT(filter_params.NumRows() == bias_params.Dim() &&
74  bias_params.Dim() != 0);
75  KALDI_ASSERT(filter_params.NumCols() == filt_x_dim * filt_y_dim * input_z_dim);
76  SetUnderlyingLearningRate(learning_rate);
77  is_gradient_ = false;
78 }

Member Function Documentation

◆ Add()

void Add ( BaseFloat  alpha,
const Component &  other
)
virtual

This virtual function, when called on
- an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters;
- a NonlinearComponent (or another component that stores stats, like BatchNormComponent), relates to adding stats.

Otherwise it will normally do nothing.

Reimplemented from Component.

Definition at line 348 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filter_params_, and KALDI_ASSERT.

Referenced by ConvolutionComponent::Properties(), and LstmNonlinearityComponent::Properties().

348  {
349  const ConvolutionComponent *other =
350  dynamic_cast<const ConvolutionComponent*>(&other_in);
351  KALDI_ASSERT(other != NULL);
352  filter_params_.AddMat(alpha, other->filter_params_);
353  bias_params_.AddVec(alpha, other->bias_params_);
354 }

◆ Backprop()

void Backprop ( const std::string &  debug_info,
const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
void *  memo,
Component *  to_update,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in]  debug_info  The component name, to be printed out in any warning messages.
[in]  indexes  A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in]  in_value  The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in]  out_value  The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in]  out_deriv  The derivative at the output of this component.
[in]  memo  This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out]  to_update  If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out]  in_deriv  The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 443 of file nnet-combined-component.cc.

References CuMatrixBase< Real >::ColRange(), ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::InderivPatchesToInderiv(), ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, KALDI_ASSERT, kaldi::kNoTrans, kaldi::kSetZero, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), NVTX_RANGE, and ConvolutionComponent::Update().

Referenced by ConvolutionComponent::Properties(), LstmNonlinearityComponent::Properties(), and MaxpoolingComponent::Properties().

450  {
451  NVTX_RANGE("ConvolutionComponent::Backprop");
452  ConvolutionComponent *to_update =
453  dynamic_cast<ConvolutionComponent*>(to_update_in);
454  const int32 num_x_steps = (1 + (input_x_dim_ - filt_x_dim_) / filt_x_step_),
455  num_y_steps = (1 + (input_y_dim_ - filt_y_dim_) / filt_y_step_),
456  num_filters = filter_params_.NumRows(),
457  num_frames = out_deriv.NumRows(),
458  filter_dim = filter_params_.NumCols();
459 
460  KALDI_ASSERT(out_deriv.NumRows() == num_frames &&
461  out_deriv.NumCols() ==
462  (num_filters * num_x_steps * num_y_steps));
463 
464  // Compute inderiv patches
465  CuMatrix<BaseFloat> in_deriv_patches(num_frames,
466  num_x_steps * num_y_steps * filter_dim,
467  kSetZero);
468 
469  std::vector<CuSubMatrix<BaseFloat>* > patch_deriv_batch, out_deriv_batch,
470  filter_params_batch;
471  CuSubMatrix<BaseFloat>* filter_params_elem = new CuSubMatrix<BaseFloat>(
472  filter_params_, 0, filter_params_.NumRows(), 0, filter_params_.NumCols());
473 
474  for (int32 x_step = 0; x_step < num_x_steps; x_step++) {
475  for (int32 y_step = 0; y_step < num_y_steps; y_step++) {
476  int32 patch_number = x_step * num_y_steps + y_step;
477 
478  patch_deriv_batch.push_back(new CuSubMatrix<BaseFloat>(
479  in_deriv_patches.ColRange(
480  patch_number * filter_dim, filter_dim)));
481  out_deriv_batch.push_back(new CuSubMatrix<BaseFloat>(out_deriv.ColRange(
482  patch_number * num_filters, num_filters)));
483  filter_params_batch.push_back(filter_params_elem);
484  }
485  }
486  AddMatMatBatched<BaseFloat>(1.0, patch_deriv_batch,
487  out_deriv_batch, kNoTrans,
488  filter_params_batch, kNoTrans, 0.0);
489 
490  if (in_deriv) {
491  // combine the derivatives from the individual input deriv patches
492  // to compute input deriv matrix
493  InderivPatchesToInderiv(in_deriv_patches, in_deriv);
494  }
495 
496  if (to_update != NULL) {
497  to_update->Update(debug_info, in_value, out_deriv, out_deriv_batch);
498  }
499 
500  // release memory
501  delete filter_params_elem;
502  for (int32 p = 0; p < patch_deriv_batch.size(); p++) {
503  delete patch_deriv_batch[p];
504  delete out_deriv_batch[p];
505  }
506 }
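
The index arithmetic at the top of this listing can be isolated as small helpers. This is a sketch of the same formulas the listing uses for the number of filter shifts, the expected number of output columns, and the column offset of one patch block; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of filter positions along one axis (integer division, as in the listing).
inline int32_t NumSteps(int32_t input_dim, int32_t filt_dim, int32_t filt_step) {
  assert(filt_step > 0 && filt_dim > 0 && filt_dim <= input_dim);
  return 1 + (input_dim - filt_dim) / filt_step;
}

// Number of output columns: one block of num_filters values per patch.
inline int32_t OutputDim(int32_t num_filters, int32_t num_x_steps,
                         int32_t num_y_steps) {
  return num_filters * num_x_steps * num_y_steps;
}

// First column of the block belonging to shift (x_step, y_step).
// block_dim is num_filters for the output derivative, or filter_dim
// for the matrix of input-derivative patches.
inline int32_t PatchColumnOffset(int32_t x_step, int32_t y_step,
                                 int32_t num_y_steps, int32_t block_dim) {
  const int32_t patch_number = x_step * num_y_steps + y_step;
  return patch_number * block_dim;
}
```

For instance, an input of size 10 with a filter of size 3 and step 2 gives 1 + 7 / 2 = 4 positions, because the division truncates.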

◆ BiasParams()

const CuVector<BaseFloat>& BiasParams ( ) const
inline

Definition at line 179 of file nnet-combined-component.h.

References ConvolutionComponent::bias_params_.

179 { return bias_params_; }

◆ Copy()

Component * Copy ( ) const
virtual

Copies component (deep copy).

Implements Component.

Definition at line 654 of file nnet-combined-component.cc.

References ConvolutionComponent::ConvolutionComponent().

Referenced by ConvolutionComponent::Properties(), and LstmNonlinearityComponent::Properties().

654  {
655  ConvolutionComponent *ans = new ConvolutionComponent(*this);
656  return ans;
657 }

◆ DotProduct()

BaseFloat DotProduct ( const UpdatableComponent &  other ) const
virtual

Computes dot-product between parameters of two instances of a Component.

Can be used for computing parameter-norm of an UpdatableComponent.

Implements UpdatableComponent.

Definition at line 647 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filter_params_, kaldi::kTrans, kaldi::TraceMatMat(), and kaldi::VecVec().

Referenced by ConvolutionComponent::Properties(), and LstmNonlinearityComponent::Properties().

647  {
648  const ConvolutionComponent *other =
649  dynamic_cast<const ConvolutionComponent*>(&other_in);
650  return TraceMatMat(filter_params_, other->filter_params_, kTrans)
651  + VecVec(bias_params_, other->bias_params_);
652 }

◆ InderivPatchesToInderiv()

void InderivPatchesToInderiv ( const CuMatrix< BaseFloat > &  in_deriv_patches,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
private

Definition at line 387 of file nnet-combined-component.cc.

References CuMatrixBase< Real >::AddCols(), ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_vectorization_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::input_z_dim_, KALDI_ASSERT, ConvolutionComponent::kYzx, ConvolutionComponent::kZyx, CuMatrixBase< Real >::NumCols(), kaldi::nnet3::RearrangeIndexes(), kaldi::nnet3::YzxVectorIndex(), and kaldi::nnet3::ZyxVectorIndex().

Referenced by ConvolutionComponent::Backprop(), and MaxpoolingComponent::Copy().

389  {
390 
391  const int32 num_x_steps = (1 + (input_x_dim_ - filt_x_dim_) / filt_x_step_),
392  num_y_steps = (1 + (input_y_dim_ - filt_y_dim_) / filt_y_step_),
393  filt_x_step = filt_x_step_,
394  filt_y_step = filt_y_step_,
395  filt_x_dim = filt_x_dim_,
396  filt_y_dim = filt_y_dim_,
397  input_x_dim = input_x_dim_,
398  input_y_dim = input_y_dim_,
399  input_z_dim = input_z_dim_,
400  filter_dim = filter_params_.NumCols();
401 
402  // Compute the reverse column_map from the matrix with input
403  // derivative patches to input derivative matrix
404  std::vector<std::vector<int32> > reverse_column_map(in_deriv->NumCols());
405  int32 rev_col_map_size = reverse_column_map.size();
406  for (int32 x_step = 0; x_step < num_x_steps; x_step++) {
407  for (int32 y_step = 0; y_step < num_y_steps; y_step++) {
408  int32 patch_number = x_step * num_y_steps + y_step;
409  int32 patch_start_index = patch_number * filter_dim;
410  for (int32 x = 0, index = patch_start_index; x < filt_x_dim; x++) {
411  for (int32 y = 0; y < filt_y_dim; y++) {
412  for (int32 z = 0; z < input_z_dim; z++, index++) {
413  int32 vector_index;
414  if (input_vectorization_ == kZyx) {
415  vector_index = ZyxVectorIndex(x_step * filt_x_step + x,
416  y_step * filt_y_step + y, z,
417  input_x_dim, input_y_dim,
418  input_z_dim);
419  } else {
421  vector_index = YzxVectorIndex(x_step * filt_x_step + x,
422  y_step * filt_y_step + y, z,
423  input_x_dim, input_y_dim,
424  input_z_dim);
425  }
426  KALDI_ASSERT(vector_index < rev_col_map_size);
427  reverse_column_map[vector_index].push_back(index);
428  }
429  }
430  }
431  }
432  }
433  std::vector<std::vector<int32> > rearranged_column_map;
434  RearrangeIndexes(reverse_column_map, &rearranged_column_map);
435  for (int32 p = 0; p < rearranged_column_map.size(); p++) {
436  CuArray<int32> cu_cols(rearranged_column_map[p]);
437  in_deriv->AddCols(in_deriv_patches, cu_cols);
438  }
439 }
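
The effect of the reverse column map is that every input element receives the sum of the derivatives from all patches that overlap it. A one-dimensional simplification makes this easier to see; it is a sketch under that simplifying assumption (a single axis, no z dimension), and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums per-patch derivatives back into a 1-D input derivative.
// patches[p][k] is the derivative w.r.t. input element p * step + k.
std::vector<float> PatchesToInputDeriv1D(
    const std::vector<std::vector<float>> &patches,
    std::size_t input_dim, std::size_t filt_dim, std::size_t step) {
  assert(step > 0 && filt_dim > 0 && filt_dim <= input_dim);
  const std::size_t num_steps = 1 + (input_dim - filt_dim) / step;
  assert(patches.size() == num_steps);
  std::vector<float> in_deriv(input_dim, 0.0f);
  for (std::size_t p = 0; p < num_steps; p++) {
    assert(patches[p].size() == filt_dim);
    // Overlapping patches contribute to the same input element.
    for (std::size_t k = 0; k < filt_dim; k++)
      in_deriv[p * step + k] += patches[p][k];
  }
  return in_deriv;
}
```

With an input of size 4, a filter of size 2 and step 1, the two interior elements are each covered by two patches, so they accumulate two contributions while the edge elements accumulate one.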

◆ Info()

std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from UpdatableComponent.

Definition at line 147 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, UpdatableComponent::Info(), ConvolutionComponent::input_vectorization_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::input_z_dim_, and kaldi::nnet3::PrintParameterStats().

Referenced by MaxpoolingComponent::MaxpoolingComponent().

147  {
148  std::ostringstream stream;
149  stream << UpdatableComponent::Info()
150  << ", input-x-dim=" << input_x_dim_
151  << ", input-y-dim=" << input_y_dim_
152  << ", input-z-dim=" << input_z_dim_
153  << ", filt-x-dim=" << filt_x_dim_
154  << ", filt-y-dim=" << filt_y_dim_
155  << ", filt-x-step=" << filt_x_step_
156  << ", filt-y-step=" << filt_y_step_
157  << ", input-vectorization=" << input_vectorization_
158  << ", num-filters=" << filter_params_.NumRows();
159  PrintParameterStats(stream, "filter-params", filter_params_);
160  PrintParameterStats(stream, "bias-params", bias_params_, true);
161  return stream.str();
162 }

◆ Init() [1/2]

void Init ( int32  input_x_dim,
int32  input_y_dim,
int32  input_z_dim,
int32  filt_x_dim,
int32  filt_y_dim,
int32  filt_x_step,
int32  filt_y_step,
int32  num_filters,
TensorVectorizationType  input_vectorization,
BaseFloat  param_stddev,
BaseFloat  bias_stddev 
)

Definition at line 94 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_vectorization_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::input_z_dim_, and KALDI_ASSERT.

Referenced by ConvolutionComponent::InitFromConfig(), ConvolutionComponent::LinearParams(), and LstmNonlinearityComponent::Properties().

99  {
100  input_x_dim_ = input_x_dim;
101  input_y_dim_ = input_y_dim;
102  input_z_dim_ = input_z_dim;
103  filt_x_dim_ = filt_x_dim;
104  filt_y_dim_ = filt_y_dim;
105  filt_x_step_ = filt_x_step;
106  filt_y_step_ = filt_y_step;
107  input_vectorization_ = input_vectorization;
110  int32 filter_dim = filt_x_dim_ * filt_y_dim_ * input_z_dim_;
111  filter_params_.Resize(num_filters, filter_dim);
112  bias_params_.Resize(num_filters);
113  KALDI_ASSERT(param_stddev >= 0.0 && bias_stddev >= 0.0);
114  filter_params_.SetRandn();
115  filter_params_.Scale(param_stddev);
116  bias_params_.SetRandn();
117  bias_params_.Scale(bias_stddev);
118 }
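
The initialization in this listing amounts to drawing standard-normal values and scaling them by the requested standard deviation. The sketch below reproduces that idea with the C++ standard library instead of the library's matrix classes; the function name, the nested-vector storage and the explicit seed parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Returns a rows x cols matrix of N(0, 1) samples scaled by stddev.
// The seed parameter is an illustrative addition for reproducibility.
std::vector<std::vector<float>> RandnScaled(std::size_t rows, std::size_t cols,
                                            float stddev, unsigned seed) {
  assert(stddev >= 0.0f);
  std::mt19937 rng(seed);
  std::normal_distribution<float> gauss(0.0f, 1.0f);
  std::vector<std::vector<float>> m(rows, std::vector<float>(cols));
  for (auto &row : m)
    for (float &v : row)
      v = stddev * gauss(rng);
  return m;
}
```

A filter matrix in this scheme would have num_filters rows and filt_x_dim * filt_y_dim * input_z_dim columns, matching the Resize call in the listing.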

◆ Init() [2/2]

void Init ( int32  input_x_dim,
int32  input_y_dim,
int32  input_z_dim,
int32  filt_x_dim,
int32  filt_y_dim,
int32  filt_x_step,
int32  filt_y_step,
TensorVectorizationType  input_vectorization,
std::string  matrix_filename 
)

Definition at line 121 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_vectorization_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::input_z_dim_, KALDI_ASSERT, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), CuMatrixBase< Real >::Range(), and kaldi::ReadKaldiObject().

126  {
127  input_x_dim_ = input_x_dim;
128  input_y_dim_ = input_y_dim;
129  input_z_dim_ = input_z_dim;
130  filt_x_dim_ = filt_x_dim;
131  filt_y_dim_ = filt_y_dim;
132  filt_x_step_ = filt_x_step;
133  filt_y_step_ = filt_y_step;
134  input_vectorization_ = input_vectorization;
135  CuMatrix<BaseFloat> mat;
136  ReadKaldiObject(matrix_filename, &mat);
137  int32 filter_dim = (filt_x_dim_ * filt_y_dim_ * input_z_dim_);
138  int32 num_filters = mat.NumRows();
139  KALDI_ASSERT(mat.NumCols() == (filter_dim + 1));
140  filter_params_.Resize(num_filters, filter_dim);
141  bias_params_.Resize(num_filters);
142  filter_params_.CopyFromMat(mat.Range(0, num_filters, 0, filter_dim));
143  bias_params_.CopyColFromMat(mat, filter_dim);
144 }
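
The listing implies a file layout in which each row of the matrix holds one vectorized filter followed by that filter's bias in the last column. A minimal sketch of the split, using nested vectors rather than the library's matrix classes (the struct and function names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // row-major, illustrative only

struct FilterAndBias {
  Matrix filters;           // num_filters x filter_dim
  std::vector<float> bias;  // num_filters
};

// Splits a num_filters x (filter_dim + 1) matrix into filters and bias.
FilterAndBias SplitParameterMatrix(const Matrix &mat, std::size_t filter_dim) {
  FilterAndBias out;
  for (const auto &row : mat) {
    assert(row.size() == filter_dim + 1);
    // The first filter_dim columns are the filter; the last one is the bias.
    out.filters.emplace_back(
        row.begin(), row.begin() + static_cast<std::ptrdiff_t>(filter_dim));
    out.bias.push_back(row[filter_dim]);
  }
  return out;
}
```

The number of filters is therefore determined by the number of rows in the file, which is why this overload of Init takes no num_filters argument.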

◆ InitFromConfig()

void InitFromConfig ( ConfigLine *  cfl )
virtual

Initialize, from a ConfigLine object.

Parameters
[in]  cfl  A ConfigLine containing any parameters that are needed for initialization. For example: "dim=100 param-stddev=0.1"

Implements Component.

Definition at line 165 of file nnet-combined-component.cc.

References ConfigLine::GetValue(), ConfigLine::HasUnusedValues(), ConvolutionComponent::Init(), UpdatableComponent::InitLearningRatesFromConfig(), KALDI_ERR, ConvolutionComponent::kYzx, ConvolutionComponent::kZyx, ConfigLine::UnusedValues(), and ConfigLine::WholeLine().

Referenced by MaxpoolingComponent::MaxpoolingComponent().

165  {
166  bool ok = true;
167  std::string matrix_filename;
168  int32 input_x_dim = -1, input_y_dim = -1, input_z_dim = -1,
169  filt_x_dim = -1, filt_y_dim = -1,
170  filt_x_step = -1, filt_y_step = -1,
171  num_filters = -1;
172  std::string input_vectorization_order = "zyx";
173  InitLearningRatesFromConfig(cfl);
174  ok = ok && cfl->GetValue("input-x-dim", &input_x_dim);
175  ok = ok && cfl->GetValue("input-y-dim", &input_y_dim);
176  ok = ok && cfl->GetValue("input-z-dim", &input_z_dim);
177  ok = ok && cfl->GetValue("filt-x-dim", &filt_x_dim);
178  ok = ok && cfl->GetValue("filt-y-dim", &filt_y_dim);
179  ok = ok && cfl->GetValue("filt-x-step", &filt_x_step);
180  ok = ok && cfl->GetValue("filt-y-step", &filt_y_step);
181 
182  if (!ok)
183  KALDI_ERR << "Bad initializer " << cfl->WholeLine();
184  // optional argument
185  TensorVectorizationType input_vectorization;
186  cfl->GetValue("input-vectorization-order", &input_vectorization_order);
187  if (input_vectorization_order.compare("zyx") == 0) {
188  input_vectorization = kZyx;
189  } else if (input_vectorization_order.compare("yzx") == 0) {
190  input_vectorization = kYzx;
191  } else {
192  KALDI_ERR << "Unknown or unsupported input vectorization order "
193  << input_vectorization_order
194  << " accepted candidates are 'yzx' and 'zyx'";
195  }
196 
197  if (cfl->GetValue("matrix", &matrix_filename)) {
198  // initialize from predefined parameter matrix
199  Init(input_x_dim, input_y_dim, input_z_dim,
200  filt_x_dim, filt_y_dim,
201  filt_x_step, filt_y_step,
202  input_vectorization,
203  matrix_filename);
204  } else {
205  ok = ok && cfl->GetValue("num-filters", &num_filters);
206  if (!ok)
207  KALDI_ERR << "Bad initializer " << cfl->WholeLine();
208  // initialize from configuration
209  int32 filter_input_dim = filt_x_dim * filt_y_dim * input_z_dim;
210  BaseFloat param_stddev = 1.0 / std::sqrt(filter_input_dim), bias_stddev = 1.0;
211  cfl->GetValue("param-stddev", &param_stddev);
212  cfl->GetValue("bias-stddev", &bias_stddev);
213  Init(input_x_dim, input_y_dim, input_z_dim,
214  filt_x_dim, filt_y_dim, filt_x_step, filt_y_step, num_filters,
215  input_vectorization, param_stddev, bias_stddev);
216  }
217  if (cfl->HasUnusedValues())
218  KALDI_ERR << "Could not process these elements in initializer: "
219  << cfl->UnusedValues();
220  if (!ok)
221  KALDI_ERR << "Bad initializer " << cfl->WholeLine();
222 }
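
When no "matrix" file is given, the listing above falls back to a default param-stddev of 1/sqrt(filt_x_dim * filt_y_dim * input_z_dim) unless the config overrides it. The sketch below illustrates only that fallback. It assumes a simplified "key=value" line format; ParseKeyValues and ParamStddev are hypothetical helpers and do not reproduce the real ConfigLine API:

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <sstream>
#include <string>

// Hypothetical parser: splits "key=value key=value ..." into a map.
std::map<std::string, std::string> ParseKeyValues(const std::string &line) {
  std::map<std::string, std::string> kv;
  std::istringstream is(line);
  std::string tok;
  while (is >> tok) {
    std::string::size_type eq = tok.find('=');
    if (eq != std::string::npos) kv[tok.substr(0, eq)] = tok.substr(eq + 1);
  }
  return kv;
}

// Returns the value of param-stddev if present, otherwise the default
// 1 / sqrt(filt_x_dim * filt_y_dim * input_z_dim) used in the listing.
// Assumes the three dimension keys are present and hold integers.
float ParamStddev(const std::map<std::string, std::string> &kv) {
  auto it = kv.find("param-stddev");
  if (it != kv.end()) return std::stof(it->second);
  const int filter_input_dim = std::stoi(kv.at("filt-x-dim")) *
                               std::stoi(kv.at("filt-y-dim")) *
                               std::stoi(kv.at("input-z-dim"));
  return 1.0f / std::sqrt(static_cast<float>(filter_input_dim));
}
```

With filt-x-dim=2, filt-y-dim=2 and input-z-dim=4, for instance, the default works out to 1/sqrt(16) = 0.25.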

◆ InputDim()

◆ InputToInputPatches()

void InputToInputPatches ( const CuMatrixBase< BaseFloat > &  in,
CuMatrix< BaseFloat > *  patches 
) const
private

Definition at line 245 of file nnet-combined-component.cc.

References CuMatrixBase< Real >::CopyCols(), ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_vectorization_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::input_z_dim_, KALDI_ASSERT, ConvolutionComponent::kYzx, ConvolutionComponent::kZyx, CuMatrixBase< Real >::NumCols(), kaldi::nnet3::YzxVectorIndex(), and kaldi::nnet3::ZyxVectorIndex().

Referenced by MaxpoolingComponent::Copy(), ConvolutionComponent::Propagate(), and ConvolutionComponent::Update().

247  {
248  int32 num_x_steps = (1 + (input_x_dim_ - filt_x_dim_) / filt_x_step_);
249  int32 num_y_steps = (1 + (input_y_dim_ - filt_y_dim_) / filt_y_step_);
250  const int32 filt_x_step = filt_x_step_,
251  filt_y_step = filt_y_step_,
252  filt_x_dim = filt_x_dim_,
253  filt_y_dim = filt_y_dim_,
254  input_x_dim = input_x_dim_,
255  input_y_dim = input_y_dim_,
256  input_z_dim = input_z_dim_,
257  filter_dim = filter_params_.NumCols();
258 
259  std::vector<int32> column_map(patches->NumCols());
260  int32 column_map_size = column_map.size();
261  for (int32 x_step = 0; x_step < num_x_steps; x_step++) {
262  for (int32 y_step = 0; y_step < num_y_steps; y_step++) {
263  int32 patch_number = x_step * num_y_steps + y_step;
264  int32 patch_start_index = patch_number * filter_dim;
265  for (int32 x = 0, index = patch_start_index; x < filt_x_dim; x++) {
266  for (int32 y = 0; y < filt_y_dim; y++) {
267  for (int32 z = 0; z < input_z_dim; z++, index++) {
268  KALDI_ASSERT(index < column_map_size);
269  if (input_vectorization_ == kZyx) {
270  column_map[index] = ZyxVectorIndex(x_step * filt_x_step + x,
271  y_step * filt_y_step + y, z,
272  input_x_dim, input_y_dim,
273  input_z_dim);
274  } else if (input_vectorization_ == kYzx) {
275  column_map[index] = YzxVectorIndex(x_step * filt_x_step + x,
276  y_step * filt_y_step + y, z,
277  input_x_dim, input_y_dim,
278  input_z_dim);
279  }
280  }
281  }
282  }
283  }
284  }
285  CuArray<int32> cu_cols(column_map);
286  patches->CopyCols(in, cu_cols);
287 }
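
The column map built above can be reproduced on the CPU with plain integers. In the sketch below, the index formulas in ZyxIndex and YzxIndex are assumptions about how the two vectorization orders lay out a 3-D tensor (z fastest and y fastest, respectively, with x always slowest); the real ZyxVectorIndex and YzxVectorIndex are defined elsewhere in Kaldi and should be checked before relying on this. BuildColumnMap is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout for kZyx: z varies fastest, then y, then x.
int ZyxIndex(int x, int y, int z, int y_dim, int z_dim) {
  return (y_dim * z_dim) * x + z_dim * y + z;
}

// Assumed layout for kYzx: y varies fastest, then z, then x.
int YzxIndex(int x, int y, int z, int y_dim, int z_dim) {
  return (y_dim * z_dim) * x + y_dim * z + y;
}

// Builds the map from patch-matrix column to input column, using the
// same loop order as the listing above.
std::vector<int> BuildColumnMap(int input_x_dim, int input_y_dim,
                                int input_z_dim, int filt_x_dim,
                                int filt_y_dim, int filt_x_step,
                                int filt_y_step, bool zyx) {
  const int num_x_steps = 1 + (input_x_dim - filt_x_dim) / filt_x_step;
  const int num_y_steps = 1 + (input_y_dim - filt_y_dim) / filt_y_step;
  std::vector<int> column_map;
  column_map.reserve(static_cast<std::size_t>(
      num_x_steps * num_y_steps * filt_x_dim * filt_y_dim * input_z_dim));
  for (int x_step = 0; x_step < num_x_steps; x_step++) {
    for (int y_step = 0; y_step < num_y_steps; y_step++) {
      for (int x = 0; x < filt_x_dim; x++) {
        for (int y = 0; y < filt_y_dim; y++) {
          for (int z = 0; z < input_z_dim; z++) {
            const int gx = x_step * filt_x_step + x;
            const int gy = y_step * filt_y_step + y;
            column_map.push_back(
                zyx ? ZyxIndex(gx, gy, z, input_y_dim, input_z_dim)
                    : YzxIndex(gx, gy, z, input_y_dim, input_z_dim));
          }
        }
      }
    }
  }
  return column_map;
}
```

Each consecutive run of filt_x_dim * filt_y_dim * input_z_dim entries describes one patch, so copying input columns through this map lays the patches side by side in one wide matrix.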

◆ LinearParams()

◆ NumParameters()

int32 NumParameters ( ) const
virtual

Returns the total dimension of the parameters in this class, i.e. the number of filter parameters plus the number of bias parameters.

Reimplemented from UpdatableComponent.

Definition at line 676 of file nnet-combined-component.cc.

References ConvolutionComponent::filter_params_.

Referenced by ConvolutionComponent::Properties(), LstmNonlinearityComponent::Properties(), ConvolutionComponent::UnVectorize(), and ConvolutionComponent::Vectorize().

676  {
677  return (filter_params_.NumCols() + 1) * filter_params_.NumRows();
678 }

◆ operator=()

const ConvolutionComponent& operator= ( const ConvolutionComponent &  other)
private

◆ OutputDim()

int32 OutputDim ( ) const
virtual

Returns output-dimension of this component.

Implements Component.

Definition at line 86 of file nnet-combined-component.cc.

References ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_x_dim_, and ConvolutionComponent::input_y_dim_.

Referenced by MaxpoolingComponent::MaxpoolingComponent().
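
The output dimension can be worked out from the members referenced above. The sketch below assumes it equals the number of filters times the number of filter positions along x and y, which is the column count that Propagate() asserts for its output matrix. NumSteps and ConvOutputDim are hypothetical helper names:

```cpp
#include <cassert>

// Number of filter positions along one axis (integer division, so any
// remainder at the end of the axis is not covered by a filter).
int NumSteps(int input_dim, int filt_dim, int filt_step) {
  assert(filt_step > 0 && filt_dim <= input_dim);
  return 1 + (input_dim - filt_dim) / filt_step;
}

// Assumed output dimension: one value per filter per filter position.
int ConvOutputDim(int input_x_dim, int input_y_dim, int filt_x_dim,
                  int filt_y_dim, int filt_x_step, int filt_y_step,
                  int num_filters) {
  return NumSteps(input_x_dim, filt_x_dim, filt_x_step) *
         NumSteps(input_y_dim, filt_y_dim, filt_y_step) * num_filters;
}
```

For a 10 x 8 input, a 3 x 3 filter, steps of 2 and 1, and 4 filters, this gives 4 * 6 * 4 = 96.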

◆ PerturbParams()

void PerturbParams ( BaseFloat  stddev)
virtual

This function is to be used in testing.

It adds zero-mean, unit-variance Gaussian noise, scaled by "stddev", to the filter and bias parameters of the component.

Implements UpdatableComponent.

Definition at line 659 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filter_params_, CuVectorBase< Real >::SetRandn(), and CuMatrixBase< Real >::SetRandn().

Referenced by ConvolutionComponent::Properties(), and LstmNonlinearityComponent::Properties().

659  {
660  CuMatrix<BaseFloat> temp_filter_params(filter_params_);
661  temp_filter_params.SetRandn();
662  filter_params_.AddMat(stddev, temp_filter_params);
663 
664  CuVector<BaseFloat> temp_bias_params(bias_params_);
665  temp_bias_params.SetRandn();
666  bias_params_.AddVec(stddev, temp_bias_params);
667 }
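
The same perturbation can be sketched with the standard library, assuming std::normal_distribution as a stand-in for SetRandn() and a flat std::vector for the parameters. The free function below and its rng argument are illustrative assumptions, not the Kaldi interface:

```cpp
#include <cassert>
#include <random>
#include <vector>

// Adds stddev * N(0, 1) noise to every element of params, in place.
void PerturbParams(float stddev, std::vector<float> *params,
                   std::mt19937 *rng) {
  assert(params != nullptr && rng != nullptr);
  std::normal_distribution<float> unit_noise(0.0f, 1.0f);
  for (float &p : *params) p += stddev * unit_noise(*rng);
}
```

Passing the generator explicitly keeps the perturbation reproducible in tests, which is the intended use of this function.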

◆ Propagate()

void * Propagate ( const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in: The input to this component. Num-columns == InputDim().
[out] out: The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 292 of file nnet-combined-component.cc.

References CuMatrixBase< Real >::AddVecToRows(), ConvolutionComponent::bias_params_, CuMatrixBase< Real >::ColRange(), ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::InputToInputPatches(), KALDI_ASSERT, kaldi::kNoTrans, kaldi::kTrans, kaldi::kUndefined, CuMatrixBase< Real >::NumCols(), and CuMatrixBase< Real >::NumRows().

Referenced by ConvolutionComponent::Properties(), LstmNonlinearityComponent::Properties(), and MaxpoolingComponent::Properties().

294  {
295  const int32 num_x_steps = (1 + (input_x_dim_ - filt_x_dim_) / filt_x_step_),
296  num_y_steps = (1 + (input_y_dim_ - filt_y_dim_) / filt_y_step_),
297  num_filters = filter_params_.NumRows(),
298  num_frames = in.NumRows(),
299  filter_dim = filter_params_.NumCols();
300  KALDI_ASSERT((*out).NumRows() == num_frames &&
301  (*out).NumCols() == (num_filters * num_x_steps * num_y_steps));
302 
303  CuMatrix<BaseFloat> patches(num_frames,
304  num_x_steps * num_y_steps * filter_dim,
305  kUndefined);
306  InputToInputPatches(in, &patches);
307  CuSubMatrix<BaseFloat>* filter_params_elem = new CuSubMatrix<BaseFloat>(
308  filter_params_, 0, filter_params_.NumRows(), 0, filter_params_.NumCols());
309  std::vector<CuSubMatrix<BaseFloat>* > tgt_batch, patch_batch,
310  filter_params_batch;
311 
312  for (int32 x_step = 0; x_step < num_x_steps; x_step++) {
313  for (int32 y_step = 0; y_step < num_y_steps; y_step++) {
314  int32 patch_number = x_step * num_y_steps + y_step;
315  tgt_batch.push_back(new CuSubMatrix<BaseFloat>(
316  out->ColRange(patch_number * num_filters, num_filters)));
317  patch_batch.push_back(new CuSubMatrix<BaseFloat>(
318  patches.ColRange(patch_number * filter_dim, filter_dim)));
319  filter_params_batch.push_back(filter_params_elem);
320  tgt_batch[patch_number]->AddVecToRows(1.0, bias_params_, 1.0); // add bias
321  }
322  }
323  // apply all filters
324  AddMatMatBatched<BaseFloat>(1.0, tgt_batch, patch_batch,
325  kNoTrans, filter_params_batch,
326  kTrans, 1.0);
327  // release memory
328  delete filter_params_elem;
329  for (int32 p = 0; p < tgt_batch.size(); p++) {
330  delete tgt_batch[p];
331  delete patch_batch[p];
332  }
333  return NULL;
334 }
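
Per patch, the batched call above computes the patch block times the transposed filter matrix and adds the bias. The sketch below spells that out with plain loops. It assumes row-major std::vector matrices and an output that starts at zero, whereas the listing adds into the existing contents of "out"; Matrix and ConvPropagate are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // row-major, hypothetical

// patches: num_frames x (num_patches * filter_dim)
// filters: num_filters x filter_dim;  bias: num_filters
// returns: num_frames x (num_patches * num_filters)
Matrix ConvPropagate(const Matrix &patches, const Matrix &filters,
                     const std::vector<float> &bias) {
  const std::size_t num_filters = filters.size();
  assert(num_filters > 0 && bias.size() == num_filters);
  const std::size_t filter_dim = filters[0].size();
  Matrix out;
  for (const std::vector<float> &row : patches) {
    assert(row.size() % filter_dim == 0);
    const std::size_t num_patches = row.size() / filter_dim;
    std::vector<float> out_row(num_patches * num_filters);
    for (std::size_t p = 0; p < num_patches; p++) {
      for (std::size_t f = 0; f < num_filters; f++) {
        float sum = bias[f];  // bias is shared by all patches
        for (std::size_t d = 0; d < filter_dim; d++)
          sum += row[p * filter_dim + d] * filters[f][d];
        out_row[p * num_filters + f] = sum;
      }
    }
    out.push_back(out_row);
  }
  return out;
}
```

The output columns are grouped by patch, matching the ColRange(patch_number * num_filters, num_filters) blocks in the listing.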

◆ Properties()

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Implements Component.

Definition at line 585 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, kaldi::nnet3::ExpectToken(), ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_vectorization_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::input_z_dim_, UpdatableComponent::is_gradient_, KALDI_ASSERT, kaldi::ReadBasicType(), kaldi::ReadToken(), and UpdatableComponent::ReadUpdatableCommon().

Referenced by ConvolutionComponent::Properties(), LstmNonlinearityComponent::Properties(), and MaxpoolingComponent::Properties().

585  {
586  ReadUpdatableCommon(is, binary); // Read opening tag and learning rate.
587  ExpectToken(is, binary, "<InputXDim>");
588  ReadBasicType(is, binary, &input_x_dim_);
589  ExpectToken(is, binary, "<InputYDim>");
590  ReadBasicType(is, binary, &input_y_dim_);
591  ExpectToken(is, binary, "<InputZDim>");
592  ReadBasicType(is, binary, &input_z_dim_);
593  ExpectToken(is, binary, "<FiltXDim>");
594  ReadBasicType(is, binary, &filt_x_dim_);
595  ExpectToken(is, binary, "<FiltYDim>");
596  ReadBasicType(is, binary, &filt_y_dim_);
597  ExpectToken(is, binary, "<FiltXStep>");
598  ReadBasicType(is, binary, &filt_x_step_);
599  ExpectToken(is, binary, "<FiltYStep>");
600  ReadBasicType(is, binary, &filt_y_step_);
601  ExpectToken(is, binary, "<InputVectorization>");
602  int32 input_vectorization;
603  ReadBasicType(is, binary, &input_vectorization);
604  input_vectorization_ = static_cast<TensorVectorizationType>(input_vectorization);
605  ExpectToken(is, binary, "<FilterParams>");
606  filter_params_.Read(is, binary);
607  ExpectToken(is, binary, "<BiasParams>");
608  bias_params_.Read(is, binary);
609  std::string tok;
610  ReadToken(is, binary, &tok);
611  if (tok == "<IsGradient>") {
612  ReadBasicType(is, binary, &is_gradient_);
613  ExpectToken(is, binary, "</ConvolutionComponent>");
614  } else {
615  is_gradient_ = false;
616  KALDI_ASSERT(tok == "</ConvolutionComponent>");
617  }
618 }
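
The end of the listing above accepts files written with or without an <IsGradient> field. A minimal text-mode sketch of that optional-token handling follows. It assumes whitespace-separated tokens and that booleans appear as "T" or "F" in text mode; ReadTail is a hypothetical helper that throws instead of using KALDI_ASSERT:

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Reads the tail of a text-mode component, which is either
//   "<IsGradient> T|F </ConvolutionComponent>"  or
//   "</ConvolutionComponent>".
// Returns the is-gradient flag (false when the field is absent).
bool ReadTail(std::istream &is) {
  const std::string closing = "</ConvolutionComponent>";
  std::string tok;
  if (!(is >> tok)) throw std::runtime_error("unexpected end of input");
  bool is_gradient = false;
  if (tok == "<IsGradient>") {
    std::string value;
    if (!(is >> value >> tok)) throw std::runtime_error("truncated tail");
    is_gradient = (value == "T");
  }
  if (tok != closing) throw std::runtime_error("expected " + closing);
  return is_gradient;
}
```

Reading one token ahead and then branching on it is what lets the same reader handle both layouts.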

◆ Resize()

void Resize ( int32  input_dim,
int32  output_dim 
)

◆ Scale()

void Scale ( BaseFloat  scale)
virtual

The effect of this virtual function depends on the kind of component it is called on:

- for an UpdatableComponent, it scales the parameters by "scale";
- for a nonlinear component (or another component that stores stats, like BatchNormComponent), it scales the stored activation stats, not parameters.

Otherwise it will normally do nothing.

Reimplemented from Component.

Definition at line 337 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, and ConvolutionComponent::filter_params_.

Referenced by ConvolutionComponent::Properties(), and LstmNonlinearityComponent::Properties().

337  {
338  if (scale == 0.0) {
339  filter_params_.SetZero();
340  bias_params_.SetZero();
341  } else {
342  filter_params_.Scale(scale);
343  bias_params_.Scale(scale);
344  }
345 }

◆ SetParams()

void SetParams ( const VectorBase< BaseFloat > &  bias,
const MatrixBase< BaseFloat > &  filter 
)

Definition at line 669 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filter_params_, and KALDI_ASSERT.

Referenced by ConvolutionComponent::Properties().

670  {
671  bias_params_ = bias;
672  filter_params_ = filter;
673  KALDI_ASSERT(bias_params_.Dim() == filter_params_.NumRows());
674 }

◆ Type()

virtual std::string Type ( ) const
inlinevirtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 139 of file nnet-combined-component.h.

139 { return "ConvolutionComponent"; }

◆ UnVectorize()

void UnVectorize ( const VectorBase< BaseFloat > &  params)
virtual

Converts the parameters from vector form.

Reimplemented from UpdatableComponent.

Definition at line 686 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, VectorBase< Real >::Dim(), ConvolutionComponent::filter_params_, KALDI_ASSERT, ConvolutionComponent::NumParameters(), and VectorBase< Real >::Range().

Referenced by ConvolutionComponent::Properties(), and LstmNonlinearityComponent::Properties().

686  {
687  KALDI_ASSERT(params.Dim() == this->NumParameters());
688  int32 num_filter_params = filter_params_.NumCols() * filter_params_.NumRows();
689  filter_params_.CopyRowsFromVec(params.Range(0, num_filter_params));
690  bias_params_.CopyFromVec(params.Range(num_filter_params, bias_params_.Dim()));
691 }

◆ Update() [1/2]

void Update ( const std::string &  debug_info,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
const std::vector< CuSubMatrix< BaseFloat > *> &  out_deriv_batch 
)

Definition at line 511 of file nnet-combined-component.cc.

References CuMatrixBase< Real >::AddMatBlocks(), CuVectorBase< Real >::AddRowSumMat(), ConvolutionComponent::bias_params_, CuMatrixBase< Real >::ColRange(), ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::InputToInputPatches(), KALDI_ASSERT, kaldi::kNoTrans, kaldi::kSetZero, kaldi::kTrans, kaldi::kUndefined, UpdatableComponent::learning_rate_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), CuVector< Real >::Resize(), and CuMatrix< Real >::Resize().

Referenced by ConvolutionComponent::Backprop(), ConvolutionComponent::LinearParams(), and ConvolutionComponent::Properties().

514  {
515  // useful dims
516  const int32 num_x_steps = (1 + (input_x_dim_ - filt_x_dim_) / filt_x_step_),
517  num_y_steps = (1 + (input_y_dim_ - filt_y_dim_) / filt_y_step_),
518  num_filters = filter_params_.NumRows(),
519  num_frames = out_deriv.NumRows(),
520  filter_dim = filter_params_.NumCols();
521  KALDI_ASSERT(out_deriv.NumRows() == num_frames &&
522  out_deriv.NumCols() ==
523  (num_filters * num_x_steps * num_y_steps));
524 
525 
526  CuMatrix<BaseFloat> filters_grad;
527  CuVector<BaseFloat> bias_grad;
528 
529  CuMatrix<BaseFloat> input_patches(num_frames,
530  filter_dim * num_x_steps * num_y_steps,
531  kUndefined);
532  InputToInputPatches(in_value, &input_patches);
533 
534  filters_grad.Resize(num_filters, filter_dim, kSetZero); // reset
535  bias_grad.Resize(num_filters, kSetZero); // reset
536 
537  // create a single large matrix holding the smaller matrices
538  // from the vector container filters_grad_batch along the rows
539  CuMatrix<BaseFloat> filters_grad_blocks_batch(
540  num_x_steps * num_y_steps * filters_grad.NumRows(),
541  filters_grad.NumCols());
542 
543  std::vector<CuSubMatrix<BaseFloat>* > filters_grad_batch, input_patch_batch;
544 
545  for (int32 x_step = 0; x_step < num_x_steps; x_step++) {
546  for (int32 y_step = 0; y_step < num_y_steps; y_step++) {
547  int32 patch_number = x_step * num_y_steps + y_step;
548  filters_grad_batch.push_back(new CuSubMatrix<BaseFloat>(
549  filters_grad_blocks_batch.RowRange(
550  patch_number * filters_grad.NumRows(), filters_grad.NumRows())));
551 
552  input_patch_batch.push_back(new CuSubMatrix<BaseFloat>(
553  input_patches.ColRange(patch_number * filter_dim, filter_dim)));
554  }
555  }
556 
557  AddMatMatBatched<BaseFloat>(1.0, filters_grad_batch, out_deriv_batch, kTrans,
558  input_patch_batch, kNoTrans, 1.0);
559 
560  // add the row blocks together to filters_grad
561  filters_grad.AddMatBlocks(1.0, filters_grad_blocks_batch);
562 
563  // create a matrix holding the col blocks sum of out_deriv
564  CuMatrix<BaseFloat> out_deriv_col_blocks_sum(out_deriv.NumRows(),
565  num_filters);
566 
567  // add the col blocks together to out_deriv_col_blocks_sum
568  out_deriv_col_blocks_sum.AddMatBlocks(1.0, out_deriv);
569 
570  bias_grad.AddRowSumMat(1.0, out_deriv_col_blocks_sum, 1.0);
571 
572  // release memory
573  for (int32 p = 0; p < input_patch_batch.size(); p++) {
574  delete filters_grad_batch[p];
575  delete input_patch_batch[p];
576  }
577 
578  //
579  // update
580  //
581  filter_params_.AddMat(learning_rate_, filters_grad);
582  bias_params_.AddVec(learning_rate_, bias_grad);
583 }
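
The gradient accumulation in the listing above can be written out with plain loops: the filter gradient sums out_deriv^T * patch over all patches and frames, and the bias gradient sums out_deriv over the same. The sketch assumes row-major std::vector matrices; Matrix and ConvUpdate are hypothetical names, and the batched GPU calls are replaced by direct accumulation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // row-major, hypothetical

// patches:   num_frames x (num_patches * filter_dim)
// out_deriv: num_frames x (num_patches * num_filters)
// Adds learning_rate * gradient to filters and bias, in place.
void ConvUpdate(const Matrix &patches, const Matrix &out_deriv,
                float learning_rate, Matrix *filters,
                std::vector<float> *bias) {
  const std::size_t num_filters = filters->size();
  assert(num_filters > 0 && bias->size() == num_filters);
  assert(patches.size() == out_deriv.size());
  const std::size_t filter_dim = (*filters)[0].size();
  Matrix filters_grad(num_filters, std::vector<float>(filter_dim, 0.0f));
  std::vector<float> bias_grad(num_filters, 0.0f);
  for (std::size_t t = 0; t < patches.size(); t++) {
    const std::size_t num_patches = patches[t].size() / filter_dim;
    assert(out_deriv[t].size() == num_patches * num_filters);
    for (std::size_t p = 0; p < num_patches; p++) {
      for (std::size_t f = 0; f < num_filters; f++) {
        const float d_out = out_deriv[t][p * num_filters + f];
        bias_grad[f] += d_out;
        for (std::size_t d = 0; d < filter_dim; d++)
          filters_grad[f][d] += d_out * patches[t][p * filter_dim + d];
      }
    }
  }
  for (std::size_t f = 0; f < num_filters; f++) {
    (*bias)[f] += learning_rate * bias_grad[f];
    for (std::size_t d = 0; d < filter_dim; d++)
      (*filters)[f][d] += learning_rate * filters_grad[f][d];
  }
}
```

As in the listing, the gradient is added rather than subtracted, so out_deriv is taken to be the derivative of an objective that is being maximized.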

◆ Update() [2/2]

void Update ( const std::string &  debug_info,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv 
)

◆ Vectorize()

void Vectorize ( VectorBase< BaseFloat > *  params) const
virtual

Turns the parameters into vector form.

We put the vector form on the CPU, because in the kinds of situations where we do this, we'll tend to use too much memory for the GPU.

Reimplemented from UpdatableComponent.

Definition at line 680 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, VectorBase< Real >::Dim(), ConvolutionComponent::filter_params_, KALDI_ASSERT, ConvolutionComponent::NumParameters(), and VectorBase< Real >::Range().

Referenced by ConvolutionComponent::Properties(), and LstmNonlinearityComponent::Properties().

680  {
681  KALDI_ASSERT(params->Dim() == this->NumParameters());
682  int32 num_filter_params = filter_params_.NumCols() * filter_params_.NumRows();
683  params->Range(0, num_filter_params).CopyRowsFromMat(filter_params_);
684  params->Range(num_filter_params, bias_params_.Dim()).CopyFromVec(bias_params_);
685 }
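
Vectorize() and UnVectorize() share one layout: all filter rows concatenated, followed by the bias vector, for a total of (filter_dim + 1) * num_filters values as returned by NumParameters(). A minimal round-trip sketch, assuming std::vector containers; the free functions and the Matrix alias below are hypothetical stand-ins for the member functions:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // row-major, hypothetical

// Flattens the parameters: filter rows first, then the bias.
std::vector<float> Vectorize(const Matrix &filters,
                             const std::vector<float> &bias) {
  std::vector<float> params;
  for (const std::vector<float> &row : filters)
    params.insert(params.end(), row.begin(), row.end());
  params.insert(params.end(), bias.begin(), bias.end());
  return params;
}

// Inverse of Vectorize; filters and bias must already have their
// final shapes, as the members do in the listing above.
void UnVectorize(const std::vector<float> &params, Matrix *filters,
                 std::vector<float> *bias) {
  std::size_t num_filter_params = 0;
  for (const std::vector<float> &row : *filters)
    num_filter_params += row.size();
  assert(params.size() == num_filter_params + bias->size());
  std::size_t i = 0;
  for (std::vector<float> &row : *filters)
    for (float &w : row) w = params[i++];
  for (float &b : *bias) b = params[i++];
}
```

Because both directions use the same ordering, a vector produced by Vectorize can be modified and passed straight back to UnVectorize.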

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Implements Component.

Definition at line 620 of file nnet-combined-component.cc.

References ConvolutionComponent::bias_params_, ConvolutionComponent::filt_x_dim_, ConvolutionComponent::filt_x_step_, ConvolutionComponent::filt_y_dim_, ConvolutionComponent::filt_y_step_, ConvolutionComponent::filter_params_, ConvolutionComponent::input_vectorization_, ConvolutionComponent::input_x_dim_, ConvolutionComponent::input_y_dim_, ConvolutionComponent::input_z_dim_, UpdatableComponent::is_gradient_, kaldi::WriteBasicType(), kaldi::WriteToken(), and UpdatableComponent::WriteUpdatableCommon().

Referenced by ConvolutionComponent::Properties(), LstmNonlinearityComponent::Properties(), and MaxpoolingComponent::Properties().

620  {
621  WriteUpdatableCommon(os, binary); // write opening tag and learning rate.
622  WriteToken(os, binary, "<InputXDim>");
623  WriteBasicType(os, binary, input_x_dim_);
624  WriteToken(os, binary, "<InputYDim>");
625  WriteBasicType(os, binary, input_y_dim_);
626  WriteToken(os, binary, "<InputZDim>");
627  WriteBasicType(os, binary, input_z_dim_);
628  WriteToken(os, binary, "<FiltXDim>");
629  WriteBasicType(os, binary, filt_x_dim_);
630  WriteToken(os, binary, "<FiltYDim>");
631  WriteBasicType(os, binary, filt_y_dim_);
632  WriteToken(os, binary, "<FiltXStep>");
633  WriteBasicType(os, binary, filt_x_step_);
634  WriteToken(os, binary, "<FiltYStep>");
635  WriteBasicType(os, binary, filt_y_step_);
636  WriteToken(os, binary, "<InputVectorization>");
637  WriteBasicType(os, binary, static_cast<int32>(input_vectorization_));
638  WriteToken(os, binary, "<FilterParams>");
639  filter_params_.Write(os, binary);
640  WriteToken(os, binary, "<BiasParams>");
641  bias_params_.Write(os, binary);
642  WriteToken(os, binary, "<IsGradient>");
643  WriteBasicType(os, binary, is_gradient_);
644  WriteToken(os, binary, "</ConvolutionComponent>");
645 }

Member Data Documentation

◆ bias_params_

◆ filt_x_dim_

◆ filt_x_step_

◆ filt_y_dim_

◆ filt_y_step_

◆ filter_params_

◆ input_vectorization_

◆ input_x_dim_

◆ input_y_dim_

◆ input_z_dim_


The documentation for this class was generated from the following files: