Convolutional2DComponent Class Reference

Convolutional2DComponent implements convolution over two axes (frequency and temporal). More...

#include <nnet-convolutional-2d-component.h>


Public Member Functions

 Convolutional2DComponent (int32 dim_in, int32 dim_out)
 
 ~Convolutional2DComponent ()
 
Component * Copy () const
 Copy component (deep copy). More...
 
ComponentType GetType () const
 Get Type Identification of the component. More...
 
void InitData (std::istream &is)
 Initialize the content of the component by the 'line' from the prototype. More...
 
void ReadData (std::istream &is, bool binary)
 Reads the component content. More...
 
void WriteData (std::ostream &os, bool binary) const
 Writes the component content. More...
 
int32 NumParams () const
 Number of trainable parameters. More...
 
void GetGradient (VectorBase< BaseFloat > *gradient) const
 Get gradient reshaped as a vector. More...
 
void GetParams (VectorBase< BaseFloat > *params) const
 Get the trainable parameters reshaped as a vector. More...
 
void SetParams (const VectorBase< BaseFloat > &params)
 Set the trainable parameters from a vector. More...
 
std::string Info () const
 Print some additional info (after <ComponentName> and the dims). More...
 
std::string InfoGradient () const
 Print some additional info about gradient (after <...> and dims). More...
 
void PropagateFnc (const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out)
 Abstract interface for propagation/backpropagation. More...
 
void BackpropagateFnc (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrixBase< BaseFloat > *in_diff)
 Backward pass transformation (to be implemented by descending class...) More...
 
void Update (const CuMatrixBase< BaseFloat > &input, const CuMatrixBase< BaseFloat > &diff)
 Compute gradient and update parameters. More...
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (int32 input_dim, int32 output_dim)
 
virtual ~UpdatableComponent ()
 
bool IsUpdatable () const
 Check if contains trainable parameters. More...
 
virtual void SetTrainOptions (const NnetTrainOptions &opts)
 Set the training options to the component. More...
 
const NnetTrainOptions & GetTrainOptions () const
 Get the training options from the component. More...
 
virtual void SetLearnRateCoef (BaseFloat val)
 Set the learn-rate coefficient. More...
 
virtual void SetBiasLearnRateCoef (BaseFloat val)
 Set the learn-rate coefficient for bias. More...
 
- Public Member Functions inherited from Component
 Component (int32 input_dim, int32 output_dim)
 Generic interface of a component. More...
 
virtual ~Component ()
 
virtual bool IsMultistream () const
 Check if component has 'Recurrent' interface (trainable and recurrent). More...
 
int32 InputDim () const
 Get the dimension of the input. More...
 
int32 OutputDim () const
 Get the dimension of the output. More...
 
void Propagate (const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out)
 Perform forward-pass propagation 'in' -> 'out'. More...
 
void Backpropagate (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrix< BaseFloat > *in_diff)
 Perform backward-pass propagation 'out_diff' -> 'in_diff'. More...
 
void Write (std::ostream &os, bool binary) const
 Write the component to a stream. More...
 

Private Attributes

int32 fmap_x_len_
 Feature map dimensions (for the input, x_ is usually the splice and y_ is the number of fbanks). More...
 
int32 fmap_y_len_
 
int32 filt_x_len_
 2D filter dimensions, x_ temporal, y_ spectral. More...
 
int32 filt_y_len_
 
int32 filt_x_step_
 2D shifts along temporal and spectral axis. More...
 
int32 filt_y_step_
 
int32 connect_fmap_
 if connect_fmap_ = 1, then each fmap has num_filt More...
 
CuMatrix< BaseFloat > filters_
 row = vectorized rectangular filter More...
 
CuVector< BaseFloat > bias_
 bias for each filter More...
 
CuMatrix< BaseFloat > filters_grad_
 gradient of filters More...
 
CuVector< BaseFloat > bias_grad_
 gradient of biases More...
 
std::vector< CuMatrix< BaseFloat > > vectorized_feature_patches_
 Buffer of reshaped inputs: 1row = vectorized rectangular feature patch, 1col = dim over speech frames, std::vector-dim = patch-position. More...
 
std::vector< CuMatrix< BaseFloat > > feature_patch_diffs_
 Buffer for backpropagation: derivatives in the domain of 'vectorized_feature_patches_', 1row = vectorized rectangular feature patch, 1col = dim over speech frames, std::vector-dim = patch-position. More...
 
CuVector< BaseFloat > in_diff_summands_
 Auxiliary vector for compensating #summands when backpropagating. More...
 

Additional Inherited Members

- Public Types inherited from Component
enum  ComponentType {
  kUnknown = 0x0, kUpdatableComponent = 0x0100, kAffineTransform, kLinearTransform,
  kConvolutionalComponent, kConvolutional2DComponent, kLstmProjected, kBlstmProjected,
  kRecurrentComponent, kActivationFunction = 0x0200, kSoftmax, kHiddenSoftmax,
  kBlockSoftmax, kSigmoid, kTanh, kParametricRelu,
  kDropout, kLengthNormComponent, kTranform = 0x0400, kRbm,
  kSplice, kCopy, kTranspose, kBlockLinearity,
  kAddShift, kRescale, kKlHmm = 0x0800, kSentenceAveragingComponent,
  kSimpleSentenceAveragingComponent, kAveragePoolingComponent, kAveragePooling2DComponent, kMaxPoolingComponent,
  kMaxPooling2DComponent, kFramePoolingComponent, kParallelComponent, kMultiBasisComponent
}
 Component type identification mechanism. More...
 
- Static Public Member Functions inherited from Component
static const char * TypeToMarker (ComponentType t)
 Converts component type to marker. More...
 
static ComponentType MarkerToType (const std::string &s)
 Converts marker to component type (case insensitive). More...
 
static Component * Init (const std::string &conf_line)
 Initialize component from a line in config file. More...
 
static Component * Read (std::istream &is, bool binary)
 Read the component from a stream (static method). More...
 
- Static Public Attributes inherited from Component
static const struct key_value kMarkerMap []
 The table with pairs of Component types and markers (defined in nnet-component.cc). More...
 
- Protected Attributes inherited from UpdatableComponent
NnetTrainOptions opts_
 Option-class with training hyper-parameters. More...
 
BaseFloat learn_rate_coef_
 Scalar applied to learning rate for weight matrices (to be used in ::Update method). More...
 
BaseFloat bias_learn_rate_coef_
 Scalar applied to learning rate for bias (to be used in ::Update method). More...
 
- Protected Attributes inherited from Component
int32 input_dim_
 Data members. More...
 
int32 output_dim_
 Dimension of the output of the Component. More...
 

Detailed Description

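Convolutional2DComponent implements convolution over two axes (frequency and temporal).

The source comment also carries two leftover remarks that do not match this class: "(i.e. frequency axis in case we are the 1st component in NN)" and a commented-out note, "We don't do convolution along temporal axis, which simplifies the implementation (and was not helpful for Tara)". PropagateFnc() shifts the filter along both the temporal (x) and the frequency (y) axis.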

We assume the input features are spliced, i.e. each frame is in fact a set of stacked frames, where we can form patches which span several frequency bands and several time frames.

The convolution is done over the whole axis with the same filters, i.e. we don't use separate filters for different 'regions' of the frequency axis.

In order to have a fast implementation, the filters are represented in vectorized form: each rectangular filter corresponds to one row of a matrix in which all filters are stored. The features are then re-shaped into a set of matrices, one matrix per patch-position at which the filters get applied.
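
The vectorized form can be illustrated without any Kaldi types. The sketch below is only an illustration under stated assumptions: `Matrix` and `ApplyFilters` are hypothetical names, plain `std::vector` stands in for `CuMatrix`, and the loop computes for one patch-position what PropagateFnc() does with `AddVecToRows` followed by `AddMatMat` (patches times transposed filters, plus bias).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative row-major matrix; stands in for CuMatrix<BaseFloat>.
using Matrix = std::vector<std::vector<float>>;

// Applies all filters to one patch-position.
//   patches: num_frames x patch_dim (one vectorized patch per frame)
//   filters: num_filters x patch_dim (one vectorized filter per row)
//   bias:    num_filters
// Returns a num_frames x num_filters matrix.
Matrix ApplyFilters(const Matrix &patches, const Matrix &filters,
                    const std::vector<float> &bias) {
  assert(filters.size() == bias.size());
  Matrix out(patches.size(), std::vector<float>(filters.size(), 0.0f));
  for (std::size_t t = 0; t < patches.size(); ++t) {
    for (std::size_t f = 0; f < filters.size(); ++f) {
      assert(patches[t].size() == filters[f].size());
      float sum = bias[f];  // bias is added to every row (frame)
      for (std::size_t k = 0; k < filters[f].size(); ++k)
        sum += patches[t][k] * filters[f][k];  // dot product with filter row
      out[t][f] = sum;
    }
  }
  return out;
}
```

Because every filter is one row, applying all filters at one patch-position is a single matrix product, which is what makes the representation fast on a GPU.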

The type of convolution is controlled by hyperparameters:
fmap_x_len_, fmap_y_len_ ... dimensions of the feature map (or of the feature maps, if inside a convolutional layer) (e.g. (11,32) for a 32-band, 11-frame spliced spectrogram patch)
filt_x_len_, filt_y_len_ ... temporal and frequency sizes of the filters (e.g. (9,9) for a 9x9 2D filter)
filt_x_step_, filt_y_step_ ... temporal and frequency shifts of the filters (e.g. (1,1) for a 2D filter with a 1-step shift along both axes)
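
As a worked example of how these hyperparameters determine the output size, the sketch below repeats the integer arithmetic used in the member functions, `(fmap_len - filt_len) / filt_step + 1` per axis. The helper names `OutFmapLen` and `NumPatchPositions` are hypothetical and not part of Kaldi.

```cpp
#include <cassert>

// Number of filter positions along one axis (hypothetical helper).
int OutFmapLen(int fmap_len, int filt_len, int filt_step) {
  assert(filt_step > 0 && filt_len <= fmap_len);
  // The step has to be "in sync" with the feature map and filter lengths.
  assert((fmap_len - filt_len) % filt_step == 0);
  return (fmap_len - filt_len) / filt_step + 1;
}

// Total number of patch-positions of a 2D filter on a 2D feature map.
int NumPatchPositions(int fmap_x_len, int fmap_y_len,
                      int filt_x_len, int filt_y_len,
                      int filt_x_step, int filt_y_step) {
  return OutFmapLen(fmap_x_len, filt_x_len, filt_x_step) *
         OutFmapLen(fmap_y_len, filt_y_len, filt_y_step);
}
```

For the values quoted above, an (11,32) feature map with a (9,9) filter and (1,1) steps, this gives 3 positions along x and 24 along y, i.e. 72 patch-positions; the output dimension is then 72 times the number of filters.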

Because convolution uses the same weights repeatedly, the final gradient is the average of all position-specific gradients.
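
The averaging can be sketched as follows. This is a minimal illustration, assuming one gradient vector per patch-position for a single shared filter; `AverageGradients` is a hypothetical name and the code is not taken from the Update() method of this class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Averages the position-specific gradients of one shared weight vector.
std::vector<float> AverageGradients(
    const std::vector<std::vector<float>> &position_grads) {
  assert(!position_grads.empty());
  std::vector<float> avg(position_grads[0].size(), 0.0f);
  for (const std::vector<float> &g : position_grads) {
    assert(g.size() == avg.size());
    for (std::size_t k = 0; k < g.size(); ++k) avg[k] += g[k];  // accumulate
  }
  const float n = static_cast<float>(position_grads.size());
  for (float &v : avg) v /= n;  // divide by the number of positions
  return avg;
}
```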

Definition at line 72 of file nnet-convolutional-2d-component.h.

Constructor & Destructor Documentation

Convolutional2DComponent ( int32  dim_in,
int32  dim_out 
)
inline

Definition at line 74 of file nnet-convolutional-2d-component.h.

Referenced by Convolutional2DComponent::Copy().

Convolutional2DComponent(int32 dim_in, int32 dim_out) :
  UpdatableComponent(dim_in, dim_out),
  fmap_x_len_(0), fmap_y_len_(0),
  filt_x_len_(0), filt_y_len_(0),
  filt_x_step_(0), filt_y_step_(0),
  connect_fmap_(0)
{ }

~Convolutional2DComponent ( )
inline

Definition at line 82 of file nnet-convolutional-2d-component.h.

{ }

Member Function Documentation

void BackpropagateFnc ( const CuMatrixBase< BaseFloat > &  in,
const CuMatrixBase< BaseFloat > &  out,
const CuMatrixBase< BaseFloat > &  out_diff,
CuMatrixBase< BaseFloat > *  in_diff 
)
inline virtual

Backward pass transformation (to be implemented by descending class...)

Implements Component.

Definition at line 324 of file nnet-convolutional-2d-component.h.

References CuMatrixBase< Real >::AddMat(), CuMatrixBase< Real >::ColRange(), Convolutional2DComponent::connect_fmap_, CuVectorBase< Real >::Dim(), Convolutional2DComponent::feature_patch_diffs_, Convolutional2DComponent::filt_x_len_, Convolutional2DComponent::filt_x_step_, Convolutional2DComponent::filt_y_len_, Convolutional2DComponent::filt_y_step_, Convolutional2DComponent::filters_, Convolutional2DComponent::fmap_x_len_, Convolutional2DComponent::fmap_y_len_, Convolutional2DComponent::in_diff_summands_, Component::input_dim_, CuVectorBase< Real >::InvertElements(), KALDI_ASSERT, kaldi::kNoTrans, kaldi::kSetZero, CuMatrixBase< Real >::MulColsVec(), CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), Component::output_dim_, CuVectorBase< Real >::Range(), and CuVector< Real >::Resize().

{
  // useful dims
  int32 num_input_fmaps = input_dim_ / (fmap_x_len_ * fmap_y_len_);

  int32 out_fmap_x_len = (fmap_x_len_ - filt_x_len_)/filt_x_step_ + 1;
  int32 out_fmap_y_len = (fmap_y_len_ - filt_y_len_)/filt_y_step_ + 1;
  int32 out_fmap_size = out_fmap_x_len * out_fmap_y_len;
  int32 num_output_fmaps = output_dim_ / (out_fmap_x_len * out_fmap_y_len);
  // this is total num_filters,
  // so each input_fmap has num_filters/num_input_fmaps
  int32 num_filters = filters_.NumRows();
  KALDI_ASSERT(num_filters == num_output_fmaps);
  // int32 filter_size = filt_x_len_*filt_y_len_;
  int32 num_frames = in.NumRows();

  // backpropagate each patch-position through the filters
  for (int32 p = 0; p < out_fmap_size; p++) {
    feature_patch_diffs_[p].Resize(num_frames, filters_.NumCols(), kSetZero);
    CuSubMatrix<BaseFloat> out_diff_patch(out_diff.ColRange(p*num_filters, num_filters));
    feature_patch_diffs_[p].AddMatMat(1.0, out_diff_patch, kNoTrans, filters_, kNoTrans, 0.0);
  }

  // compute in_diff_summands_ once
  if (in_diff_summands_.Dim() == 0) {
    in_diff_summands_.Resize(in_diff->NumCols(), kSetZero);
    for (int32 m = 0; m < fmap_x_len_-filt_x_len_+1; m = m+filt_x_step_) {
      for (int32 n = 0; n < fmap_y_len_-filt_y_len_+1; n = n+filt_y_step_) {
        int32 st = 0;
        if (connect_fmap_ == 1) {
          st = (m * fmap_y_len_ + n) * num_input_fmaps;
        } else {
          st = m * fmap_y_len_ * num_input_fmaps + n;
        }
        for (int32 i = 0; i < filt_x_len_; i++) {
          for (int32 j = 0; j < filt_y_len_*num_input_fmaps; j++) {
            int32 c = 0;
            if (connect_fmap_ == 1) {
              c = st + i * (num_input_fmaps * fmap_y_len_) + j;
            } else {
              c = st + i * (num_input_fmaps * fmap_y_len_)
                     + (j / num_input_fmaps)
                     + (j % num_input_fmaps) * fmap_y_len_;
            }
            // add 1.0
            in_diff_summands_.Range(c, 1).Add(1.0);
          }
        }
      }
    }
    in_diff_summands_.InvertElements();
  }

  int32 out_fmap_cnt = 0;

  // sum the patch derivatives back into the columns of in_diff
  for (int32 m = 0; m < fmap_x_len_-filt_x_len_+1; m = m+filt_x_step_) {
    for (int32 n = 0; n < fmap_y_len_-filt_y_len_+1; n = n+filt_y_step_) {
      int32 st = 0;
      if (connect_fmap_ == 1) {
        st = (m * fmap_y_len_ + n) * num_input_fmaps;
      } else {
        st = m * fmap_y_len_ * num_input_fmaps + n;
      }

      for (int32 i = 0; i < filt_x_len_; i++) {
        for (int32 j = 0; j < filt_y_len_*num_input_fmaps; j++) {
          int32 c = 0;
          if (connect_fmap_ == 1) {
            c = st + i * (num_input_fmaps * fmap_y_len_) + j;
          } else {
            c = st + i * (num_input_fmaps * fmap_y_len_)
                   + (j / num_input_fmaps)
                   + (j % num_input_fmaps) * fmap_y_len_;
          }
          // from which col?
          CuMatrix<BaseFloat>& diff_mat = feature_patch_diffs_[out_fmap_cnt];
          CuSubMatrix<BaseFloat> src(diff_mat.ColRange(i*filt_y_len_*num_input_fmaps+j, 1));
          // to which col?
          CuSubMatrix<BaseFloat> tgt(in_diff->ColRange(c, 1));
          tgt.AddMat(1.0, src);
        }
      }
      out_fmap_cnt++;
    }
  }
  // compensate for summands
  in_diff->MulColsVec(in_diff_summands_);
}
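
The role of in_diff_summands_ is easier to see in isolation. The sketch below is a simplified illustration, assuming a single input feature map (so that both connect_fmap_ branches reduce to the same column index `(m+i)*fmap_y_len + (n+j)`); the function name `InverseSummandCounts` is hypothetical. It counts how many patch-positions touch each input column and inverts the counts, which is the per-column scale applied by `MulColsVec` above.

```cpp
#include <cassert>
#include <vector>

// For each input column, returns 1 / (number of patches covering it).
// Columns never covered by a patch keep the value 0 in this sketch.
std::vector<float> InverseSummandCounts(int fmap_x_len, int fmap_y_len,
                                        int filt_x_len, int filt_y_len,
                                        int filt_x_step, int filt_y_step) {
  assert(filt_x_step > 0 && filt_y_step > 0);
  std::vector<float> counts(fmap_x_len * fmap_y_len, 0.0f);
  for (int m = 0; m + filt_x_len <= fmap_x_len; m += filt_x_step) {
    for (int n = 0; n + filt_y_len <= fmap_y_len; n += filt_y_step) {
      for (int i = 0; i < filt_x_len; ++i) {
        for (int j = 0; j < filt_y_len; ++j) {
          counts[(m + i) * fmap_y_len + (n + j)] += 1.0f;  // one more summand
        }
      }
    }
  }
  for (float &c : counts) {
    if (c > 0.0f) c = 1.0f / c;  // invert, leaving uncovered columns at 0
  }
  return counts;
}
```

For a 3x3 feature map and a 2x2 filter with unit steps, the corners are covered once, the edges twice and the centre four times, so the scales are 1, 0.5 and 0.25 respectively.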
Component* Copy ( ) const
inline virtual

Copy component (deep copy).

Implements Component.

Definition at line 85 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::Convolutional2DComponent().

{ return new Convolutional2DComponent(*this); }
void GetGradient ( VectorBase< BaseFloat > *  gradient) const
inline virtual

Get gradient reshaped as a vector.

Implements UpdatableComponent.

Definition at line 225 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, VectorBase< Real >::Dim(), CuVectorBase< Real >::Dim(), Convolutional2DComponent::filters_, KALDI_ASSERT, CuMatrixBase< Real >::NumCols(), Convolutional2DComponent::NumParams(), CuMatrixBase< Real >::NumRows(), and VectorBase< Real >::Range().

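{
  KALDI_ASSERT(gradient->Dim() == NumParams());
  int32 filters_num_elem = filters_.NumRows() * filters_.NumCols();
  // NOTE: as listed, this copies the parameters (filters_, bias_), exactly
  // like GetParams(); the gradient buffers filters_grad_ and bias_grad_
  // are not read here.
  gradient->Range(0, filters_num_elem).CopyRowsFromMat(filters_);
  gradient->Range(filters_num_elem, bias_.Dim()).CopyFromVec(bias_);
}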
void GetParams ( VectorBase< BaseFloat > *  params) const
inline virtual

Get the trainable parameters reshaped as a vector.

Implements UpdatableComponent.

Definition at line 232 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, VectorBase< Real >::Dim(), CuVectorBase< Real >::Dim(), Convolutional2DComponent::filters_, KALDI_ASSERT, CuMatrixBase< Real >::NumCols(), Convolutional2DComponent::NumParams(), CuMatrixBase< Real >::NumRows(), and VectorBase< Real >::Range().

{
  KALDI_ASSERT(params->Dim() == NumParams());
  int32 filters_num_elem = filters_.NumRows() * filters_.NumCols();
  // filters first (row by row), then the bias vector
  params->Range(0, filters_num_elem).CopyRowsFromMat(filters_);
  params->Range(filters_num_elem, bias_.Dim()).CopyFromVec(bias_);
}
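
The resulting layout of the parameter vector is the filter matrix stacked row by row, followed by the bias. A minimal sketch with plain `std::vector` in place of the Kaldi types (the name `FlattenParams` is hypothetical):

```cpp
#include <cassert>
#include <vector>

// Flattens filters (num_filters x filter_dim) and bias into one vector:
// [row 0 of filters, row 1, ..., last row, bias].
std::vector<float> FlattenParams(
    const std::vector<std::vector<float>> &filters,
    const std::vector<float> &bias) {
  std::vector<float> params;
  for (const std::vector<float> &row : filters) {
    assert(row.size() == filters[0].size());  // rectangular matrix
    params.insert(params.end(), row.begin(), row.end());
  }
  params.insert(params.end(), bias.begin(), bias.end());
  return params;
}
```

The length of the result equals NumRows() * NumCols() of the filters plus the bias dimension, which is the value NumParams() returns.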
ComponentType GetType ( ) const
inline virtual

Get Type Identification of the component.

Implements Component.

Definition at line 86 of file nnet-convolutional-2d-component.h.

References Component::kConvolutional2DComponent.

std::string Info ( ) const
inline virtual

Print some additional info (after <ComponentName> and the dims).

Reimplemented from Component.

Definition at line 246 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, Convolutional2DComponent::filters_, UpdatableComponent::learn_rate_coef_, kaldi::nnet1::MomentStatistics(), and kaldi::nnet1::ToString().

{
  return std::string("\n filters") + MomentStatistics(filters_) +
         ", lr-coef " + ToString(learn_rate_coef_) +
         "\n bias" + MomentStatistics(bias_) +
         ", lr-coef " + ToString(bias_learn_rate_coef_);
}
std::string InfoGradient ( ) const
inline virtual

Print some additional info about gradient (after <...> and dims).

Reimplemented from Component.

Definition at line 252 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_grad_, UpdatableComponent::bias_learn_rate_coef_, Convolutional2DComponent::filters_grad_, UpdatableComponent::learn_rate_coef_, kaldi::nnet1::MomentStatistics(), and kaldi::nnet1::ToString().

{
  return std::string("\n filters_grad") + MomentStatistics(filters_grad_) +
         ", lr-coef " + ToString(learn_rate_coef_) +
         "\n bias_grad" + MomentStatistics(bias_grad_) +
         ", lr-coef " + ToString(bias_learn_rate_coef_);
}
void InitData ( std::istream &  is)
inline virtual

Initialize the content of the component by the 'line' from the prototype.

Implements UpdatableComponent.

Definition at line 88 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, Convolutional2DComponent::connect_fmap_, Convolutional2DComponent::filt_x_len_, Convolutional2DComponent::filt_x_step_, Convolutional2DComponent::filt_y_len_, Convolutional2DComponent::filt_y_step_, Convolutional2DComponent::filters_, Convolutional2DComponent::fmap_x_len_, Convolutional2DComponent::fmap_y_len_, Component::input_dim_, KALDI_ASSERT, KALDI_ERR, KALDI_LOG, UpdatableComponent::learn_rate_coef_, Component::output_dim_, kaldi::nnet1::RandGauss(), kaldi::nnet1::RandUniform(), kaldi::ReadBasicType(), kaldi::ReadToken(), CuVector< Real >::Resize(), and CuMatrix< Real >::Resize().

{
  // define options
  BaseFloat bias_mean = -2.0, bias_range = 2.0, param_stddev = 0.1;
  // parse config
  std::string token;
  while (is >> std::ws, !is.eof()) {
    ReadToken(is, false, &token);
    if (token == "<ParamStddev>") ReadBasicType(is, false, &param_stddev);
    else if (token == "<BiasMean>") ReadBasicType(is, false, &bias_mean);
    else if (token == "<BiasRange>") ReadBasicType(is, false, &bias_range);
    else if (token == "<FmapXLen>") ReadBasicType(is, false, &fmap_x_len_);
    else if (token == "<FmapYLen>") ReadBasicType(is, false, &fmap_y_len_);
    else if (token == "<FiltXLen>") ReadBasicType(is, false, &filt_x_len_);
    else if (token == "<FiltYLen>") ReadBasicType(is, false, &filt_y_len_);
    else if (token == "<FiltXStep>") ReadBasicType(is, false, &filt_x_step_);
    else if (token == "<FiltYStep>") ReadBasicType(is, false, &filt_y_step_);
    else if (token == "<ConnectFmap>") ReadBasicType(is, false, &connect_fmap_);
    else if (token == "<LearnRateCoef>") ReadBasicType(is, false, &learn_rate_coef_);
    else if (token == "<BiasLearnRateCoef>") ReadBasicType(is, false, &bias_learn_rate_coef_);
    else KALDI_ERR << "Unknown token " << token << ", a typo in config? "
                   << "(ParamStddev|BiasMean|BiasRange|FmapXLen|FmapYLen|"
                      "FiltXLen|FiltYLen|FiltXStep|FiltYStep|ConnectFmap|"
                      "LearnRateCoef|BiasLearnRateCoef)";
  }

  //
  // Sanity checks:
  //
  // input sanity checks
  // input_dim_ should be multiple of (fmap_x_len_ * fmap_y_len_)
  KALDI_ASSERT(input_dim_ % (fmap_x_len_ * fmap_y_len_) == 0);
  int32 num_input_fmaps = input_dim_ / (fmap_x_len_ * fmap_y_len_);
  KALDI_LOG << "num_input_fmaps " << num_input_fmaps;
  // check if step is in sync with fmap_len and filt_len
  KALDI_ASSERT((fmap_x_len_ - filt_x_len_) % (filt_x_step_) == 0);
  KALDI_ASSERT((fmap_y_len_ - filt_y_len_) % (filt_y_step_) == 0);
  int32 out_fmap_x_len = (fmap_x_len_ - filt_x_len_)/filt_x_step_ + 1;
  int32 out_fmap_y_len = (fmap_y_len_ - filt_y_len_)/filt_y_step_ + 1;
  // output sanity checks
  KALDI_ASSERT(output_dim_ % (out_fmap_x_len * out_fmap_y_len) == 0);
  int32 num_output_fmaps = output_dim_ / (out_fmap_x_len * out_fmap_y_len);
  KALDI_LOG << "num_output_fmaps " << num_output_fmaps;
  int32 num_filters = output_dim_/(out_fmap_x_len*out_fmap_y_len);
  KALDI_LOG << "num_filters " << num_filters;

  //
  // Initialize trainable parameters,
  //
  filters_.Resize(num_filters, num_input_fmaps*filt_x_len_*filt_y_len_);
  RandGauss(0.0, param_stddev, &filters_);
  //
  bias_.Resize(num_filters);
  RandUniform(bias_mean, bias_range, &bias_);
}
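
The token loop above follows a simple "<Token> value" pattern. The sketch below reproduces only that pattern with the standard library; `ParseProtoLine` is a hypothetical helper, it stores every value as a float, and unlike InitData() it accepts any token name instead of rejecting unknown ones with KALDI_ERR.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Parses a prototype-style line of "<Token> value" pairs into a map.
std::map<std::string, float> ParseProtoLine(const std::string &line) {
  std::istringstream is(line);
  std::map<std::string, float> opts;
  std::string token;
  // Same loop shape as InitData(): skip whitespace, stop at end of input.
  while (is >> std::ws, !is.eof()) {
    float value = 0.0f;
    if (!(is >> token >> value))
      throw std::runtime_error("Malformed config near token " + token);
    assert(!token.empty());
    opts[token] = value;
  }
  return opts;
}
```

A line such as `<FmapXLen> 11 <FmapYLen> 32 <FiltXLen> 9` would yield a map with three entries keyed by the token names, brackets included.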
int32 NumParams ( ) const
inline virtual

Number of trainable parameters.

Implements UpdatableComponent.

Definition at line 221 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, CuVectorBase< Real >::Dim(), Convolutional2DComponent::filters_, CuMatrixBase< Real >::NumCols(), and CuMatrixBase< Real >::NumRows().

Referenced by Convolutional2DComponent::GetGradient(), Convolutional2DComponent::GetParams(), and Convolutional2DComponent::SetParams().

{
  return filters_.NumRows()*filters_.NumCols() + bias_.Dim();
}
void PropagateFnc ( const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
)
inline virtual

Abstract interface for propagation/backpropagation.

Forward pass transformation (to be implemented by descending class...)

Implements Component.

Definition at line 259 of file nnet-convolutional-2d-component.h.

References CuMatrixBase< Real >::AddVecToRows(), Convolutional2DComponent::bias_, CuMatrixBase< Real >::ColRange(), Convolutional2DComponent::connect_fmap_, Convolutional2DComponent::feature_patch_diffs_, Convolutional2DComponent::filt_x_len_, Convolutional2DComponent::filt_x_step_, Convolutional2DComponent::filt_y_len_, Convolutional2DComponent::filt_y_step_, Convolutional2DComponent::filters_, Convolutional2DComponent::fmap_x_len_, Convolutional2DComponent::fmap_y_len_, Component::input_dim_, KALDI_ASSERT, kaldi::kNoTrans, kaldi::kTrans, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), Component::output_dim_, and Convolutional2DComponent::vectorized_feature_patches_.

{
  // useful dims
  int32 num_input_fmaps = input_dim_ / (fmap_x_len_ * fmap_y_len_);
  // int32 inp_fmap_size = fmap_x_len_ * fmap_y_len_;
  int32 out_fmap_x_len = (fmap_x_len_ - filt_x_len_)/filt_x_step_ + 1;
  int32 out_fmap_y_len = (fmap_y_len_ - filt_y_len_)/filt_y_step_ + 1;
  int32 out_fmap_size = out_fmap_x_len*out_fmap_y_len;
  int32 num_output_fmaps = output_dim_ / (out_fmap_x_len * out_fmap_y_len);
  // this is total num_filters,
  // so each input_fmap has size num_filters/num_input_fmaps
  int32 num_filters = filters_.NumRows();
  KALDI_ASSERT(num_filters == num_output_fmaps);
  // int32 filter_size = filt_x_len_*filt_y_len_;
  int32 num_frames = in.NumRows();

  // we will need the buffers
  if (vectorized_feature_patches_.size() == 0) {
    vectorized_feature_patches_.resize(out_fmap_size);
    feature_patch_diffs_.resize(out_fmap_size);
  }

  for (int32 p = 0; p < out_fmap_size; p++) {
    vectorized_feature_patches_[p].Resize(num_frames, filters_.NumCols());
  }

  // Checked for num_input_fmaps=1, check for num_inp_fmaps>1
  int32 out_fmap_cnt = 0;
  for (int32 m = 0; m < fmap_x_len_-filt_x_len_+1; m = m+filt_x_step_) {
    for (int32 n = 0; n < fmap_y_len_-filt_y_len_+1; n = n+filt_y_step_) {
      std::vector<int32> column_mask;
      int32 st = 0;
      if (connect_fmap_ == 1) {
        st = (m * fmap_y_len_ + n) * num_input_fmaps;
      } else {
        st = m * fmap_y_len_ * num_input_fmaps + n;
      }

      for (int32 i = 0; i < filt_x_len_; i++) {
        for (int32 j = 0; j < filt_y_len_*num_input_fmaps; j++) {
          int32 c = 0;
          if (connect_fmap_ == 1) {
            c = st + i * (num_input_fmaps*fmap_y_len_) + j;
          } else {
            c = st + i * (num_input_fmaps * fmap_y_len_)
                   + (j / num_input_fmaps)
                   + (j % num_input_fmaps) * fmap_y_len_;
          }
          column_mask.push_back(c);
        }
      }
      // gather the input columns of this patch-position
      CuArray<int32> cu_column_mask(column_mask);
      vectorized_feature_patches_[out_fmap_cnt].CopyCols(in, cu_column_mask);
      out_fmap_cnt++;
    }
  }

  // apply all filters at each patch-position: out = bias + patches * filters^T
  for (int32 p = 0; p < out_fmap_size; p++) {
    CuSubMatrix<BaseFloat> tgt(out->ColRange(p*num_filters, num_filters));
    tgt.AddVecToRows(1.0, bias_, 0.0);
    tgt.AddMatMat(1.0, vectorized_feature_patches_[p], kNoTrans, filters_, kTrans, 1.0);
  }
}
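
The column-index arithmetic in the loops above can be tried out on its own. The sketch below copies the two index formulas into a free function; `PatchColumns` is a hypothetical name, and `connect_fmap` is passed as a bool rather than compared with 1.

```cpp
#include <cassert>
#include <vector>

// Input columns gathered for the patch whose top-left corner is (m, n).
std::vector<int> PatchColumns(int m, int n, int fmap_y_len,
                              int filt_x_len, int filt_y_len,
                              int num_input_fmaps, bool connect_fmap) {
  assert(num_input_fmaps > 0);
  std::vector<int> cols;
  // Start column of the patch, as in the 'st' variable of the listing.
  const int st = connect_fmap ? (m * fmap_y_len + n) * num_input_fmaps
                              : m * fmap_y_len * num_input_fmaps + n;
  for (int i = 0; i < filt_x_len; ++i) {
    for (int j = 0; j < filt_y_len * num_input_fmaps; ++j) {
      int c = st + i * (num_input_fmaps * fmap_y_len);
      if (connect_fmap) {
        c += j;
      } else {
        c += (j / num_input_fmaps) + (j % num_input_fmaps) * fmap_y_len;
      }
      cols.push_back(c);
    }
  }
  return cols;
}
```

With a single input feature map both branches give the same columns; for example a 2x2 patch at (1,1) on a map with fmap_y_len = 3 gathers columns 4, 5, 7 and 8. With several input feature maps the two branches differ in how the columns of the individual maps are interleaved.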
void ReadData(std::istream &is, bool binary)
inline virtual

Reads the component content.

Reimplemented from Component.

Definition at line 143 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, Convolutional2DComponent::connect_fmap_, kaldi::ExpectToken(), Convolutional2DComponent::filt_x_len_, Convolutional2DComponent::filt_x_step_, Convolutional2DComponent::filt_y_len_, Convolutional2DComponent::filt_y_step_, Convolutional2DComponent::filters_, Convolutional2DComponent::fmap_x_len_, Convolutional2DComponent::fmap_y_len_, Component::input_dim_, KALDI_ASSERT, UpdatableComponent::learn_rate_coef_, Component::output_dim_, CuVector< Real >::Read(), CuMatrix< Real >::Read(), and kaldi::ReadBasicType().

{
  ExpectToken(is, binary, "<LearnRateCoef>");
  ReadBasicType(is, binary, &learn_rate_coef_);
  ExpectToken(is, binary, "<BiasLearnRateCoef>");
  ReadBasicType(is, binary, &bias_learn_rate_coef_);
  // convolution hyperparameters
  ExpectToken(is, binary, "<FmapXLen>");
  ReadBasicType(is, binary, &fmap_x_len_);
  ExpectToken(is, binary, "<FmapYLen>");
  ReadBasicType(is, binary, &fmap_y_len_);
  ExpectToken(is, binary, "<FiltXLen>");
  ReadBasicType(is, binary, &filt_x_len_);
  ExpectToken(is, binary, "<FiltYLen>");
  ReadBasicType(is, binary, &filt_y_len_);
  ExpectToken(is, binary, "<FiltXStep>");
  ReadBasicType(is, binary, &filt_x_step_);
  ExpectToken(is, binary, "<FiltYStep>");
  ReadBasicType(is, binary, &filt_y_step_);
  ExpectToken(is, binary, "<ConnectFmap>");
  ReadBasicType(is, binary, &connect_fmap_);

  // trainable parameters
  ExpectToken(is, binary, "<Filters>");
  filters_.Read(is, binary);
  ExpectToken(is, binary, "<Bias>");
  bias_.Read(is, binary);

  //
  // Sanity checks:
  //
  // input sanity checks
  // input_dim_ should be multiple of (fmap_x_len_ * fmap_y_len_)
  KALDI_ASSERT(input_dim_ % (fmap_x_len_ * fmap_y_len_) == 0);
  // int32 num_input_fmaps = input_dim_ / (fmap_x_len_ * fmap_y_len_);
  // KALDI_LOG << "num_input_fmaps " << num_input_fmaps;
  // check if step is in sync with fmap_len and filt_len
  KALDI_ASSERT((fmap_x_len_ - filt_x_len_) % (filt_x_step_) == 0);
  KALDI_ASSERT((fmap_y_len_ - filt_y_len_) % (filt_y_step_) == 0);
  int32 out_fmap_x_len = (fmap_x_len_ - filt_x_len_)/filt_x_step_ + 1;
  int32 out_fmap_y_len = (fmap_y_len_ - filt_y_len_)/filt_y_step_ + 1;

  // output sanity checks
  KALDI_ASSERT(output_dim_ % (out_fmap_x_len * out_fmap_y_len) == 0);
}
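The sanity checks at the end of ReadData depend on the size of the output feature map, which follows from the input map size, the filter size and the step. As a rough illustration, the arithmetic can be written as a small standalone function; `OutFmapDims` and `ComputeOutFmapDims` are hypothetical names introduced here, and the asserts mirror the checks described in the comments of the listing rather than quoting Kaldi.

```cpp
#include <cassert>

// Hypothetical result type (not part of Kaldi).
struct OutFmapDims {
  int x_len;             // output positions along the temporal axis
  int y_len;             // output positions along the spectral axis
  int num_output_fmaps;  // output_dim divided by the number of positions
};

// Computes the output feature-map size for a "valid" strided 2D convolution.
OutFmapDims ComputeOutFmapDims(int fmap_x_len, int fmap_y_len,
                               int filt_x_len, int filt_y_len,
                               int filt_x_step, int filt_y_step,
                               int output_dim) {
  assert(filt_x_step > 0 && filt_y_step > 0);
  assert(fmap_x_len >= filt_x_len && fmap_y_len >= filt_y_len);
  // The step has to be in sync with the map and filter lengths.
  assert((fmap_x_len - filt_x_len) % filt_x_step == 0);
  assert((fmap_y_len - filt_y_len) % filt_y_step == 0);
  const int x_len = (fmap_x_len - filt_x_len) / filt_x_step + 1;
  const int y_len = (fmap_y_len - filt_y_len) / filt_y_step + 1;
  // The output dimension has to be a multiple of the number of positions.
  assert(output_dim % (x_len * y_len) == 0);
  return OutFmapDims{x_len, y_len, output_dim / (x_len * y_len)};
}
```

For example, an 11x32 input map with a 3x8 filter and steps of 2 and 4 gives a 5x7 output map, so an output dimension of 140 corresponds to 4 output feature maps.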
void SetParams(const VectorBase< BaseFloat > &params)
inline virtual

Set the trainable parameters from a single vector that holds the reshaped parameters.

Implements UpdatableComponent.

Definition at line 239 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, CuVectorBase< Real >::CopyFromVec(), CuMatrixBase< Real >::CopyRowsFromVec(), VectorBase< Real >::Dim(), CuVectorBase< Real >::Dim(), Convolutional2DComponent::filters_, KALDI_ASSERT, CuMatrixBase< Real >::NumCols(), Convolutional2DComponent::NumParams(), CuMatrixBase< Real >::NumRows(), and VectorBase< Real >::Range().

{
  KALDI_ASSERT(params.Dim() == NumParams());
  int32 filters_num_elem = filters_.NumRows() * filters_.NumCols();
  // the filters come first (row by row), followed by the biases
  filters_.CopyRowsFromVec(params.Range(0, filters_num_elem));
  bias_.CopyFromVec(params.Range(filters_num_elem, bias_.Dim()));
}
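The layout of the flat parameter vector can be sketched without Kaldi types. The example below is only an illustration under stated assumptions: `ToyParams` and `SetParamsFlat` are hypothetical, the filter matrix is stored row-major in a `std::vector<float>`, and the flat vector is assumed to hold all filter rows first and the biases last, as the two `Range()` calls in the listing suggest.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the trainable parameters (not a Kaldi type).
struct ToyParams {
  int rows;                    // number of filters
  int cols;                    // elements per vectorized filter
  std::vector<float> filters;  // row-major, rows * cols elements
  std::vector<float> bias;     // one bias per filter
};

// Splits a flat vector into filters (first) and biases (last).
void SetParamsFlat(const std::vector<float>& params, ToyParams* p) {
  const std::size_t filters_num_elem = p->filters.size();
  assert(params.size() == filters_num_elem + p->bias.size());
  std::copy(params.begin(), params.begin() + filters_num_elem,
            p->filters.begin());
  std::copy(params.begin() + filters_num_elem, params.end(),
            p->bias.begin());
}
```

With 2 filters of 3 elements each, a flat vector of 8 values fills the filter matrix from its first 6 entries and the two biases from the last 2.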
void Update(const CuMatrixBase< BaseFloat > &input, const CuMatrixBase< BaseFloat > &diff)
inline virtual

Compute gradient and update parameters.

Implements UpdatableComponent.

Definition at line 415 of file nnet-convolutional-2d-component.h.

References CuMatrixBase< Real >::AddMat(), CuMatrixBase< Real >::AddMatMat(), CuVectorBase< Real >::AddRowSumMat(), CuVectorBase< Real >::AddVec(), Convolutional2DComponent::bias_, Convolutional2DComponent::bias_grad_, UpdatableComponent::bias_learn_rate_coef_, CuMatrixBase< Real >::ColRange(), Convolutional2DComponent::filt_x_len_, Convolutional2DComponent::filt_x_step_, Convolutional2DComponent::filt_y_len_, Convolutional2DComponent::filt_y_step_, Convolutional2DComponent::filters_, Convolutional2DComponent::filters_grad_, Convolutional2DComponent::fmap_x_len_, Convolutional2DComponent::fmap_y_len_, KALDI_ASSERT, kaldi::kNoTrans, kaldi::kSetZero, kaldi::kTrans, NnetTrainOptions::learn_rate, UpdatableComponent::learn_rate_coef_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), UpdatableComponent::opts_, Component::output_dim_, CuVector< Real >::Resize(), CuMatrix< Real >::Resize(), CuVectorBase< Real >::Scale(), CuMatrixBase< Real >::Scale(), and Convolutional2DComponent::vectorized_feature_patches_.

{
  // useful dims,
  int32 out_fmap_x_len = (fmap_x_len_ - filt_x_len_)/filt_x_step_ + 1;
  int32 out_fmap_y_len = (fmap_y_len_ - filt_y_len_)/filt_y_step_ + 1;
  int32 out_fmap_size = out_fmap_x_len * out_fmap_y_len;
  int32 num_output_fmaps = output_dim_ / (out_fmap_x_len * out_fmap_y_len);

  // This is total num_filters,
  // each input_fmap has num_filters / num_input_fmaps:
  int32 num_filters = filters_.NumRows();
  KALDI_ASSERT(num_filters == num_output_fmaps);

  // we use following hyperparameters from the option class,
  const BaseFloat lr = opts_.learn_rate;

  //
  // calculate the gradient
  //
  filters_grad_.Resize(filters_.NumRows(), filters_.NumCols(), kSetZero);
  bias_grad_.Resize(filters_.NumRows(), kSetZero);
  //
  for (int32 p = 0; p < out_fmap_size; p++) {
    CuSubMatrix<BaseFloat> diff_patch(diff.ColRange(p * num_filters, num_filters));
    filters_grad_.AddMatMat(1.0, diff_patch, kTrans, vectorized_feature_patches_[p], kNoTrans, 1.0);
    bias_grad_.AddRowSumMat(1.0, diff_patch, 1.0);
  }
  // scale
  filters_grad_.Scale(1.0/num_output_fmaps);
  bias_grad_.Scale(1.0/num_output_fmaps);

  //
  // update
  //
  filters_.AddMat(-lr * learn_rate_coef_, filters_grad_);
  bias_.AddVec(-lr * bias_learn_rate_coef_, bias_grad_);
}
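The gradient accumulation over patch positions can be illustrated with plain nested vectors. This is a sketch under assumptions, not the Kaldi implementation: the `Matrix` alias and the three helper functions are hypothetical, the filter gradient is assumed to be the transposed output derivative times the vectorized patches (as the `AddMatMat`, `kTrans` and `vectorized_feature_patches_` references suggest), and the update is assumed to be a plain gradient step scaled by the learning rate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // indexed as [row][col]

// filters_grad (F x D) += diff_patch^T (F x T) * patches (T x D),
// where T = frames, F = filters, D = elements per vectorized patch.
void AccumulateFilterGrad(const Matrix& diff_patch, const Matrix& patches,
                          Matrix* filters_grad) {
  assert(diff_patch.size() == patches.size());  // same number of frames
  for (std::size_t t = 0; t < diff_patch.size(); ++t)
    for (std::size_t f = 0; f < diff_patch[t].size(); ++f)
      for (std::size_t d = 0; d < patches[t].size(); ++d)
        (*filters_grad)[f][d] += diff_patch[t][f] * patches[t][d];
}

// bias_grad (F) += sum over the rows (frames) of diff_patch (T x F).
void AccumulateBiasGrad(const Matrix& diff_patch,
                        std::vector<float>* bias_grad) {
  for (const auto& row : diff_patch) {
    assert(row.size() == bias_grad->size());
    for (std::size_t f = 0; f < row.size(); ++f) (*bias_grad)[f] += row[f];
  }
}

// Plain gradient step: param -= lr * grad, element-wise.
void SgdStep(float lr, const std::vector<float>& grad,
             std::vector<float>* param) {
  assert(grad.size() == param->size());
  for (std::size_t i = 0; i < grad.size(); ++i) (*param)[i] -= lr * grad[i];
}
```

In the component itself these accumulations would run once per patch position `p`, each time on the column range of `diff` that belongs to that position, before the accumulated gradients are scaled and applied.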
void WriteData(std::ostream &os, bool binary) const
inline virtual

Writes the component content.

Reimplemented from Component.

Definition at line 188 of file nnet-convolutional-2d-component.h.

References Convolutional2DComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, Convolutional2DComponent::connect_fmap_, Convolutional2DComponent::filt_x_len_, Convolutional2DComponent::filt_x_step_, Convolutional2DComponent::filt_y_len_, Convolutional2DComponent::filt_y_step_, Convolutional2DComponent::filters_, Convolutional2DComponent::fmap_x_len_, Convolutional2DComponent::fmap_y_len_, UpdatableComponent::learn_rate_coef_, CuVector< Real >::Write(), CuMatrixBase< Real >::Write(), kaldi::WriteBasicType(), and kaldi::WriteToken().

{
  WriteToken(os, binary, "<LearnRateCoef>");
  WriteBasicType(os, binary, learn_rate_coef_);
  WriteToken(os, binary, "<BiasLearnRateCoef>");
  WriteBasicType(os, binary, bias_learn_rate_coef_);
  if (!binary) os << "\n";

  // convolution hyperparameters
  WriteToken(os, binary, "<FmapXLen>");
  WriteBasicType(os, binary, fmap_x_len_);
  WriteToken(os, binary, "<FmapYLen>");
  WriteBasicType(os, binary, fmap_y_len_);
  WriteToken(os, binary, "<FiltXLen>");
  WriteBasicType(os, binary, filt_x_len_);
  WriteToken(os, binary, "<FiltYLen>");
  WriteBasicType(os, binary, filt_y_len_);
  WriteToken(os, binary, "<FiltXStep>");
  WriteBasicType(os, binary, filt_x_step_);
  WriteToken(os, binary, "<FiltYStep>");
  WriteBasicType(os, binary, filt_y_step_);
  WriteToken(os, binary, "<ConnectFmap>");
  WriteBasicType(os, binary, connect_fmap_);
  if (!binary) os << "\n";

  // trainable parameters
  WriteToken(os, binary, "<Filters>");
  if (!binary) os << "\n";
  filters_.Write(os, binary);
  WriteToken(os, binary, "<Bias>");
  if (!binary) os << "\n";
  bias_.Write(os, binary);
}
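WriteData emits the same tokens, in the same order, that ReadData expects. The round trip can be imitated in text mode with standard streams. The helpers below are hypothetical stand-ins and not Kaldi's `WriteToken`, `WriteBasicType`, `ExpectToken` or `ReadBasicType`; the sketch assumes each token and value is followed by a single space and covers only three of the hyperparameters.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical text-mode helpers imitating the token/value pattern.
void WriteTokenText(std::ostream& os, const std::string& token) {
  os << token << " ";
}

void WriteIntText(std::ostream& os, int value) { os << value << " "; }

void ExpectTokenText(std::istream& is, const std::string& token) {
  std::string got;
  is >> got;
  if (got != token)
    throw std::runtime_error("expected " + token + ", got " + got);
}

int ReadIntText(std::istream& is) {
  int value = 0;
  is >> value;
  assert(!is.fail());
  return value;
}

// Hypothetical subset of the convolution hyperparameters.
struct ToyHyperParams {
  int fmap_x_len;
  int filt_x_len;
  int filt_x_step;
};

std::string WriteToy(const ToyHyperParams& h) {
  std::ostringstream os;
  WriteTokenText(os, "<FmapXLen>");
  WriteIntText(os, h.fmap_x_len);
  WriteTokenText(os, "<FiltXLen>");
  WriteIntText(os, h.filt_x_len);
  WriteTokenText(os, "<FiltXStep>");
  WriteIntText(os, h.filt_x_step);
  return os.str();
}

// Reads the fields back in exactly the order they were written.
ToyHyperParams ReadToy(const std::string& text) {
  std::istringstream is(text);
  ToyHyperParams h{};
  ExpectTokenText(is, "<FmapXLen>");
  h.fmap_x_len = ReadIntText(is);
  ExpectTokenText(is, "<FiltXLen>");
  h.filt_x_len = ReadIntText(is);
  ExpectTokenText(is, "<FiltXStep>");
  h.filt_x_step = ReadIntText(is);
  return h;
}
```

Because the reader checks every token before reading its value, a writer that skips or reorders a field makes the reader fail at the first mismatching token instead of silently reading a wrong value.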

Member Data Documentation

CuVector<BaseFloat> bias_grad_
private

Gradient of the biases.

std::vector<CuMatrix<BaseFloat> > feature_patch_diffs_
private

Buffer for backpropagation: derivatives in the domain of 'vectorized_feature_patches_'. One row = one vectorized rectangular feature patch, one column = one patch dimension across speech frames, std::vector index = patch position.

Definition at line 486 of file nnet-convolutional-2d-component.h.

Referenced by Convolutional2DComponent::BackpropagateFnc(), and Convolutional2DComponent::PropagateFnc().

CuMatrix<BaseFloat> filters_grad_
private

Gradient of the filters.

int32 fmap_x_len_
private

Feature map dimensions (for the input, x_ is usually the splice and y_ is the number of fbanks); shift for the 2nd dim of a patch (i.e. frame length before splicing).

Definition at line 457 of file nnet-convolutional-2d-component.h.

Referenced by Convolutional2DComponent::BackpropagateFnc(), Convolutional2DComponent::InitData(), Convolutional2DComponent::PropagateFnc(), Convolutional2DComponent::ReadData(), Convolutional2DComponent::Update(), and Convolutional2DComponent::WriteData().

CuVector<BaseFloat> in_diff_summands_
private

Auxiliary vector for compensating for the number of summands when backpropagating.

Definition at line 489 of file nnet-convolutional-2d-component.h.

Referenced by Convolutional2DComponent::BackpropagateFnc().
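Overlapping patches mean that one input element can receive derivative contributions from several patch positions, which is what a per-element summand count would compensate for. The sketch below only counts that coverage for a single input feature map; `CountPatchCoverage` is a hypothetical helper, and how BackpropagateFnc actually applies the counts is an assumption not shown on this page.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Kaldi): for a single input feature map of
// size fmap_x_len x fmap_y_len, counts how many patches contain each element.
std::vector<int> CountPatchCoverage(int fmap_x_len, int fmap_y_len,
                                    int filt_x_len, int filt_y_len,
                                    int filt_x_step, int filt_y_step) {
  assert(filt_x_step > 0 && filt_y_step > 0);
  std::vector<int> count(fmap_x_len * fmap_y_len, 0);
  for (int m = 0; m < fmap_x_len - filt_x_len + 1; m += filt_x_step) {
    for (int n = 0; n < fmap_y_len - filt_y_len + 1; n += filt_y_step) {
      // Every element inside the patch at (m, n) gets one more summand.
      for (int i = 0; i < filt_x_len; ++i)
        for (int j = 0; j < filt_y_len; ++j)
          ++count[(m + i) * fmap_y_len + (n + j)];
    }
  }
  return count;
}
```

For a 3x3 map with a 2x2 filter and unit steps, the corners are covered once, the edges twice and the centre four times, so a derivative summed over patches would be weighted unevenly unless it is normalised by these counts.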

std::vector<CuMatrix<BaseFloat> > vectorized_feature_patches_
private

Buffer of reshaped inputs: one row = one vectorized rectangular feature patch, one column = one patch dimension across speech frames, std::vector index = patch position.

Definition at line 478 of file nnet-convolutional-2d-component.h.

Referenced by Convolutional2DComponent::PropagateFnc(), and Convolutional2DComponent::Update().


The documentation for this class was generated from the following file: