ConvolutionalComponent implements convolution over a single axis (i.e. the frequency axis, if it is the 1st component in the NN). More...
#include <nnet-convolutional-component.h>
Public Member Functions | |
ConvolutionalComponent (int32 dim_in, int32 dim_out) | |
~ConvolutionalComponent () | |
Component * | Copy () const |
Copy component (deep copy). More... | |
ComponentType | GetType () const |
Get the type identification of the component. More... | |
void | InitData (std::istream &is) |
Initialize the content of the component from the 'line' in the prototype. More... | |
void | ReadData (std::istream &is, bool binary) |
Reads the component content. More... | |
void | WriteData (std::ostream &os, bool binary) const |
Writes the component content. More... | |
int32 | NumParams () const |
Number of trainable parameters. More... | |
void | GetGradient (VectorBase< BaseFloat > *gradient) const |
Get the gradient reshaped as a vector. More... | |
void | GetParams (VectorBase< BaseFloat > *params) const |
Get the trainable parameters reshaped as a vector. More... | |
void | SetParams (const VectorBase< BaseFloat > &params) |
Set the trainable parameters from a vector. More... | |
std::string | Info () const |
Print some additional info (after <ComponentName> and the dims). More... | |
std::string | InfoGradient () const |
Print some additional info about the gradient (after <...> and the dims). More... | |
void | PropagateFnc (const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) |
Abstract interface for propagation/backpropagation. More... | |
void | ReverseIndexes (const std::vector< int32 > &forward_indexes, std::vector< std::vector< int32 > > *backward_indexes) |
void | RearrangeIndexes (const std::vector< std::vector< int32 > > &in, std::vector< std::vector< int32 > > *out) |
void | BackpropagateFnc (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrixBase< BaseFloat > *in_diff) |
Backward pass transformation (to be implemented by descending class...) More... | |
void | Update (const CuMatrixBase< BaseFloat > &input, const CuMatrixBase< BaseFloat > &diff) |
Compute the gradient and update the parameters. More... | |
Public Member Functions inherited from UpdatableComponent | |
UpdatableComponent (int32 input_dim, int32 output_dim) | |
virtual | ~UpdatableComponent () |
bool | IsUpdatable () const |
Check if contains trainable parameters,. More... | |
virtual void | SetTrainOptions (const NnetTrainOptions &opts) |
Set the training options to the component,. More... | |
const NnetTrainOptions & | GetTrainOptions () const |
Get the training options from the component,. More... | |
virtual void | SetLearnRateCoef (BaseFloat val) |
Set the learn-rate coefficient,. More... | |
virtual void | SetBiasLearnRateCoef (BaseFloat val) |
Set the learn-rate coefficient for bias,. More... | |
Public Member Functions inherited from Component | |
Component (int32 input_dim, int32 output_dim) | |
Generic interface of a component,. More... | |
virtual | ~Component () |
virtual bool | IsMultistream () const |
Check if component has 'Recurrent' interface (trainable and recurrent),. More... | |
int32 | InputDim () const |
Get the dimension of the input,. More... | |
int32 | OutputDim () const |
Get the dimension of the output,. More... | |
void | Propagate (const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out) |
Perform forward-pass propagation 'in' -> 'out',. More... | |
void | Backpropagate (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrix< BaseFloat > *in_diff) |
Perform backward-pass propagation 'out_diff' -> 'in_diff'. More... | |
void | Write (std::ostream &os, bool binary) const |
Write the component to a stream,. More... | |
Private Attributes | |
int32 | patch_dim_ |
number of consecutive inputs, 1st dim of patch More... | |
int32 | patch_step_ |
step of the convolution (i.e. shift between 2 patches) More... | |
int32 | patch_stride_ |
shift for 2nd dim of a patch (i.e. frame length before splicing) More... | |
CuMatrix< BaseFloat > | filters_ |
row = vectorized rectangular filter More... | |
CuVector< BaseFloat > | bias_ |
bias for each filter More... | |
CuMatrix< BaseFloat > | filters_grad_ |
gradient of filters More... | |
CuVector< BaseFloat > | bias_grad_ |
gradient of biases More... | |
BaseFloat | max_norm_ |
limit the L2 norm of a neuron's weights to a positive value More... | |
CuMatrix< BaseFloat > | vectorized_feature_patches_ |
Buffer of reshaped inputs: 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames. Map of input features: std::vector-dim = patch-position. More... | |
std::vector< int32 > | column_map_ |
CuMatrix< BaseFloat > | feature_patch_diffs_ |
Buffer for backpropagation: derivatives in the domain of 'vectorized_feature_patches_', 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames. More... | |
Additional Inherited Members | |
Public Types inherited from Component | |
enum | ComponentType { kUnknown = 0x0, kUpdatableComponent = 0x0100, kAffineTransform, kLinearTransform, kConvolutionalComponent, kLstmProjected, kBlstmProjected, kRecurrentComponent, kActivationFunction = 0x0200, kSoftmax, kHiddenSoftmax, kBlockSoftmax, kSigmoid, kTanh, kParametricRelu, kDropout, kLengthNormComponent, kTranform = 0x0400, kRbm, kSplice, kCopy, kTranspose, kBlockLinearity, kAddShift, kRescale, kKlHmm = 0x0800, kSentenceAveragingComponent, kSimpleSentenceAveragingComponent, kAveragePoolingComponent, kMaxPoolingComponent, kFramePoolingComponent, kParallelComponent, kMultiBasisComponent } |
Component type identification mechanism,. More... | |
Static Public Member Functions inherited from Component | |
static const char * | TypeToMarker (ComponentType t) |
Converts component type to marker,. More... | |
static ComponentType | MarkerToType (const std::string &s) |
Converts marker to component type (case insensitive),. More... | |
static Component * | Init (const std::string &conf_line) |
Initialize component from a line in config file,. More... | |
static Component * | Read (std::istream &is, bool binary) |
Read the component from a stream (static method),. More... | |
Static Public Attributes inherited from Component | |
static const struct key_value | kMarkerMap [] |
The table with pairs of Component types and markers (defined in nnet-component.cc),. More... | |
Protected Attributes inherited from UpdatableComponent | |
NnetTrainOptions | opts_ |
Option-class with training hyper-parameters,. More... | |
BaseFloat | learn_rate_coef_ |
Scalar applied to learning rate for weight matrices (to be used in ::Update method),. More... | |
BaseFloat | bias_learn_rate_coef_ |
Scalar applied to learning rate for bias (to be used in ::Update method),. More... | |
Protected Attributes inherited from Component | |
int32 | input_dim_ |
Data members,. More... | |
int32 | output_dim_ |
Dimension of the output of the Component,. More... | |
ConvolutionalComponent implements convolution over a single axis (i.e. the frequency axis, if it is the 1st component in the NN). We don't do convolution along the temporal axis, which simplifies the implementation (and was not helpful for Tara).
We assume the input features are spliced, i.e. each frame is in fact a set of stacked frames, from which we can form patches that span several frequency bands and the whole time axis.
The convolution is done over the whole axis with the same filters, i.e. we don't use separate filters for different 'regions' of the frequency axis.
For a fast implementation, the filters are represented in vectorized form: each rectangular filter corresponds to one row of a matrix in which all the filters are stored. The features are then reshaped into a set of matrices, one per patch position, and all the filters are applied to each of them.
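The vectorized layout described above can be sketched with plain CPU loops. This is only an illustration under stated assumptions, not the Kaldi implementation: the spliced frame is assumed to hold consecutive blocks of `patch_stride` values (one block per stacked frame), and the helper names `ExtractPatch` and `ApplyFilters` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Kaldi: copies one patch out of a spliced
// input frame. Assumed layout: the frame holds `num_splice` consecutive
// blocks of `patch_stride` values (one block per stacked frame).
std::vector<float> ExtractPatch(const std::vector<float> &frame,
                                int patch_stride, int patch_dim,
                                int patch_step, int patch_index) {
  const int frame_dim = static_cast<int>(frame.size());
  assert(patch_stride > 0 && frame_dim % patch_stride == 0);
  const int num_splice = frame_dim / patch_stride;
  const int offset = patch_index * patch_step;  // position on the frequency axis
  assert(offset >= 0 && offset + patch_dim <= patch_stride);
  std::vector<float> patch;
  patch.reserve(static_cast<std::size_t>(num_splice) * patch_dim);
  for (int s = 0; s < num_splice; ++s)    // over stacked frames (time)
    for (int d = 0; d < patch_dim; ++d)   // over frequency bands in the patch
      patch.push_back(frame[s * patch_stride + offset + d]);
  return patch;
}

// Applies every filter (one row per vectorized filter) to one vectorized
// patch and adds the per-filter bias.
std::vector<float> ApplyFilters(const std::vector<std::vector<float>> &filters,
                                const std::vector<float> &bias,
                                const std::vector<float> &patch) {
  assert(filters.size() == bias.size());
  std::vector<float> out(filters.size());
  for (std::size_t f = 0; f < filters.size(); ++f) {
    assert(filters[f].size() == patch.size());
    float sum = bias[f];
    for (std::size_t k = 0; k < patch.size(); ++k)
      sum += filters[f][k] * patch[k];
    out[f] = sum;
  }
  return out;
}
```

Calling `ExtractPatch` once per patch position and `ApplyFilters` on each result reuses the same filter rows at every position, which is the weight sharing the description refers to.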
The type of convolution is controlled by hyperparameters:
patch_dim_ ... frequency axis size of the patch
patch_step_ ... size of shift in the convolution
patch_stride_ ... shift for 2nd dim of a patch (i.e. frame length before splicing)
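To make the roles of these hyperparameters concrete, the sketch below derives the sizes they imply. The formulas are an assumption consistent with the description above (a spliced input of `input_dim` values, every filter applied at each patch position, per-position outputs concatenated); this page does not confirm them, and `ConvDims` and `ComputeConvDims` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical struct (not Kaldi API) collecting the derived sizes.
struct ConvDims {
  int num_splice;   // number of stacked frames in the spliced input
  int num_patches;  // number of patch positions along the frequency axis
  int filter_dim;   // length of one vectorized filter
  int num_filters;  // number of filters shared by all patch positions
};

// Derives the sizes under the assumptions stated in the lead-in.
ConvDims ComputeConvDims(int input_dim, int output_dim, int patch_dim,
                         int patch_step, int patch_stride) {
  assert(patch_dim > 0 && patch_step > 0 && patch_stride >= patch_dim);
  assert(input_dim % patch_stride == 0);                 // whole frames only
  assert((patch_stride - patch_dim) % patch_step == 0);  // patches tile the axis
  ConvDims d;
  d.num_splice = input_dim / patch_stride;
  d.num_patches = 1 + (patch_stride - patch_dim) / patch_step;
  d.filter_dim = d.num_splice * patch_dim;
  assert(output_dim % d.num_patches == 0);
  d.num_filters = output_dim / d.num_patches;
  return d;
}
```

For example, 11 spliced frames of 40 bands each (`input_dim` 440, `patch_stride` 40) with `patch_dim` 8 and `patch_step` 1 give 33 patch positions and filters of length 88.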
Because the convolution applies the same weights repeatedly, the final gradient is the sum of all position-specific gradients (summing was found to work better than averaging).
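The summation over patch positions can be sketched as follows for a single frame. This is a simplified illustration rather than the code in Update(): `SumFilterGradients` is a hypothetical helper, `patches[p]` stands for the vectorized patch at position p, and `diffs[p][f]` for the derivative with respect to the output of filter f at position p.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Hypothetical helper (not Kaldi API): accumulates the gradient of the shared
// filters for one frame as the sum of the position-specific gradients.
Matrix SumFilterGradients(const Matrix &patches, const Matrix &diffs) {
  assert(!patches.empty() && patches.size() == diffs.size());
  const std::size_t num_filters = diffs[0].size();
  const std::size_t filter_dim = patches[0].size();
  Matrix grad(num_filters, std::vector<float>(filter_dim, 0.0f));
  for (std::size_t p = 0; p < patches.size(); ++p) {
    assert(patches[p].size() == filter_dim);
    assert(diffs[p].size() == num_filters);
    for (std::size_t f = 0; f < num_filters; ++f)
      for (std::size_t k = 0; k < filter_dim; ++k)
        grad[f][k] += diffs[p][f] * patches[p][k];  // summed, not averaged
  }
  return grad;
}
```

Dividing the result by the number of patch positions would give the averaged variant that the description reports as working less well.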
Definition at line 66 of file nnet-convolutional-component.h.
|
inline |
Definition at line 68 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::Copy().
|
inline |
Definition at line 76 of file nnet-convolutional-component.h.
|
inlinevirtual |
Backward pass transformation (to be implemented by descending class...)
Implements Component.
Definition at line 368 of file nnet-convolutional-component.h.
References CuMatrixBase< Real >::AddCols(), CuMatrixBase< Real >::AddMatMat(), CuMatrixBase< Real >::ColRange(), ConvolutionalComponent::column_map_, ConvolutionalComponent::feature_patch_diffs_, ConvolutionalComponent::filters_, kaldi::kNoTrans, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, ConvolutionalComponent::RearrangeIndexes(), and ConvolutionalComponent::ReverseIndexes().
|
inlinevirtual |
Copy component (deep copy).
Implements Component.
Definition at line 79 of file nnet-convolutional-component.h.
References ConvolutionalComponent::ConvolutionalComponent().
|
inlinevirtual |
Get the gradient reshaped as a vector.
Implements UpdatableComponent.
Definition at line 221 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, VectorBase< Real >::Dim(), ConvolutionalComponent::filters_, KALDI_ASSERT, ConvolutionalComponent::NumParams(), and VectorBase< Real >::Range().
|
inlinevirtual |
Get the trainable parameters reshaped as a vector.
Implements UpdatableComponent.
Definition at line 228 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, VectorBase< Real >::Dim(), ConvolutionalComponent::filters_, KALDI_ASSERT, ConvolutionalComponent::NumParams(), and VectorBase< Real >::Range().
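One way to picture "reshaped as a vector" is the sketch below. The ordering (filter rows first, then the bias) is an assumption suggested by the references to filters_, bias_ and Range() above, not something this page states; `NumParams` and `FlattenParams` here are hypothetical free functions operating on plain std::vector stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for NumParams(): all filter weights plus all biases.
std::size_t NumParams(const std::vector<std::vector<float>> &filters,
                      const std::vector<float> &bias) {
  std::size_t n = bias.size();
  for (const auto &row : filters) n += row.size();
  return n;
}

// Flattens the parameters into one vector. Assumed order: filter rows
// (row-major) followed by the bias vector.
std::vector<float> FlattenParams(const std::vector<std::vector<float>> &filters,
                                 const std::vector<float> &bias) {
  std::vector<float> params;
  params.reserve(NumParams(filters, bias));
  for (const auto &row : filters)
    params.insert(params.end(), row.begin(), row.end());
  params.insert(params.end(), bias.begin(), bias.end());
  assert(params.size() == NumParams(filters, bias));
  return params;
}
```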
|
inlinevirtual |
Get the type identification of the component.
Implements Component.
Definition at line 80 of file nnet-convolutional-component.h.
References Component::kConvolutionalComponent.
|
inlinevirtual |
Print some additional info (after <ComponentName> and the dims).
Reimplemented from Component.
Definition at line 242 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, kaldi::nnet1::MomentStatistics(), and kaldi::nnet1::ToString().
|
inlinevirtual |
Print some additional info about the gradient (after <...> and the dims).
Reimplemented from Component.
Definition at line 250 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_grad_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_grad_, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, kaldi::nnet1::MomentStatistics(), and kaldi::nnet1::ToString().
|
inlinevirtual |
Initialize the content of the component from the 'line' in the prototype.
Implements UpdatableComponent.
Definition at line 82 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_, Component::input_dim_, KALDI_ASSERT, KALDI_ERR, KALDI_LOG, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, Component::output_dim_, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, kaldi::nnet1::RandGauss(), kaldi::nnet1::RandUniform(), kaldi::ReadBasicType(), and kaldi::ReadToken().
|
inlinevirtual |
Number of trainable parameters.
Implements UpdatableComponent.
Definition at line 217 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, and ConvolutionalComponent::filters_.
Referenced by ConvolutionalComponent::GetGradient(), ConvolutionalComponent::GetParams(), and ConvolutionalComponent::SetParams().
|
inlinevirtual |
Abstract interface for propagation/backpropagation.
Forward pass transformation (to be implemented by descending class...)
Implements Component.
Definition at line 258 of file nnet-convolutional-component.h.
References CuMatrixBase< Real >::AddVecToRows(), ConvolutionalComponent::bias_, CuMatrixBase< Real >::ColRange(), ConvolutionalComponent::column_map_, rnnlm::d, ConvolutionalComponent::feature_patch_diffs_, ConvolutionalComponent::filters_, Component::input_dim_, kaldi::kNoTrans, kaldi::kSetZero, kaldi::kTrans, kaldi::kUndefined, CuMatrixBase< Real >::NumRows(), ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, and ConvolutionalComponent::vectorized_feature_patches_.
|
inlinevirtual |
Reads the component content.
Reimplemented from Component.
Definition at line 133 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, kaldi::ExpectToken(), ConvolutionalComponent::filters_, Component::input_dim_, KALDI_ASSERT, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, Component::output_dim_, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, kaldi::PeekToken(), and kaldi::ReadBasicType().
|
inline |
Definition at line 351 of file nnet-convolutional-component.h.
References rnnlm::i, and rnnlm::j.
Referenced by ConvolutionalComponent::BackpropagateFnc().
|
inline |
Definition at line 323 of file nnet-convolutional-component.h.
References rnnlm::i, Component::input_dim_, rnnlm::j, and KALDI_ASSERT.
Referenced by ConvolutionalComponent::BackpropagateFnc().
|
inlinevirtual |
Set the trainable parameters from a vector.
Implements UpdatableComponent.
Definition at line 235 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, VectorBase< Real >::Dim(), ConvolutionalComponent::filters_, KALDI_ASSERT, ConvolutionalComponent::NumParams(), and VectorBase< Real >::Range().
|
inlinevirtual |
Compute the gradient and update the parameters.
Implements UpdatableComponent.
Definition at line 400 of file nnet-convolutional-component.h.
References CuVectorBase< Real >::AddColSumMat(), CuVectorBase< Real >::ApplyFloor(), ConvolutionalComponent::bias_, ConvolutionalComponent::bias_grad_, UpdatableComponent::bias_learn_rate_coef_, CuMatrixBase< Real >::ColRange(), ConvolutionalComponent::filters_, ConvolutionalComponent::filters_grad_, CuVectorBase< Real >::InvertElements(), kaldi::kNoTrans, kaldi::kSetZero, kaldi::kTrans, NnetTrainOptions::learn_rate, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, CuMatrixBase< Real >::MulElements(), UpdatableComponent::opts_, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, CuVectorBase< Real >::Scale(), and ConvolutionalComponent::vectorized_feature_patches_.
|
inlinevirtual |
Writes the component content.
Reimplemented from Component.
Definition at line 188 of file nnet-convolutional-component.h.
References ConvolutionalComponent::bias_, UpdatableComponent::bias_learn_rate_coef_, ConvolutionalComponent::filters_, UpdatableComponent::learn_rate_coef_, ConvolutionalComponent::max_norm_, ConvolutionalComponent::patch_dim_, ConvolutionalComponent::patch_step_, ConvolutionalComponent::patch_stride_, kaldi::WriteBasicType(), and kaldi::WriteToken().
bias for each filter
Definition at line 455 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::GetGradient(), ConvolutionalComponent::GetParams(), ConvolutionalComponent::Info(), ConvolutionalComponent::InitData(), ConvolutionalComponent::NumParams(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::SetParams(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().
gradient of biases
Definition at line 458 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::InfoGradient(), and ConvolutionalComponent::Update().
|
private |
Definition at line 469 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::BackpropagateFnc(), and ConvolutionalComponent::PropagateFnc().
Buffer for backpropagation: derivatives in the domain of 'vectorized_feature_patches_', 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames.
Definition at line 476 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::BackpropagateFnc(), and ConvolutionalComponent::PropagateFnc().
row = vectorized rectangular filter
Definition at line 454 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::GetGradient(), ConvolutionalComponent::GetParams(), ConvolutionalComponent::Info(), ConvolutionalComponent::InitData(), ConvolutionalComponent::NumParams(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::SetParams(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().
gradient of filters
Definition at line 457 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::InfoGradient(), and ConvolutionalComponent::Update().
|
private |
limit the L2 norm of a neuron's weights to a positive value
Definition at line 460 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::Info(), ConvolutionalComponent::InfoGradient(), ConvolutionalComponent::InitData(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().
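A max-norm limit of this kind is commonly applied by rescaling any weight row whose L2 norm exceeds the limit. The sketch below shows that general technique on plain std::vector data; it is an assumption about the intent of max_norm_, not a transcription of Update(), and `ApplyMaxNorm` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not Kaldi API): rescales every row (one neuron's
// weights) whose L2 norm exceeds max_norm so that its norm equals max_norm.
// Rows that are already within the limit are left untouched.
void ApplyMaxNorm(std::vector<std::vector<float>> *filters, float max_norm) {
  assert(filters != nullptr && max_norm > 0.0f);
  for (auto &row : *filters) {
    float sum_sq = 0.0f;
    for (float w : row) sum_sq += w * w;
    const float norm = std::sqrt(sum_sq);
    if (norm > max_norm) {
      const float scale = max_norm / norm;
      for (float &w : row) w *= scale;
    }
  }
}
```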
|
private |
number of consecutive inputs, 1st dim of patch
Definition at line 448 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::InitData(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().
|
private |
step of the convolution (i.e. shift between 2 patches)
Definition at line 448 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::InitData(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().
|
private |
shift for 2nd dim of a patch (i.e. frame length before splicing)
Definition at line 448 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::BackpropagateFnc(), ConvolutionalComponent::InitData(), ConvolutionalComponent::PropagateFnc(), ConvolutionalComponent::ReadData(), ConvolutionalComponent::Update(), and ConvolutionalComponent::WriteData().
Buffer of reshaped inputs: 1 row = vectorized rectangular feature patches, 1 col = dim over speech frames. Map of input features: std::vector-dim = patch-position.
Definition at line 468 of file nnet-convolutional-component.h.
Referenced by ConvolutionalComponent::PropagateFnc(), and ConvolutionalComponent::Update().