TimeHeightConvolutionComponent Class Reference

TimeHeightConvolutionComponent implements 2-dimensional convolution where one of the dimensions of convolution (which traditionally would be called the width axis) is identified with time (i.e. the 't' component of Indexes). More...

#include <nnet-convolutional-component.h>


Classes

class  PrecomputedIndexes
 

Public Member Functions

 TimeHeightConvolutionComponent ()
 
 TimeHeightConvolutionComponent (const TimeHeightConvolutionComponent &other)
 
virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void Scale (BaseFloat scale)
 This virtual function, when called by an UpdatableComponent, scales the parameters by 'scale'. More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 This virtual function, when called by an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void PerturbParams (BaseFloat stddev)
 This function is to be used in testing. More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Computes dot-product between parameters of two instances of a Component. More...
 
virtual int32 NumParameters () const
 The following new virtual function returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
virtual void FreezeNaturalGradient (bool freeze)
 Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More...
 
void ScaleLinearParams (BaseFloat alpha)
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
virtual void SetUnderlyingLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent- gets multiplied by learning_rate_factor_. More...
 
virtual void SetActualLearningRate (BaseFloat lrate)
 Sets the learning rate directly, bypassing learning_rate_factor_. More...
 
virtual void SetAsGradient ()
 Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...
 
BaseFloat LearningRate () const
 Gets the learning rate of gradient descent. More...
 
BaseFloat MaxChange () const
 Gets per-component max-change value. More...
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...
 
 Component ()
 
virtual ~Component ()
 

Private Member Functions

void Check ()
 
void ComputeDerived ()
 
void UpdateNaturalGradient (const PrecomputedIndexes &indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
void UpdateSimple (const PrecomputedIndexes &indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
void InitUnit ()
 

Private Attributes

time_height_convolution::ConvolutionModel model_
 
std::vector< int32 > all_time_offsets_
 
std::vector< bool > time_offset_required_
 
CuMatrix< BaseFloat > linear_params_
 
CuVector< BaseFloat > bias_params_
 
BaseFloat max_memory_mb_
 
bool use_natural_gradient_
 
BaseFloat num_minibatches_history_
 
OnlineNaturalGradient preconditioner_in_
 
OnlineNaturalGradient preconditioner_out_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from UpdatableComponent
void InitLearningRatesFromConfig (ConfigLine *cfl)
 
std::string ReadUpdatableCommon (std::istream &is, bool binary)
 
void WriteUpdatableCommon (std::ostream &is, bool binary) const
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (typically 0.0..0.01) More...
 
BaseFloat learning_rate_factor_
 learning rate factor (normally 1.0, but can be set to another value so that when you call SetLearningRate(), that value will be scaled by this factor). More...
 
bool is_gradient_
 True if this component is to be treated as a gradient rather than as parameters. More...
 
BaseFloat max_change_
 configuration value for imposing max-change More...
 

Detailed Description

TimeHeightConvolutionComponent implements 2-dimensional convolution where one of the dimensions of convolution (which traditionally would be called the width axis) is identified with time (i.e. the 't' component of Indexes). For a deeper understanding of how this works, please see convolution.h.

The following are the parameters accepted on the config line, with examples of their values.

Parameters inherited from UpdatableComponent (see the comment above the declaration of UpdatableComponent in nnet-component-itf.h for details): learning-rate, learning-rate-factor, max-change

Convolution-related parameters:

num-filters-in
    E.g. num-filters-in=32. Number of input filters (the number of separate versions of the input image). The filter-dim has stride 1 in the input and output vectors, i.e. we order the input as (all-filters-for-height=0, all-filters-for-height=1, etc.).
num-filters-out
    E.g. num-filters-out=64. The number of output filters (the number of separate versions of the output image). As with the input, the filter-dim has stride 1.
height-in
    E.g. height-in=40. The height of the input image. The width is not specified at the model level, as it's identified with "t" and is called the time axis; the width is determined by how many "t" values were available at the input of the network, and how many were requested at the output.
height-out
    E.g. height-out=40. The height of the output image. Will normally be <= (the input height divided by height-subsample-out).
height-subsample-out
    E.g. height-subsample-out=2 (defaults to 1). Subsampling factor on the height axis, e.g. you might set this to 2 if you are doing subsampling on this layer, which would involve discarding every other height increment at the output. There is no corresponding config for the time dimension, as time subsampling is determined by which 't' values you request at the output, together with the values of 'time-offsets' at different layers of the network.
height-offsets
    E.g. height-offsets=-1,0,1. The set of height offsets that contribute to each output pixel: with the values -1,0,1, height 10 at the output would see data from heights 9,10,11 at the input. These values will normally be consecutive. Negative values imply zero-padding on the bottom of the image, since output-height 0 is always defined. Zero-padding at the top of the image is determined in a similar way (e.g. if height-in==height-out and height-offsets=-1,0,1, then there is 1 pixel of padding at the top and bottom of the image).
time-offsets
    E.g. time-offsets=-1,0,1. The time offsets that we require at the input to produce a given output; these are comparable to the offsets used in TDNNs. Note that the time axis is always numbered using an absolute scheme, so that if there is subsampling on the time axis, then later in the network you'll see time-offsets like "-2,0,2" or "-4,0,4". Subsampling on the time axis is not explicitly specified but is implicit based on tracking dependencies.
required-time-offsets
    E.g. required-time-offsets=0 (defaults to the same value as time-offsets). This is a set of time offsets, which if specified must be a nonempty subset of time-offsets; it determines whether zero-padding on the time axis is allowed in cases where there is insufficient input. If not specified it defaults to the same as 'time-offsets', meaning there is no zero-padding on the time axis. Note: for speech tasks we tend to pad on the time axis with repeats of the first or last frame, rather than zero; and this is handled while preparing the data and not by the core components of the nnet3 framework. So for speech tasks we wouldn't normally set this value.
max-memory-mb
    Maximum amount of temporary memory, in megabytes, that may be used for temporary matrices in the convolution computation. default=200.0.
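
The layout described for num-filters-in and height-in can be sketched as an index calculation. This is only an illustration under the stated ordering (filter index has stride 1, heights follow one another); the helper names InputColumn and InputDimFor are hypothetical and are not part of Kaldi.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Kaldi function): column, within one row of the
// input matrix, of the element for (height, filter), given that the filter
// index has stride 1 and the heights are laid out one after another.
inline int32_t InputColumn(int32_t height, int32_t filter,
                           int32_t num_filters_in) {
  assert(height >= 0 && filter >= 0 && filter < num_filters_in);
  return height * num_filters_in + filter;
}

// Per-frame input dimension implied by that layout.
inline int32_t InputDimFor(int32_t num_filters_in, int32_t height_in) {
  assert(num_filters_in > 0 && height_in > 0);
  return num_filters_in * height_in;
}
```

With num-filters-in=32 and height-in=40, each input row would then hold 1280 values, with the 32 filters for height 0 first, then the 32 filters for height 1, and so on.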

Initialization parameters:

param-stddev
    Standard deviation of the linear parameters of the convolution. Defaults to sqrt(1.0 / (num-filters-in * num-height-offsets * num-time-offsets)), e.g. sqrt(1.0/(64*3*3)) for a 3x3 kernel with 64 input filters; this value will ensure that the output has unit stddev if the input has unit stddev.
bias-stddev
    Standard deviation of bias terms. default=0.0.
init-unit
    Defaults to false. If true, it is required that num-filters-in equal num-filters-out and there should exist a (height, time) offset in the model equal to (0, 0). We will initialize the parameter matrix to be equivalent to the identity transform. In this case, param-stddev is ignored.
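
The default for param-stddev can be written out as a small calculation. The sketch below only restates the documented formula; the function name DefaultParamStddev is hypothetical. The reasoning is that each output sums N = num-filters-in * num-height-offsets * num-time-offsets products, so weights with variance 1/N give roughly unit output variance when the inputs have unit variance and are treated as independent.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not a Kaldi function) restating the documented default:
// sqrt(1.0 / (num-filters-in * num-height-offsets * num-time-offsets)).
inline double DefaultParamStddev(int num_filters_in, int num_height_offsets,
                                 int num_time_offsets) {
  assert(num_filters_in > 0 && num_height_offsets > 0 && num_time_offsets > 0);
  const double fan_in = static_cast<double>(num_filters_in) *
                        num_height_offsets * num_time_offsets;
  return std::sqrt(1.0 / fan_in);
}
```

For the 3x3 kernel with 64 input filters mentioned above, the fan-in is 576 and the default standard deviation is 1/24.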

Natural-gradient related options are below; you won't normally have to set these.

use-natural-gradient
    E.g. use-natural-gradient=false (defaults to true). You can set this to false to disable the natural gradient updates (you won't normally want to do this).
rank-out
    Rank used in the low-rank-plus-unit estimate of the Fisher-matrix factor that has the dimension (num-rows of the parameter space), which equals num-filters-out. It defaults to the minimum of 80, or half of the number of output filters.
rank-in
    Rank used in the low-rank-plus-unit estimate of the Fisher-matrix factor that has the dimension (num-cols of the parameter matrix), i.e. (num-input-filters * number of time-offsets * number of height-offsets + 1), e.g. num-input-filters * 3 * 3 + 1 for a 3x3 kernel (the +1 is for the bias term). It defaults to the minimum of 80, or half the num-rows of the parameter matrix. [note: I'm considering decreasing this default to e.g. 40 or 20].
num-minibatches-history
    This is used for setting the 'num_samples_history_in' configuration value of the natural gradient object. There is no concept of samples (frames) in the application of natural gradient to the convnet, because we do it all on the rows and columns of the derivative. default=4.0. A larger value means the Fisher matrix is averaged over more minibatches (it's an exponential-decay thing).
alpha-out
    Constant that determines how much we smooth the Fisher-matrix factors with the unit matrix, for the space of dimension num-filters-out. default=4.0.
alpha-in
    Constant that determines how much we smooth the Fisher-matrix factors with the unit matrix, for the space of dimension (num-filters-in * num-time-offsets * num-height-offsets + 1). default=4.0.

Example of a 3x3 kernel with no subsampling, and with zero-padding on both the height and time axes, and where there has previously been no subsampling on the time axis:

num-filters-in=32 num-filters-out=64 height-in=28 height-out=28 \
  height-subsample-out=1 height-offsets=-1,0,1 time-offsets=-1,0,1 \
  required-time-offsets=0
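
To see how much zero-padding such a config implies on the height axis, one can compute which input heights the lowest and highest output heights would read. The sketch below assumes that output height h reads input heights h * height-subsample-out + offset; that relation and the names HeightPadding and ImpliedHeightPadding are assumptions made for illustration, not Kaldi API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct HeightPadding {
  int bottom;  // zero rows implied below input height 0
  int top;     // zero rows implied above input height (height_in - 1)
};

// Hypothetical helper (not a Kaldi function). Assumes output height h reads
// input heights h * height_subsample_out + offset for each height offset.
inline HeightPadding ImpliedHeightPadding(
    int height_in, int height_out, int height_subsample_out,
    const std::vector<int> &height_offsets) {
  assert(height_in > 0 && height_out > 0 && height_subsample_out > 0);
  assert(!height_offsets.empty());
  const int min_off =
      *std::min_element(height_offsets.begin(), height_offsets.end());
  const int max_off =
      *std::max_element(height_offsets.begin(), height_offsets.end());
  // Lowest input height touched (by output height 0) and highest input
  // height touched (by output height height_out - 1).
  const int lowest = min_off;
  const int highest = (height_out - 1) * height_subsample_out + max_off;
  HeightPadding padding;
  padding.bottom = std::max(0, -lowest);
  padding.top = std::max(0, highest - (height_in - 1));
  return padding;
}
```

For height-in=28, height-out=28 and height-offsets=-1,0,1 this gives one padded row at the bottom and one at the top, matching the description of height-offsets.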

Example of a 3x3 kernel with no subsampling, without zero-padding on either axis, and where there has *previously* been 2-fold subsampling on the time axis:

num-filters-in=32 num-filters-out=64 height-in=20 height-out=18 \
  height-subsample-out=1 height-offsets=0,1,2 time-offsets=0,2,4

[note: above, the choice to have the time-offsets start at zero rather than be centered is just a choice: it assumes that at the output of the network you would want to request indexes with t=0, while at the input the t values start from zero.]

Example of a 3x3 kernel with subsampling on the height axis, without zero-padding on either axis, and where there has previously been 2-fold subsampling on the time axis:

num-filters-in=32 num-filters-out=64 height-in=20 height-out=9 \
  height-subsample-out=2 height-offsets=0,1,2 time-offsets=0,2,4
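
The height-out values in the unpadded examples (18 and 9 for height-in=20) follow from requiring that the highest output row still reads only valid input rows. A minimal sketch, assuming non-negative height offsets and that output height h reads input heights h * height-subsample-out + offset; MaxHeightOutNoPadding is a hypothetical name, not a Kaldi function.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not a Kaldi function): the largest height-out for
// which no zero-padding is needed, assuming all height offsets are >= 0 and
// output height h reads input heights h * height_subsample_out + offset.
inline int MaxHeightOutNoPadding(int height_in, int height_subsample_out,
                                 const std::vector<int> &height_offsets) {
  assert(height_in > 0 && height_subsample_out > 0);
  assert(!height_offsets.empty());
  const int min_off =
      *std::min_element(height_offsets.begin(), height_offsets.end());
  const int max_off =
      *std::max_element(height_offsets.begin(), height_offsets.end());
  assert(min_off >= 0 && max_off < height_in);
  // Need (height_out - 1) * subsample + max_off <= height_in - 1.
  return (height_in - 1 - max_off) / height_subsample_out + 1;
}
```

With height-in=20 and height-offsets=0,1,2 this yields 18 for height-subsample-out=1 and 9 for height-subsample-out=2, the values used in the examples.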

[note: subsampling on the time axis is not expressed in the layer itself: any time you increase the distance between time-offsets, like changing them from 0,1,2 to 0,2,4, you are effectively subsampling the previous layer, assuming you only request the output at one time value or at multiples of the total subsampling factor.]
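
The implicit time subsampling can be illustrated by tracing which 't' values each layer needs, working back from the requested output. The helper below is a hypothetical sketch (RequiredInputTimes is not a Kaldi function); it simply adds every time offset to every requested output time.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper (not a Kaldi function): given the 't' values requested
// at a layer's output and that layer's time-offsets, return the 't' values
// needed at its input.
inline std::set<int> RequiredInputTimes(const std::set<int> &output_times,
                                        const std::vector<int> &time_offsets) {
  assert(!time_offsets.empty());
  std::set<int> input_times;
  for (int t : output_times) {
    for (int offset : time_offsets) {
      input_times.insert(t + offset);
    }
  }
  return input_times;
}
```

Requesting t=0 from a layer with time-offsets=0,2,4 requires its input only at t=0,2,4, so the layer below is evaluated at every other frame; if that lower layer uses time-offsets=0,1,2, it in turn needs the network input at t=0 through t=6.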

Example of a 1x1 kernel:

num-filters-in=64 num-filters-out=64 height-in=20 height-out=20 \
  height-subsample-out=1 height-offsets=0 time-offsets=0

Definition at line 207 of file nnet-convolutional-component.h.

Constructor & Destructor Documentation

Definition at line 35 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::Check().

TimeHeightConvolutionComponent::TimeHeightConvolutionComponent(
    const TimeHeightConvolutionComponent &other):
    UpdatableComponent(other),  // initialize base-class
    model_(other.model_),
    all_time_offsets_(other.all_time_offsets_),
    time_offset_required_(other.time_offset_required_),
    linear_params_(other.linear_params_),
    bias_params_(other.bias_params_),
    max_memory_mb_(other.max_memory_mb_),
    use_natural_gradient_(other.use_natural_gradient_),
    num_minibatches_history_(other.num_minibatches_history_),
    preconditioner_in_(other.preconditioner_in_),
    preconditioner_out_(other.preconditioner_out_) {
  Check();
}

Member Function Documentation

virtual void Add (BaseFloat alpha, const Component &other)

This virtual function, when called by an UpdatableComponent, adds the parameters of another updatable component, times some constant, to the current parameters. When called by a NonlinearComponent, it relates to adding stats. Otherwise it should do nothing.

Reimplemented from Component.

Definition at line 572 of file nnet-convolutional-component.cc.

References CuMatrixBase< Real >::AddMat(), CuVectorBase< Real >::AddVec(), TimeHeightConvolutionComponent::bias_params_, KALDI_ASSERT, and TimeHeightConvolutionComponent::linear_params_.

void TimeHeightConvolutionComponent::Add(BaseFloat alpha,
                                         const Component &other_in) {
  // The other component must be of the same type.
  const TimeHeightConvolutionComponent *other =
      dynamic_cast<const TimeHeightConvolutionComponent*>(&other_in);
  KALDI_ASSERT(other != NULL);
  // *this += alpha * other, for both the filter and bias parameters.
  linear_params_.AddMat(alpha, other->linear_params_);
  bias_params_.AddVec(alpha, other->bias_params_);
}
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in] debug_info: The component name, to be printed out in any warning messages.
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in_value: The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in] out_value: The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in] out_deriv: The derivative at the output of this component.
[in] memo: This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out] to_update: If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out] in_deriv: The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 277 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::PrecomputedIndexes::computation, kaldi::nnet3::time_height_convolution::ConvolveBackwardData(), UpdatableComponent::is_gradient_, KALDI_ASSERT, UpdatableComponent::learning_rate_, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::UpdateNaturalGradient(), TimeHeightConvolutionComponent::UpdateSimple(), and TimeHeightConvolutionComponent::use_natural_gradient_.

void TimeHeightConvolutionComponent::Backprop(
    const std::string &debug_info,
    const ComponentPrecomputedIndexes *indexes_in,
    const CuMatrixBase<BaseFloat> &in_value,
    const CuMatrixBase<BaseFloat> &out_value,
    const CuMatrixBase<BaseFloat> &out_deriv,
    void *memo,
    Component *to_update_in,
    CuMatrixBase<BaseFloat> *in_deriv) const {
  const PrecomputedIndexes *indexes =
      dynamic_cast<const PrecomputedIndexes*>(indexes_in);
  KALDI_ASSERT(indexes != NULL);

  // Propagate the derivative back to the input, if requested.
  if (in_deriv != NULL) {
    ConvolveBackwardData(indexes->computation, linear_params_,
                         out_deriv, in_deriv);
  }
  // Update the model parameters, if requested.
  if (to_update_in != NULL) {
    TimeHeightConvolutionComponent *to_update =
        dynamic_cast<TimeHeightConvolutionComponent*>(to_update_in);
    KALDI_ASSERT(to_update != NULL);

    if (to_update->learning_rate_ == 0.0)
      return;

    if (to_update->is_gradient_ || !to_update->use_natural_gradient_)
      to_update->UpdateSimple(*indexes, in_value, out_deriv);
    else
      to_update->UpdateNaturalGradient(*indexes, in_value, out_deriv);
  }
}
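
The choice between the two update routines in Backprop() can be summarized as a small decision function. This is an illustrative restatement of the branch shown in the listing; UpdateKind and ChooseUpdate are hypothetical names and do not exist in Kaldi.

```cpp
#include <cassert>

enum class UpdateKind { kNone, kSimple, kNaturalGradient };

// Hypothetical summary (not a Kaldi function) of the update branch in
// Backprop(): no update without a target or with a zero learning rate;
// the simple update when accumulating a gradient or when natural gradient
// is disabled; the natural-gradient update otherwise.
inline UpdateKind ChooseUpdate(bool have_to_update, double learning_rate,
                               bool is_gradient, bool use_natural_gradient) {
  if (!have_to_update || learning_rate == 0.0) return UpdateKind::kNone;
  if (is_gradient || !use_natural_gradient) return UpdateKind::kSimple;
  return UpdateKind::kNaturalGradient;
}
```

Note that a component being treated as a gradient (is_gradient_ true) always takes the simple path, even when use-natural-gradient is left at its default of true.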
void Check ( )
private

Definition at line 52 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, ConvolutionModel::Check(), CuVectorBase< Real >::Dim(), KALDI_ASSERT, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::model_, ConvolutionModel::num_filters_out, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), ConvolutionModel::ParamCols(), and ConvolutionModel::ParamRows().

Referenced by TimeHeightConvolutionComponent::Read(), and TimeHeightConvolutionComponent::TimeHeightConvolutionComponent().

void TimeHeightConvolutionComponent::Check() {
  model_.Check();
  // The parameter dimensions must agree with the convolution model.
  KALDI_ASSERT(bias_params_.Dim() == model_.num_filters_out &&
               linear_params_.NumRows() == model_.ParamRows() &&
               linear_params_.NumCols() == model_.ParamCols());
}
void ComputeDerived ( )
private

Definition at line 472 of file nnet-convolutional-component.cc.

References ConvolutionModel::all_time_offsets, TimeHeightConvolutionComponent::all_time_offsets_, rnnlm::i, TimeHeightConvolutionComponent::model_, ConvolutionModel::required_time_offsets, and TimeHeightConvolutionComponent::time_offset_required_.

Referenced by TimeHeightConvolutionComponent::InitFromConfig(), and TimeHeightConvolutionComponent::Read().

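void TimeHeightConvolutionComponent::ComputeDerived() {
  // Copy the model's set of time offsets into a vector.
  all_time_offsets_.clear();
  all_time_offsets_.insert(
      all_time_offsets_.end(),
      model_.all_time_offsets.begin(),
      model_.all_time_offsets.end());
  // Record, for each time offset, whether it is a required one.
  time_offset_required_.resize(all_time_offsets_.size());
  for (size_t i = 0; i < all_time_offsets_.size(); i++) {
    time_offset_required_[i] =
        (model_.required_time_offsets.count(all_time_offsets_[i]) > 0);
  }
}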
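
As a stand-alone illustration of what ComputeDerived() sets up, the sketch below derives a per-offset "required" flag from a set of required time offsets. It assumes that time_offset_required_[i] is true exactly when all_time_offsets_[i] appears in required-time-offsets; TimeOffsetRequired is a hypothetical name, not a Kaldi function.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper (not a Kaldi function): for each time offset, report
// whether it is in the set of required time offsets. The assumed meaning is
// that non-required offsets may be zero-padded when input is unavailable.
inline std::vector<bool> TimeOffsetRequired(
    const std::vector<int> &all_time_offsets,
    const std::set<int> &required_time_offsets) {
  std::vector<bool> required(all_time_offsets.size());
  for (std::size_t i = 0; i < all_time_offsets.size(); i++) {
    required[i] = required_time_offsets.count(all_time_offsets[i]) > 0;
  }
  return required;
}
```

With time-offsets=-1,0,1 and required-time-offsets=0, only the middle offset is marked as required.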
virtual Component* Copy ( ) const
inline virtual

Copies component (deep copy).

Implements Component.

Definition at line 247 of file nnet-convolutional-component.h.

References TimeHeightConvolutionComponent::TimeHeightConvolutionComponent().

virtual BaseFloat DotProduct (const UpdatableComponent &other) const

Computes dot-product between parameters of two instances of a Component.

Can be used for computing parameter-norm of an UpdatableComponent.

Implements UpdatableComponent.

Definition at line 591 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, KALDI_ASSERT, kaldi::kTrans, TimeHeightConvolutionComponent::linear_params_, kaldi::TraceMatMat(), and kaldi::VecVec().

BaseFloat TimeHeightConvolutionComponent::DotProduct(
    const UpdatableComponent &other_in) const {
  const TimeHeightConvolutionComponent *other =
      dynamic_cast<const TimeHeightConvolutionComponent*>(&other_in);
  KALDI_ASSERT(other != NULL);
  // trace(A B^T) is the sum of elementwise products of the two matrices.
  return TraceMatMat(linear_params_, other->linear_params_, kTrans) +
      VecVec(bias_params_, other->bias_params_);
}
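
The quantity returned by DotProduct() is the sum of elementwise products over the filter parameters plus the same over the biases, since trace(A B^T) equals the sum over i,j of A(i,j) * B(i,j). The sketch below shows this with plain std::vector containers; the type alias Matrix and both function names are hypothetical stand-ins, not the Kaldi CuMatrix API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for TraceMatMat(A, B, kTrans): trace(A * B^T),
// computed as the sum of elementwise products.
inline double TraceMatMatTrans(const Matrix &a, const Matrix &b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); i++) {
    assert(a[i].size() == b[i].size());
    for (std::size_t j = 0; j < a[i].size(); j++) sum += a[i][j] * b[i][j];
  }
  return sum;
}

// Hypothetical stand-in for the component-level dot product: filter
// parameters plus bias parameters.
inline double ParamDotProduct(const Matrix &linear_a,
                              const std::vector<double> &bias_a,
                              const Matrix &linear_b,
                              const std::vector<double> &bias_b) {
  assert(bias_a.size() == bias_b.size());
  double sum = TraceMatMatTrans(linear_a, linear_b);
  for (std::size_t i = 0; i < bias_a.size(); i++) sum += bias_a[i] * bias_b[i];
  return sum;
}
```

Taking the dot product of a component's parameters with themselves gives the squared parameter norm, which is the use mentioned in the description above.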
void FreezeNaturalGradient ( bool  freeze)
virtual

Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient).

Reimplemented from UpdatableComponent.

Definition at line 623 of file nnet-convolutional-component.cc.

References OnlineNaturalGradient::Freeze(), TimeHeightConvolutionComponent::preconditioner_in_, and TimeHeightConvolutionComponent::preconditioner_out_.

virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const

This function only does something interesting for non-simple Components.

For a given index at the output of the component, tells us what indexes are required at its input (note: "required" encompasses also optionally-required things; it will enumerate all things that we'd like to have). See also IsComputable().

Parameters
[in] misc_info: This argument is supplied to handle things that the framework can't very easily supply: information like which time indexes are needed for AggregateComponent, which time-indexes are available at the input of a recurrent network, and so on. We will add members to misc_info as needed.
[in] output_index: The Index at the output of the component, for which we are requesting the list of indexes at the component's input.
[out] desired_indexes: A list of indexes that are desired at the input, to be written to here. By "desired" we mean required or optionally-required.

The default implementation of this function is suitable for any SimpleComponent; it just copies the output_index to a single identical element in input_indexes.

Reimplemented from Component.

Definition at line 485 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::all_time_offsets_, rnnlm::i, KALDI_ASSERT, kaldi::nnet3::kNoTime, Index::n, Index::t, and Index::x.

void TimeHeightConvolutionComponent::GetInputIndexes(
    const MiscComputationInfo &misc_info,
    const Index &output_index,
    std::vector<Index> *desired_indexes) const {
  KALDI_ASSERT(output_index.t != kNoTime);
  size_t size = all_time_offsets_.size();
  desired_indexes->resize(size);
  // One desired input index per time offset; n and x are copied unchanged.
  for (size_t i = 0; i < size; i++) {
    (*desired_indexes)[i].n = output_index.n;
    (*desired_indexes)[i].t = output_index.t + all_time_offsets_[i];
    (*desired_indexes)[i].x = output_index.x;
  }
}
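
The index arithmetic in GetInputIndexes() can be tried out in isolation. The sketch below uses a hypothetical SketchIndex struct as a stand-in for the nnet3 Index type (only its n, t and x members are assumed) and a free function in place of the member function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the nnet3 Index type; only n, t, x are modeled.
struct SketchIndex {
  int n;  // sequence (minibatch member) index
  int t;  // time index
  int x;  // extra index
};

// Hypothetical free-function version of the logic in GetInputIndexes():
// one desired input index per time offset, with n and x copied unchanged.
inline std::vector<SketchIndex> DesiredInputIndexes(
    const SketchIndex &output_index,
    const std::vector<int> &all_time_offsets) {
  std::vector<SketchIndex> desired(all_time_offsets.size());
  for (std::size_t i = 0; i < all_time_offsets.size(); i++) {
    desired[i].n = output_index.n;
    desired[i].t = output_index.t + all_time_offsets[i];
    desired[i].x = output_index.x;
  }
  return desired;
}
```

For an output index with t=10 and time offsets -1,0,1, the desired input indexes have t=9, 10 and 11, mirroring the height example given for height-offsets.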
std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from UpdatableComponent.

Definition at line 67 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, OnlineNaturalGradient::GetAlpha(), OnlineNaturalGradient::GetRank(), ConvolutionModel::Info(), UpdatableComponent::Info(), TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::max_memory_mb_, TimeHeightConvolutionComponent::model_, TimeHeightConvolutionComponent::num_minibatches_history_, TimeHeightConvolutionComponent::NumParameters(), TimeHeightConvolutionComponent::preconditioner_in_, TimeHeightConvolutionComponent::preconditioner_out_, kaldi::nnet3::PrintParameterStats(), and TimeHeightConvolutionComponent::use_natural_gradient_.

std::string TimeHeightConvolutionComponent::Info() const {
  std::ostringstream stream;
  // The output of model_.Info() has been designed to be suitable
  // as a component-level info string, it has
  // {num-filters,height}-{in-out}, offsets=[...], required-time-offsets=[...],
  // {input,output}-dim.
  stream << UpdatableComponent::Info() << ' ' << model_.Info();
  PrintParameterStats(stream, "filter-params", linear_params_);
  PrintParameterStats(stream, "bias-params", bias_params_, true);
  stream << ", num-params=" << NumParameters()
         << ", max-memory-mb=" << max_memory_mb_
         << ", use-natural-gradient=" << use_natural_gradient_;
  // The natural-gradient settings are only reported when they are in use.
  if (use_natural_gradient_) {
    stream << ", num-minibatches-history=" << num_minibatches_history_
           << ", rank-in=" << preconditioner_in_.GetRank()
           << ", rank-out=" << preconditioner_out_.GetRank()
           << ", alpha-in=" << preconditioner_in_.GetAlpha()
           << ", alpha-out=" << preconditioner_out_.GetAlpha();
  }
  return stream.str();
}
virtual void InitFromConfig (ConfigLine *cfl)

Initialize, from a ConfigLine object.

Parameters
[in] cfl: A ConfigLine containing any parameters that are needed for initialization. For example: "dim=100 param-stddev=0.1"

Implements Component.

Definition at line 116 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, ConvolutionModel::Check(), ConvolutionModel::ComputeDerived(), TimeHeightConvolutionComponent::ComputeDerived(), ConfigLine::GetValue(), ConvolutionModel::height_in, ConvolutionModel::Offset::height_offset, ConvolutionModel::height_out, ConvolutionModel::height_subsample_out, rnnlm::i, UpdatableComponent::InitLearningRatesFromConfig(), TimeHeightConvolutionComponent::InitUnit(), kaldi::IsSortedAndUniq(), rnnlm::j, KALDI_ASSERT, KALDI_ERR, KALDI_WARN, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::max_memory_mb_, TimeHeightConvolutionComponent::model_, ConvolutionModel::num_filters_in, ConvolutionModel::num_filters_out, TimeHeightConvolutionComponent::num_minibatches_history_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), ConvolutionModel::offsets, ConvolutionModel::ParamCols(), ConvolutionModel::ParamRows(), TimeHeightConvolutionComponent::preconditioner_in_, TimeHeightConvolutionComponent::preconditioner_out_, ConvolutionModel::required_time_offsets, CuVector< Real >::Resize(), CuMatrix< Real >::Resize(), CuVectorBase< Real >::Scale(), CuMatrixBase< Real >::Scale(), OnlineNaturalGradient::SetAlpha(), OnlineNaturalGradient::SetNumSamplesHistory(), CuVectorBase< Real >::SetRandn(), CuMatrixBase< Real >::SetRandn(), OnlineNaturalGradient::SetRank(), kaldi::SplitStringToIntegers(), ConvolutionModel::Offset::time_offset, TimeHeightConvolutionComponent::use_natural_gradient_, and ConfigLine::WholeLine().

116  {
117  // 1. Config values inherited from UpdatableComponent.
119 
120  // 2. convolution-related config values.
121  model_.height_subsample_out = 1; // default.
122  max_memory_mb_ = 200.0;
123  std::string height_offsets, time_offsets, required_time_offsets = "undef";
124 
125  bool ok = cfl->GetValue("num-filters-in", &model_.num_filters_in) &&
126  cfl->GetValue("num-filters-out", &model_.num_filters_out) &&
127  cfl->GetValue("height-in", &model_.height_in) &&
128  cfl->GetValue("height-out", &model_.height_out) &&
129  cfl->GetValue("height-offsets", &height_offsets) &&
130  cfl->GetValue("time-offsets", &time_offsets);
131  if (!ok) {
132  KALDI_ERR << "Bad initializer: expected all the values "
133  "num-filters-in, num-filters-out, height-in, height-out, "
134  "height-offsets, time-offsets to be defined: "
135  << cfl->WholeLine();
136  }
137  // some optional structural configs.
138  cfl->GetValue("required-time-offsets", &required_time_offsets);
139  cfl->GetValue("height-subsample-out", &model_.height_subsample_out);
140  cfl->GetValue("max-memory-mb", &max_memory_mb_);
142  { // This block attempts to parse height_offsets, time_offsets
143  // and required_time_offsets.
144  std::vector<int32> height_offsets_vec,
145  time_offsets_vec, required_time_offsets_vec;
146  if (!SplitStringToIntegers(height_offsets, ",", false,
147  &height_offsets_vec) ||
148  !SplitStringToIntegers(time_offsets, ",", false,
149  &time_offsets_vec)) {
150  KALDI_ERR << "Formatting problem in time-offsets or height-offsets: "
151  << cfl->WholeLine();
152  }
153  if (height_offsets_vec.empty() || !IsSortedAndUniq(height_offsets_vec) ||
154  time_offsets_vec.empty() || !IsSortedAndUniq(time_offsets_vec)) {
155  KALDI_ERR << "Options time-offsets and height-offsets must be nonempty, "
156  "sorted and unique.";
157  }
158  if (required_time_offsets == "undef") {
159  required_time_offsets_vec = time_offsets_vec;
160  } else {
161  if (!SplitStringToIntegers(required_time_offsets, ",", false,
162  &required_time_offsets_vec) ||
163  required_time_offsets_vec.empty() ||
164  !IsSortedAndUniq(required_time_offsets_vec)) {
165  KALDI_ERR << "Formatting problem in required-time-offsets: "
166  << cfl->WholeLine();
167  }
168  }
169  model_.offsets.clear();
170  for (size_t i = 0; i < time_offsets_vec.size(); i++) {
171  for (size_t j = 0; j < height_offsets_vec.size(); j++) {
172  time_height_convolution::ConvolutionModel::Offset offset;
173  offset.time_offset = time_offsets_vec[i];
174  offset.height_offset = height_offsets_vec[j];
175  model_.offsets.push_back(offset);
176  }
177  }
178  model_.required_time_offsets.clear();
179  model_.required_time_offsets.insert(
180  required_time_offsets_vec.begin(),
181  required_time_offsets_vec.end());
182  }
183 
184  model_.ComputeDerived();
185  if (!model_.Check(false, true)) {
186  KALDI_ERR << "Parameters used to initialize TimeHeightConvolutionComponent "
187  << "do not make sense, line was: " << cfl->WholeLine();
188  }
189  if (!model_.Check(true, true)) {
190  KALDI_WARN << "There are input heights unused in "
191  "TimeHeightConvolutionComponent; consider increasing output "
192  "height or decreasing height of preceding layer."
193  << cfl->WholeLine();
194  }
195 
196  // 3. Parameter-initialization configs.
197  BaseFloat param_stddev = -1, bias_stddev = 0.0;
198  bool init_unit = false;
199  cfl->GetValue("param-stddev", &param_stddev);
200  cfl->GetValue("bias-stddev", &bias_stddev);
201  cfl->GetValue("init-unit", &init_unit);
202  if (param_stddev < 0.0) {
203  param_stddev = 1.0 / sqrt(model_.num_filters_in *
204  model_.offsets.size());
205  }
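206  // initialize the parameters.
207  linear_params_.Resize(model_.ParamRows(), model_.ParamCols());
208  if (!init_unit) {
209  linear_params_.SetRandn();
210  linear_params_.Scale(param_stddev);
211  } else {
212  InitUnit();
213  }
214  bias_params_.Resize(model_.num_filters_out);
215  bias_params_.SetRandn();
216  bias_params_.Scale(bias_stddev);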
217 
218 
219  // 4. Natural-gradient related configs.
220  use_natural_gradient_ = true;
222  int32 rank_out = -1, rank_in = -1;
223  BaseFloat alpha_out = 4.0, alpha_in = 4.0;
224  cfl->GetValue("use-natural-gradient", &use_natural_gradient_);
225  cfl->GetValue("rank-in", &rank_in);
226  cfl->GetValue("rank-out", &rank_out);
227  cfl->GetValue("alpha-in", &alpha_in);
228  cfl->GetValue("alpha-out", &alpha_out);
229  cfl->GetValue("num-minibatches-history", &num_minibatches_history_);
230 
231  preconditioner_in_.SetAlpha(alpha_in);
232  preconditioner_out_.SetAlpha(alpha_out);
233  int32 dim_in = linear_params_.NumCols() + 1,
234  dim_out = linear_params_.NumRows();
235  if (rank_in < 0) {
236  rank_in = std::min<int32>(80, (dim_in + 1) / 2);
237  preconditioner_in_.SetRank(rank_in);
238  }
239  if (rank_out < 0) {
240  rank_out = std::min<int32>(80, (dim_out + 1) / 2);
241  preconditioner_out_.SetRank(rank_out);
242  }
243  // the swapping of in and out in the lines below is intentional. the num-rows
244  // of the matrix that we give to preconditioner_in_ to precondition is
245  // dim-out, and the num-rows of the matrix we give to preconditioner_out_ to
246  // precondition is dim-in. the preconditioner objects treat these rows
247  // as separate samples, e.g. separate frames, even though they actually
248  // correspond to a different dimension in the parameter space.
249  preconditioner_in_.SetNumSamplesHistory(dim_out * num_minibatches_history_);
250  preconditioner_out_.SetNumSamplesHistory(dim_in * num_minibatches_history_);
251 
252  preconditioner_in_.SetAlpha(alpha_in);
253  preconditioner_out_.SetAlpha(alpha_out);
254 
255  ComputeDerived();
256 }
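The offset handling and the default parameter standard deviation in the listing above can be illustrated outside of Kaldi. The following is a minimal sketch, not the Kaldi implementation: `ParseIntList`, `IsStrictlyIncreasing`, `BuildOffsets` and `DefaultParamStddev` are hypothetical stand-ins written for this example, and errors are reported with a `false` return where the listing calls KALDI_ERR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for SplitStringToIntegers(): parses "a,b,c" into ints.
// Returns false if a field is empty or is not a valid integer.
bool ParseIntList(const std::string &s, std::vector<int> *out) {
  out->clear();
  std::stringstream ss(s);
  std::string field;
  while (std::getline(ss, field, ',')) {
    if (field.empty()) return false;
    std::size_t pos = 0;
    int value = 0;
    try {
      value = std::stoi(field, &pos);
    } catch (...) {
      return false;
    }
    if (pos != field.size()) return false;  // trailing junk such as "1x"
    out->push_back(value);
  }
  return true;
}

// True if v is sorted and contains each element only once.
bool IsStrictlyIncreasing(const std::vector<int> &v) {
  for (std::size_t i = 1; i < v.size(); i++)
    if (v[i - 1] >= v[i]) return false;
  return true;
}

// Builds (time_offset, height_offset) pairs in the same order as the nested
// loop in the listing: time offsets outermost, height offsets innermost.
bool BuildOffsets(const std::string &time_offsets,
                  const std::string &height_offsets,
                  std::vector<std::pair<int, int>> *offsets) {
  std::vector<int> t, h;
  if (!ParseIntList(time_offsets, &t) || !ParseIntList(height_offsets, &h))
    return false;
  if (t.empty() || h.empty() ||
      !IsStrictlyIncreasing(t) || !IsStrictlyIncreasing(h))
    return false;
  offsets->clear();
  for (int ti : t)
    for (int hi : h) offsets->push_back({ti, hi});
  return true;
}

// Default used by the listing when param-stddev is left negative (unset).
double DefaultParamStddev(int num_filters_in, std::size_t num_offsets) {
  assert(num_filters_in > 0 && num_offsets > 0);
  return 1.0 / std::sqrt(static_cast<double>(num_filters_in) * num_offsets);
}
```

With time-offsets=-1,0,1 and height-offsets=-1,0,1 this yields nine offsets, and with num-filters-in=4 the default standard deviation is 1/sqrt(36).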
void InitUnit ( )
private

Definition at line 90 of file nnet-convolutional-component.cc.

References rnnlm::i, KALDI_ASSERT, KALDI_ERR, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::model_, ConvolutionModel::num_filters_in, ConvolutionModel::num_filters_out, CuMatrixBase< Real >::NumRows(), and ConvolutionModel::offsets.

Referenced by TimeHeightConvolutionComponent::InitFromConfig().

90  {
91  if (model_.num_filters_in != model_.num_filters_out) {
92  KALDI_ERR << "You cannot specify init-unit if the num-filters-in "
93  << "and num-filters-out differ.";
94  }
95  size_t i;
96  int32 zero_offset = 0;
97  for (i = 0; i < model_.offsets.size(); i++) {
98  if (model_.offsets[i].time_offset == 0 &&
99  model_.offsets[i].height_offset == 0) {
100  zero_offset = i;
101  break;
102  }
103  }
104  if (i == model_.offsets.size()) // did not break.
105  KALDI_ERR << "You cannot specify init-unit if the model does "
106  << "not have the offset (0, 0).";
107 
108  CuSubMatrix<BaseFloat> zero_offset_block(
109  linear_params_, 0, linear_params_.NumRows(),
110  zero_offset * model_.num_filters_in, model_.num_filters_in);
111 
112  KALDI_ASSERT(zero_offset_block.NumRows() == zero_offset_block.NumCols());
113  zero_offset_block.AddToDiag(1.0); // set this block to the unit matrix.
114 }
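The effect of InitUnit() can be sketched on a plain row-major array. This is an illustration under assumptions, not Kaldi code: `InitUnitSketch` is a hypothetical name, the matrix is assumed to have num_filters_out rows and num_offsets * num_filters_in columns, and the function starts from an all-zero matrix and returns `false` where the listing calls KALDI_ERR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Writes an identity block into the columns that belong to the (0, 0) offset.
// zero_offset_index is the position of that offset in the offsets list.
bool InitUnitSketch(int num_filters_in, int num_filters_out,
                    int num_offsets, int zero_offset_index,
                    std::vector<float> *params) {
  assert(params != nullptr);
  if (num_filters_in != num_filters_out) return false;
  if (zero_offset_index < 0 || zero_offset_index >= num_offsets) return false;
  const int num_cols = num_offsets * num_filters_in;
  params->assign(static_cast<std::size_t>(num_filters_out) * num_cols, 0.0f);
  const int col_start = zero_offset_index * num_filters_in;
  // Add 1.0 on the diagonal of the square block starting at col_start.
  for (int r = 0; r < num_filters_out; r++)
    (*params)[static_cast<std::size_t>(r) * num_cols + col_start + r] += 1.0f;
  return true;
}
```

With this initialization each output filter initially copies the matching input filter at the same time and height, which is why the listing requires equal filter counts and the presence of the (0, 0) offset.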
int32 InputDim ( ) const
virtual

Returns input-dimension of this component.

Implements Component.

Definition at line 59 of file nnet-convolutional-component.cc.

References ConvolutionModel::InputDim(), and TimeHeightConvolutionComponent::model_.

59  {
60  return model_.InputDim();
61 }
bool IsComputable ( const MiscComputationInfo &  misc_info,
const Index &  output_index,
const IndexSet &  input_index_set,
std::vector< Index > *  used_inputs 
) const
virtual

This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs.

It tells the user whether a given output index is computable from a given set of input indexes, and if so, says which input indexes will be used in the computation.

Implementations of this function are required to have the property that adding an element to "input_index_set" can only ever change IsComputable from false to true, never vice versa.

Parameters
[in] misc_info: Some information specific to the computation, such as minimum and maximum times for certain components to do adaptation on; it's a place to put things that don't easily fit in the framework.
[in] output_index: The index that is to be computed at the output of this Component.
[in] input_index_set: The set of indexes that is available at the input of this Component.
[out] used_inputs: If this is non-NULL and the output is computable this will be set to the list of input indexes that will actually be used in the computation.
Returns
Returns true iff this output is computable from the provided inputs.

The default implementation of this function is suitable for any SimpleComponent: it just returns true if output_index is in input_index_set, and if so sets used_inputs to a vector containing that one Index.

Reimplemented from Component.

Definition at line 500 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::all_time_offsets_, rnnlm::i, KALDI_ASSERT, kaldi::nnet3::kNoTime, Index::t, and TimeHeightConvolutionComponent::time_offset_required_.

504  {
505  KALDI_ASSERT(output_index.t != kNoTime);
506  size_t size = all_time_offsets_.size();
507  Index index(output_index);
508  if (used_inputs != NULL) {
509  used_inputs->clear();
510  used_inputs->reserve(size);
511  for (size_t i = 0; i < size; i++) {
512  index.t = output_index.t + all_time_offsets_[i];
513  if (input_index_set(index)) {
514  // This input index is available.
515  used_inputs->push_back(index);
516  } else {
517  // This input index is not available.
518  if (time_offset_required_[i]) {
519  // A required offset was not present -> this output index is not
520  // computable.
521  used_inputs->clear();
522  return false;
523  }
524  }
525  }
526  // All required time-offsets of the output were computable. -> return true.
527  return true;
528  } else {
529  for (size_t i = 0; i < size; i++) {
530  if (time_offset_required_[i]) {
531  index.t = output_index.t + all_time_offsets_[i];
532  if (!input_index_set(index))
533  return false;
534  }
535  }
536  return true;
537  }
538 }
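The required versus optional time-offset logic in the listing above can be tried out with a simplified model in which an index is reduced to its time value. This is a sketch under that assumption: `IsComputableSketch` is a hypothetical name, and a `std::set<int>` of available times stands in for IndexSet.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Returns true iff every required offset of output_t is available.
// If used_inputs is non-null it receives the available input times (required
// or optional) on success, and is left empty on failure.
bool IsComputableSketch(int output_t,
                        const std::set<int> &available_t,
                        const std::vector<int> &all_time_offsets,
                        const std::vector<bool> &time_offset_required,
                        std::vector<int> *used_inputs) {
  assert(all_time_offsets.size() == time_offset_required.size());
  if (used_inputs != nullptr) used_inputs->clear();
  for (std::size_t i = 0; i < all_time_offsets.size(); i++) {
    const int t = output_t + all_time_offsets[i];
    if (available_t.count(t) != 0) {
      if (used_inputs != nullptr) used_inputs->push_back(t);
    } else if (time_offset_required[i]) {
      // A required offset is missing, so the output is not computable.
      if (used_inputs != nullptr) used_inputs->clear();
      return false;
    }
  }
  return true;
}
```

Adding a time to `available_t` can only turn a `false` result into `true`, which matches the monotonicity property that the documentation requires of IsComputable.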
int32 NumParameters ( ) const
virtual

Returns the total dimension of the parameters in this class.

Reimplemented from UpdatableComponent.

Definition at line 600 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, CuVectorBase< Real >::Dim(), TimeHeightConvolutionComponent::linear_params_, CuMatrixBase< Real >::NumCols(), and CuMatrixBase< Real >::NumRows().

Referenced by TimeHeightConvolutionComponent::Info(), TimeHeightConvolutionComponent::UnVectorize(), and TimeHeightConvolutionComponent::Vectorize().

600  {
601  return linear_params_.NumRows() * linear_params_.NumCols() +
602  bias_params_.Dim();
603 }
int32 OutputDim ( ) const
virtual

Returns output-dimension of this component.

Implements Component.

Definition at line 63 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::model_, and ConvolutionModel::OutputDim().

63  {
64  return model_.OutputDim();
65 }
void PerturbParams ( BaseFloat  stddev)
virtual

This function is to be used in testing.

It adds unit noise times "stddev" to the parameters of the component.

Implements UpdatableComponent.

Definition at line 581 of file nnet-convolutional-component.cc.

References CuMatrixBase< Real >::AddMat(), CuVectorBase< Real >::AddVec(), TimeHeightConvolutionComponent::bias_params_, CuVectorBase< Real >::Dim(), kaldi::kUndefined, TimeHeightConvolutionComponent::linear_params_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), CuVectorBase< Real >::SetRandn(), and CuMatrixBase< Real >::SetRandn().

581  {
582  CuMatrix<BaseFloat> temp_mat(linear_params_.NumRows(),
583  linear_params_.NumCols(), kUndefined);
584  temp_mat.SetRandn();
585  linear_params_.AddMat(stddev, temp_mat);
586  CuVector<BaseFloat> temp_vec(bias_params_.Dim(), kUndefined);
587  temp_vec.SetRandn();
588  bias_params_.AddVec(stddev, temp_vec);
589 }
ComponentPrecomputedIndexes * PrecomputeIndexes ( const MiscComputationInfo &  misc_info,
const std::vector< Index > &  input_indexes,
const std::vector< Index > &  output_indexes,
bool  need_backprop 
) const
virtual

This function must return NULL for simple Components.

Returns a pointer to a class that may contain some precomputed component-specific and computation-specific indexes to be used in the Propagate and Backprop functions.

Parameters
[in] misc_info: This argument is supplied to handle things that the framework can't very easily supply: information like which time indexes are needed for AggregateComponent, which time-indexes are available at the input of a recurrent network, and so on. misc_info may not even ever be used here. We will add members to misc_info as needed.
[in] input_indexes: A vector of indexes that explains what time-indexes (and other indexes) each row of the in/in_value/in_deriv matrices given to Propagate and Backprop will mean.
[in] output_indexes: A vector of indexes that explains what time-indexes (and other indexes) each row of the out/out_value/out_deriv matrices given to Propagate and Backprop will mean.
[in] need_backprop: True if we might need to do backprop with this component, so that if any different indexes are needed for backprop then those should be computed too.
Returns
Returns a child-class of class ComponentPrecomputedIndexes, or NULL if this component does not need to precompute any indexes (e.g. if it is a simple component and does not care about indexes).

Reimplemented from Component.

Definition at line 541 of file nnet-convolutional-component.cc.

References kaldi::nnet3::time_height_convolution::CompileConvolutionComputation(), TimeHeightConvolutionComponent::PrecomputedIndexes::computation, KALDI_ERR, TimeHeightConvolutionComponent::max_memory_mb_, and TimeHeightConvolutionComponent::model_.

545  {
546  using namespace time_height_convolution;
547  ConvolutionComputationOptions opts;
548  opts.max_memory_mb = max_memory_mb_;
549  PrecomputedIndexes *ans = new PrecomputedIndexes();
550  std::vector<Index> input_indexes_modified,
551  output_indexes_modified;
552  CompileConvolutionComputation(
553  model_, input_indexes, output_indexes, opts,
554  &(ans->computation), &input_indexes_modified, &output_indexes_modified);
555  if (input_indexes_modified != input_indexes ||
556  output_indexes_modified != output_indexes) {
557  KALDI_ERR << "Problem precomputing indexes";
558  }
559  return ans;
560 }
void * Propagate ( const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in: The input to this component. Num-columns == InputDim().
[out] out: The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 258 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, TimeHeightConvolutionComponent::PrecomputedIndexes::computation, kaldi::nnet3::time_height_convolution::ConvolveForward(), CuMatrixBase< Real >::CopyRowsFromVec(), CuMatrixBase< Real >::Data(), ConvolutionModel::height_out, KALDI_ASSERT, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::model_, ConvolutionModel::num_filters_out, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), and CuMatrixBase< Real >::Stride().

261  {
262  const PrecomputedIndexes *indexes =
263  dynamic_cast<const PrecomputedIndexes*>(indexes_in);
264  KALDI_ASSERT(indexes != NULL);
265  { // this block handles the bias term.
266  KALDI_ASSERT(out->Stride() == out->NumCols() &&
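267  out->NumCols() == model_.height_out * model_.num_filters_out);
268  CuSubMatrix<BaseFloat> out_reshaped(
269  out->Data(), out->NumRows() * model_.height_out,
270  model_.num_filters_out, model_.num_filters_out);
271  out_reshaped.CopyRowsFromVec(bias_params_);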
272  }
273  ConvolveForward(indexes->computation, in, linear_params_, out);
274  return NULL;
275 }
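The bias block in Propagate relies on the output being stored contiguously, so that a matrix with height_out * num_filters_out columns can be viewed as a taller matrix with num_filters_out columns. A minimal sketch of that reshaping on a plain array follows; `SetOutputToBias` is a hypothetical name, and the convolution itself, which the listing then adds through ConvolveForward, is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 'out' is row-major with num_rows rows and height_out * num_filters_out
// columns, stored without padding between rows (stride == num-cols).
void SetOutputToBias(std::size_t num_rows, std::size_t height_out,
                     const std::vector<float> &bias,  // one value per filter
                     std::vector<float> *out) {
  assert(out != nullptr && !bias.empty());
  const std::size_t num_filters_out = bias.size();
  out->resize(num_rows * height_out * num_filters_out);
  // Viewed as (num_rows * height_out) x num_filters_out, every row of the
  // reshaped matrix is a copy of the bias vector.
  for (std::size_t r = 0; r < num_rows * height_out; r++)
    for (std::size_t f = 0; f < num_filters_out; f++)
      (*out)[r * num_filters_out + f] = bias[f];
}
```

The same bias value is therefore shared by a given filter across all heights and all rows, which is consistent with bias_params_ having dimension num_filters_out.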
void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Implements Component.

Definition at line 429 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, TimeHeightConvolutionComponent::Check(), TimeHeightConvolutionComponent::ComputeDerived(), kaldi::nnet3::ExpectToken(), KALDI_ASSERT, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::max_memory_mb_, TimeHeightConvolutionComponent::model_, TimeHeightConvolutionComponent::num_minibatches_history_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), TimeHeightConvolutionComponent::preconditioner_in_, TimeHeightConvolutionComponent::preconditioner_out_, ConvolutionModel::Read(), CuVector< Real >::Read(), CuMatrix< Real >::Read(), kaldi::ReadBasicType(), UpdatableComponent::ReadUpdatableCommon(), OnlineNaturalGradient::SetAlpha(), OnlineNaturalGradient::SetNumSamplesHistory(), OnlineNaturalGradient::SetRank(), and TimeHeightConvolutionComponent::use_natural_gradient_.

429  {
430  std::string token = ReadUpdatableCommon(is, binary);
431  // the next few lines are only for back compatibility.
432  if (token != "") {
433  KALDI_ASSERT(token == "<Model>");
434  } else {
435  ExpectToken(is, binary, "<Model>");
436  }
437  model_.Read(is, binary);
438  ExpectToken(is, binary, "<LinearParams>");
439  linear_params_.Read(is, binary);
440  ExpectToken(is, binary, "<BiasParams>");
441  bias_params_.Read(is, binary);
442  ExpectToken(is, binary, "<MaxMemoryMb>");
443  ReadBasicType(is, binary, &max_memory_mb_);
444  ExpectToken(is, binary, "<UseNaturalGradient>");
445  ReadBasicType(is, binary, &use_natural_gradient_);
446  ExpectToken(is, binary, "<NumMinibatchesHistory>");
447  ReadBasicType(is, binary, &num_minibatches_history_);
448  int32 rank_in, rank_out;
449  BaseFloat alpha_in, alpha_out;
450  ExpectToken(is, binary, "<AlphaInOut>");
451  ReadBasicType(is, binary, &alpha_in);
452  ReadBasicType(is, binary, &alpha_out);
453  preconditioner_in_.SetAlpha(alpha_in);
454  preconditioner_out_.SetAlpha(alpha_out);
455  ExpectToken(is, binary, "<RankInOut>");
456  ReadBasicType(is, binary, &rank_in);
457  ReadBasicType(is, binary, &rank_out);
458  preconditioner_in_.SetRank(rank_in);
459  preconditioner_out_.SetRank(rank_out);
460  int32 dim_in = linear_params_.NumCols() + 1,
461  dim_out = linear_params_.NumRows();
462  // the following lines mirror similar lines in InitFromConfig().
463  // the swapping of in and out is intentional; see comment in InitFromConfig(),
464  // by similar lines.
465  preconditioner_in_.SetNumSamplesHistory(dim_out * num_minibatches_history_);
466  preconditioner_out_.SetNumSamplesHistory(dim_in * num_minibatches_history_);
467  ExpectToken(is, binary, "</TimeHeightConvolutionComponent>");
468  ComputeDerived();
469  Check();
470 }
void ReorderIndexes ( std::vector< Index > *  input_indexes,
std::vector< Index > *  output_indexes 
) const
virtual

This function only does something interesting for non-simple Components.

It provides an opportunity for a Component to reorder or pad the indexes at its input and output. This might be useful, for instance, if a component requires a particular ordering of the indexes that doesn't correspond to their natural ordering. Components that might modify the indexes are required to return the kReordersIndexes flag in their Properties(). The ReorderIndexes() function is now allowed to insert blanks into the indexes. The 'blanks' must be of the form (n,kNoTime,x), where the marker kNoTime (a very negative number) is there where the 't' indexes normally live. The reason we don't just have, say, (-1,-1,-1), relates to the need to preserve a regular pattern over the 'n' indexes so that 'shortcut compilation' (c.f. ExpandComputation()) can work correctly.

Parameters
[in,out] input_indexes: Indexes at the input of the Component.
[in,out] output_indexes: Indexes at the output of the Component.

Reimplemented from Component.

Definition at line 386 of file nnet-convolutional-component.cc.

References kaldi::nnet3::time_height_convolution::CompileConvolutionComputation(), TimeHeightConvolutionComponent::max_memory_mb_, and TimeHeightConvolutionComponent::model_.

388  {
389  using namespace time_height_convolution;
390  ConvolutionComputationOptions opts;
391  opts.max_memory_mb = max_memory_mb_;
392  ConvolutionComputation computation_temp;
393  std::vector<Index> input_indexes_modified,
394  output_indexes_modified;
395  CompileConvolutionComputation(
396  model_, *input_indexes, *output_indexes, opts,
397  &computation_temp, &input_indexes_modified, &output_indexes_modified);
398  input_indexes->swap(input_indexes_modified);
399  output_indexes->swap(output_indexes_modified);
400 }
void Scale ( BaseFloat  scale)
virtual

This virtual function scales the component by "scale".

When called on an UpdatableComponent, it scales the parameters by "scale". When called on a component that stores stats, like BatchNormComponent, it relates to scaling activation stats, not parameters.

Reimplemented from Component.

Definition at line 562 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, TimeHeightConvolutionComponent::linear_params_, CuVectorBase< Real >::Scale(), CuMatrixBase< Real >::Scale(), CuVectorBase< Real >::SetZero(), and CuMatrixBase< Real >::SetZero().

562  {
563  if (scale == 0.0) {
564  linear_params_.SetZero();
565  bias_params_.SetZero();
566  } else {
567  linear_params_.Scale(scale);
568  bias_params_.Scale(scale);
569  }
570 }
void ScaleLinearParams ( BaseFloat  alpha)
inline
virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 221 of file nnet-convolutional-component.h.

221 { return "TimeHeightConvolutionComponent"; }
void UnVectorize ( const VectorBase< BaseFloat > &  params)
virtual

Converts the parameters from vector form.

Reimplemented from UpdatableComponent.

Definition at line 614 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, CuVectorBase< Real >::CopyFromVec(), CuMatrixBase< Real >::CopyRowsFromVec(), VectorBase< Real >::Dim(), CuVectorBase< Real >::Dim(), KALDI_ASSERT, TimeHeightConvolutionComponent::linear_params_, CuMatrixBase< Real >::NumCols(), TimeHeightConvolutionComponent::NumParameters(), CuMatrixBase< Real >::NumRows(), and VectorBase< Real >::Range().

615  {
616  KALDI_ASSERT(params.Dim() == NumParameters());
617  int32 linear_size = linear_params_.NumRows() * linear_params_.NumCols(),
618  bias_size = bias_params_.Dim();
619  linear_params_.CopyRowsFromVec(params.Range(0, linear_size));
620  bias_params_.CopyFromVec(params.Range(linear_size, bias_size));
621 }
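The parameter-vector layout used by UnVectorize() and counted by NumParameters() can be sketched with standard containers. This is an illustration only: `ParamsSketch`, `NumParametersSketch` and `UnVectorizeSketch` are hypothetical names, and the sketch assumes the linear parameters are stored row by row at the front of the vector, followed by the bias.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ParamsSketch {
  std::size_t rows, cols;
  std::vector<float> linear;  // row-major, rows * cols values
  std::vector<float> bias;
};

// Total number of parameters: the linear matrix plus the bias vector.
std::size_t NumParametersSketch(const ParamsSketch &p) {
  return p.rows * p.cols + p.bias.size();
}

// Splits a flat vector into the linear part (first rows * cols values)
// and the bias part (the remaining values).
void UnVectorizeSketch(const std::vector<float> &params, ParamsSketch *p) {
  assert(params.size() == NumParametersSketch(*p));
  const std::size_t linear_size = p->rows * p->cols;
  p->linear.assign(params.begin(), params.begin() + linear_size);
  p->bias.assign(params.begin() + linear_size, params.end());
}
```

For a 2 x 3 linear matrix and a bias of dimension 2, the flat vector has 8 entries, of which the last two are the bias.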
void UpdateNaturalGradient ( const PrecomputedIndexes &  indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv 
)
private

Definition at line 329 of file nnet-convolutional-component.cc.

References CuMatrixBase< Real >::AddMat(), CuVectorBase< Real >::AddVec(), TimeHeightConvolutionComponent::bias_params_, TimeHeightConvolutionComponent::PrecomputedIndexes::computation, kaldi::nnet3::time_height_convolution::ConvolveBackwardParams(), CuMatrixBase< Real >::CopyColFromVec(), CuMatrixBase< Real >::Data(), CuVectorBase< Real >::Dim(), ConvolutionModel::height_out, KALDI_ASSERT, kaldi::kTrans, UpdatableComponent::learning_rate_, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::model_, ConvolutionModel::num_filters_out, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), OnlineNaturalGradient::PreconditionDirections(), TimeHeightConvolutionComponent::preconditioner_in_, TimeHeightConvolutionComponent::preconditioner_out_, CuMatrixBase< Real >::Row(), CuMatrixBase< Real >::RowRange(), and CuMatrixBase< Real >::Stride().

Referenced by TimeHeightConvolutionComponent::Backprop().

332  {
333 
334  CuVector<BaseFloat> bias_temp(bias_params_.Dim());
335 
336  { // this block computes 'bias_temp', the derivative w.r.t. the bias.
337  KALDI_ASSERT(out_deriv.Stride() == out_deriv.NumCols() &&
338  out_deriv.NumCols() ==
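339  model_.height_out * model_.num_filters_out);
340  CuSubMatrix<BaseFloat> out_deriv_reshaped(
341  out_deriv.Data(), out_deriv.NumRows() * model_.height_out,
342  model_.num_filters_out, model_.num_filters_out);
343  bias_temp.AddRowSumMat(1.0, out_deriv_reshaped);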
344  }
345 
346  CuMatrix<BaseFloat> params_temp(linear_params_.NumRows(),
347  linear_params_.NumCols() + 1);
348  params_temp.CopyColFromVec(bias_temp, linear_params_.NumCols());
349 
350 
351  CuSubMatrix<BaseFloat> linear_params_temp(
352  params_temp, 0, linear_params_.NumRows(),
353  0, linear_params_.NumCols());
354 
355  ConvolveBackwardParams(indexes.computation, in_value, out_deriv,
356  1.0, &linear_params_temp);
357 
358  // the precondition-directions code outputs a scalar that
359  // must be multiplied by its output (this saves one
360  // CUDA operation internally).
361  // We don't bother applying this scale before doing the other
362  // dimension of natural gradient, because although it's not
363  // invariant to scalar multiplication of the input if the
364  // scalars are different across iterations, the scalars
365  // will be pretty similar on different iterations
366  BaseFloat scale1, scale2;
367  preconditioner_in_.PreconditionDirections(&params_temp, NULL,
368  &scale1);
369 
370 
371  CuMatrix<BaseFloat> params_temp_transpose(params_temp, kTrans);
372  preconditioner_out_.PreconditionDirections(&params_temp_transpose,
373  NULL, &scale2);
374 
375 
376  linear_params_.AddMat(
377  learning_rate_ * scale1 * scale2,
378  params_temp_transpose.RowRange(0, linear_params_.NumCols()),
379  kTrans);
380 
381  bias_params_.AddVec(learning_rate_ * scale1 * scale2,
382  params_temp_transpose.Row(linear_params_.NumCols()));
383 }
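The data layout in UpdateNaturalGradient, where the bias derivative is appended as an extra column and the update is later read back from the transpose, can be followed in a small sketch. This is not the Kaldi implementation: `Mat` and `UpdateViaAugmentedMatrix` are hypothetical names, and both PreconditionDirections() calls are replaced by no-ops with a scale of 1.0, so the result reduces to a plain gradient step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal row-major matrix used only by this sketch.
struct Mat {
  std::size_t rows, cols;
  std::vector<float> data;
  float &at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
  float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

void UpdateViaAugmentedMatrix(const Mat &linear_deriv,
                              const std::vector<float> &bias_deriv,
                              float learning_rate,
                              Mat *linear, std::vector<float> *bias) {
  const std::size_t R = linear_deriv.rows, C = linear_deriv.cols;
  assert(linear->rows == R && linear->cols == C);
  assert(bias_deriv.size() == R && bias->size() == R);

  // params_temp = [linear_deriv | bias_deriv]: one extra column for the bias.
  Mat temp{R, C + 1, std::vector<float>(R * (C + 1), 0.0f)};
  for (std::size_t r = 0; r < R; r++) {
    for (std::size_t c = 0; c < C; c++) temp.at(r, c) = linear_deriv.at(r, c);
    temp.at(r, C) = bias_deriv[r];
  }
  const float scale1 = 1.0f;  // placeholder for preconditioner_in_'s scale

  // params_temp_transpose has C + 1 rows and R columns.
  Mat temp_t{C + 1, R, std::vector<float>((C + 1) * R, 0.0f)};
  for (std::size_t r = 0; r < R; r++)
    for (std::size_t c = 0; c <= C; c++) temp_t.at(c, r) = temp.at(r, c);
  const float scale2 = 1.0f;  // placeholder for preconditioner_out_'s scale

  const float step = learning_rate * scale1 * scale2;
  // The first C rows of the transpose hold the linear part; transpose it back.
  for (std::size_t r = 0; r < R; r++)
    for (std::size_t c = 0; c < C; c++)
      linear->at(r, c) += step * temp_t.at(c, r);
  // The last row of the transpose holds the bias derivative.
  for (std::size_t r = 0; r < R; r++) (*bias)[r] += step * temp_t.at(C, r);
}
```

Treating the bias as an extra column lets one pair of preconditioners handle the linear and bias parameters together, which is why dim_in is NumCols() + 1 in InitFromConfig() and Read().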
void UpdateSimple (const PrecomputedIndexes &indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
private

Definition at line 309 of file nnet-convolutional-component.cc.

References CuVectorBase< Real >::AddRowSumMat(), TimeHeightConvolutionComponent::bias_params_, TimeHeightConvolutionComponent::PrecomputedIndexes::computation, kaldi::nnet3::time_height_convolution::ConvolveBackwardParams(), CuMatrixBase< Real >::Data(), ConvolutionModel::height_out, KALDI_ASSERT, UpdatableComponent::learning_rate_, TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::model_, ConvolutionModel::num_filters_out, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), and CuMatrixBase< Real >::Stride().

Referenced by TimeHeightConvolutionComponent::Backprop().

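void TimeHeightConvolutionComponent::UpdateSimple(
    const PrecomputedIndexes &indexes,
    const CuMatrixBase<BaseFloat> &in_value,
    const CuMatrixBase<BaseFloat> &out_deriv) {

  { // this block handles the bias term.
    KALDI_ASSERT(out_deriv.Stride() == out_deriv.NumCols() &&
                 out_deriv.NumCols() ==
                 model_.height_out * model_.num_filters_out);
    CuSubMatrix<BaseFloat> out_deriv_reshaped(
        out_deriv.Data(), out_deriv.NumRows() * model_.height_out,
        model_.num_filters_out, model_.num_filters_out);
    bias_params_.AddRowSumMat(learning_rate_, out_deriv_reshaped);
  }

  ConvolveBackwardParams(indexes.computation, in_value, out_deriv,
                         learning_rate_, &linear_params_);
}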
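The bias block above relies on a reshape: a contiguous output-derivative matrix whose rows have height_out * num_filters_out columns is viewed as a matrix with num_filters_out columns, and its rows are summed into the bias. A minimal sketch of that idea on a flat row-major buffer follows. The function AddBiasDeriv and its std::vector arguments are hypothetical stand-ins for illustration, not Kaldi APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not Kaldi code.
// 'out_deriv' is a contiguous row-major buffer holding num_rows rows of
// (height_out * num_filters_out) columns, with the filter index varying
// fastest. Viewing it as (num_rows * height_out) x num_filters_out and
// summing over rows gives one accumulated derivative per filter.
void AddBiasDeriv(const std::vector<float> &out_deriv, std::size_t num_rows,
                  std::size_t height_out, std::size_t num_filters_out,
                  float alpha, std::vector<float> *bias) {
  assert(out_deriv.size() == num_rows * height_out * num_filters_out);
  assert(bias->size() == num_filters_out);
  const std::size_t reshaped_rows = num_rows * height_out;
  for (std::size_t r = 0; r < reshaped_rows; ++r)
    for (std::size_t f = 0; f < num_filters_out; ++f)
      (*bias)[f] += alpha * out_deriv[r * num_filters_out + f];
}
```

The reshape is only valid when the buffer has no padding between rows, which is what the Stride() == NumCols() assertion in the listing checks.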
virtual void Vectorize (VectorBase< BaseFloat > *params) const

Turns the parameters into vector form.

The vector form is kept on the CPU because, in the situations where this function is used, a GPU copy would tend to take too much GPU memory.

Reimplemented from UpdatableComponent.

Definition at line 605 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, VectorBase< Real >::Dim(), CuVectorBase< Real >::Dim(), KALDI_ASSERT, TimeHeightConvolutionComponent::linear_params_, CuMatrixBase< Real >::NumCols(), TimeHeightConvolutionComponent::NumParameters(), CuMatrixBase< Real >::NumRows(), and VectorBase< Real >::Range().

void TimeHeightConvolutionComponent::Vectorize(
    VectorBase<BaseFloat> *params) const {
  KALDI_ASSERT(params->Dim() == NumParameters());
  int32 linear_size = linear_params_.NumRows() * linear_params_.NumCols(),
      bias_size = bias_params_.Dim();
  params->Range(0, linear_size).CopyRowsFromMat(linear_params_);
  params->Range(linear_size, bias_size).CopyFromVec(bias_params_);
}
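The layout produced by Vectorize() is the rows of the linear parameters stacked end to end, followed by the bias vector. A minimal sketch of that layout with standard-library containers is shown below. VectorizeParams and its arguments are hypothetical names chosen for illustration and do not exist in Kaldi.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not Kaldi code.
// 'linear' is a row-major rows x cols matrix stored flat. The result holds
// the matrix rows one after another in [0, rows * cols), followed by the
// bias elements in [rows * cols, rows * cols + bias.size()).
std::vector<float> VectorizeParams(const std::vector<float> &linear,
                                   std::size_t rows, std::size_t cols,
                                   const std::vector<float> &bias) {
  assert(linear.size() == rows * cols);
  const std::size_t linear_size = rows * cols;
  std::vector<float> params(linear_size + bias.size());
  std::copy(linear.begin(), linear.end(), params.begin());
  std::copy(bias.begin(), bias.end(), params.begin() + linear_size);
  return params;
}
```

With this layout, the total length equals the value that NumParameters() is asserted to match in the listing: the number of linear parameters plus the bias dimension.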
virtual void Write (std::ostream &os, bool binary) const

Write component to stream.

Implements Component.

Definition at line 402 of file nnet-convolutional-component.cc.

References TimeHeightConvolutionComponent::bias_params_, OnlineNaturalGradient::GetAlpha(), OnlineNaturalGradient::GetRank(), TimeHeightConvolutionComponent::linear_params_, TimeHeightConvolutionComponent::max_memory_mb_, TimeHeightConvolutionComponent::model_, TimeHeightConvolutionComponent::num_minibatches_history_, TimeHeightConvolutionComponent::preconditioner_in_, TimeHeightConvolutionComponent::preconditioner_out_, TimeHeightConvolutionComponent::use_natural_gradient_, ConvolutionModel::Write(), CuVector< Real >::Write(), CuMatrixBase< Real >::Write(), kaldi::WriteBasicType(), kaldi::WriteToken(), and UpdatableComponent::WriteUpdatableCommon().

void TimeHeightConvolutionComponent::Write(std::ostream &os,
                                           bool binary) const {
  WriteUpdatableCommon(os, binary);  // Write opening tag and learning rate.
  WriteToken(os, binary, "<Model>");
  model_.Write(os, binary);
  WriteToken(os, binary, "<LinearParams>");
  linear_params_.Write(os, binary);
  WriteToken(os, binary, "<BiasParams>");
  bias_params_.Write(os, binary);
  WriteToken(os, binary, "<MaxMemoryMb>");
  WriteBasicType(os, binary, max_memory_mb_);
  WriteToken(os, binary, "<UseNaturalGradient>");
  WriteBasicType(os, binary, use_natural_gradient_);
  WriteToken(os, binary, "<NumMinibatchesHistory>");
  WriteBasicType(os, binary, num_minibatches_history_);
  int32 rank_in = preconditioner_in_.GetRank(),
      rank_out = preconditioner_out_.GetRank();
  BaseFloat alpha_in = preconditioner_in_.GetAlpha(),
      alpha_out = preconditioner_out_.GetAlpha();
  WriteToken(os, binary, "<AlphaInOut>");
  WriteBasicType(os, binary, alpha_in);
  WriteBasicType(os, binary, alpha_out);
  WriteToken(os, binary, "<RankInOut>");
  WriteBasicType(os, binary, rank_in);
  WriteBasicType(os, binary, rank_out);
  WriteToken(os, binary, "</TimeHeightConvolutionComponent>");
}

Member Data Documentation

std::vector<bool> time_offset_required_
private

The documentation for this class was generated from the following files: