BlockAffineComponent Class Reference

This class implements an affine transform using a block-diagonal matrix, i.e., one whose weight matrix is all zeros except for blocks on the diagonal. More...

#include <nnet-simple-component.h>


Public Member Functions

virtual int32 InputDim () const
 Returns input-dimension of this component. More...
 
virtual int32 OutputDim () const
 Returns output-dimension of this component. More...
 
virtual std::string Info () const
 Returns some text-form information about this component, for diagnostics. More...
 
virtual void InitFromConfig (ConfigLine *cfl)
 Initialize, from a ConfigLine object. More...
 
 BlockAffineComponent ()
 
virtual std::string Type () const
 Returns a string such as "SigmoidComponent", describing the type of the object. More...
 
virtual int32 Properties () const
 Return bitmask of the component's properties. More...
 
virtual void * Propagate (const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Propagate function. More...
 
virtual void Backprop (const std::string &debug_info, const ComponentPrecomputedIndexes *indexes, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &, const CuMatrixBase< BaseFloat > &out_deriv, void *memo, Component *to_update, CuMatrixBase< BaseFloat > *in_deriv) const
 Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update. More...
 
virtual void Read (std::istream &is, bool binary)
 Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed. More...
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual Component * Copy () const
 Copies component (deep copy). More...
 
virtual void Scale (BaseFloat scale)
 When called on an UpdatableComponent, this virtual function scales the parameters by "scale". More...
 
virtual void Add (BaseFloat alpha, const Component &other)
 When called on an UpdatableComponent, this virtual function adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void PerturbParams (BaseFloat stddev)
 This function is to be used in testing. More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Computes dot-product between parameters of two instances of a Component. More...
 
virtual int32 NumParameters () const
 The following new virtual function returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
 BlockAffineComponent (const BlockAffineComponent &other)
 
 BlockAffineComponent (const RepeatedAffineComponent &rac)
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
virtual void SetUnderlyingLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent; the value gets multiplied by learning_rate_factor_. More...
 
virtual void SetActualLearningRate (BaseFloat lrate)
 Sets the learning rate directly, bypassing learning_rate_factor_. More...
 
virtual void SetAsGradient ()
 Sets is_gradient_ to true and sets learning_rate_ to 1, ignoring learning_rate_factor_. More...
 
virtual BaseFloat LearningRateFactor ()
 
virtual void SetLearningRateFactor (BaseFloat lrate_factor)
 
void SetUpdatableConfigs (const UpdatableComponent &other)
 
virtual void FreezeNaturalGradient (bool freeze)
 Freezes/unfreezes NaturalGradient updates, if applicable (to be overridden by components that use Natural Gradient). More...
 
BaseFloat LearningRate () const
 Gets the learning rate to be used in gradient descent. More...
 
BaseFloat MaxChange () const
 Returns the per-component max-change value, which is interpreted as the maximum change (in l2 norm) in parameters that is allowed per minibatch for this component. More...
 
void SetMaxChange (BaseFloat max_change)
 
BaseFloat L2Regularization () const
 Returns the l2 regularization constant, which may be set in any updatable component (usually from the config file). More...
 
void SetL2Regularization (BaseFloat a)
 
- Public Member Functions inherited from Component
virtual void StoreStats (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, void *memo)
 This function may store stats on average activation values, and for some component types, the average value of the derivative of the nonlinearity. More...
 
virtual void ZeroStats ()
 Components that provide an implementation of StoreStats should also provide an implementation of ZeroStats(), to set those stats to zero. More...
 
virtual void GetInputIndexes (const MiscComputationInfo &misc_info, const Index &output_index, std::vector< Index > *desired_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual bool IsComputable (const MiscComputationInfo &misc_info, const Index &output_index, const IndexSet &input_index_set, std::vector< Index > *used_inputs) const
 This function only does something interesting for non-simple Components, and it exists to make it possible to manage optionally-required inputs. More...
 
virtual void ReorderIndexes (std::vector< Index > *input_indexes, std::vector< Index > *output_indexes) const
 This function only does something interesting for non-simple Components. More...
 
virtual ComponentPrecomputedIndexes * PrecomputeIndexes (const MiscComputationInfo &misc_info, const std::vector< Index > &input_indexes, const std::vector< Index > &output_indexes, bool need_backprop) const
 This function must return NULL for simple Components. More...
 
virtual void DeleteMemo (void *memo) const
 This virtual function only needs to be overwritten by Components that return a non-NULL memo from their Propagate() function. More...
 
virtual void ConsolidateMemory ()
 This virtual function relates to memory management, and avoiding fragmentation. More...
 
 Component ()
 
virtual ~Component ()
 

Protected Attributes

CuMatrix< BaseFloat > linear_params_
 
CuVector< BaseFloat > bias_params_
 
int32 num_blocks_
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (typically 0.0..0.01) More...
 
BaseFloat learning_rate_factor_
 learning rate factor (normally 1.0, but can be set to another value so that when you call SetUnderlyingLearningRate(), that value will be scaled by this factor). More...
 
BaseFloat l2_regularize_
 L2 regularization constant. More...
 
bool is_gradient_
 True if this component is to be treated as a gradient rather than as parameters. More...
 
BaseFloat max_change_
 configuration value for imposing max-change More...
 

Private Member Functions

void Init (int32 input_dim, int32 output_dim, int32 num_blocks, BaseFloat param_stddev, BaseFloat bias_mean, BaseFloat bias_stddev)
 
const BlockAffineComponent & operator= (const BlockAffineComponent &other)
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream (works out its type). Dies on error. More...
 
static Component * NewComponentOfType (const std::string &type)
 Returns a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from UpdatableComponent
void InitLearningRatesFromConfig (ConfigLine *cfl)
 
std::string ReadUpdatableCommon (std::istream &is, bool binary)
 
void WriteUpdatableCommon (std::ostream &is, bool binary) const
 

Detailed Description

This class implements an affine transform using a block-diagonal matrix, i.e., one whose weight matrix is all zeros except for blocks on the diagonal.

All these blocks have the same dimensions.
input-dim: number of columns of the block-diagonal matrix.
output-dim: number of rows of the block-diagonal matrix.
num-blocks: number of blocks on the diagonal of the matrix; num-blocks must divide both input-dim and output-dim.
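The storage layout can be illustrated with a small sketch. It assumes, as the Init() listing on this page suggests, that the linear parameters are kept compactly as an output-dim by (input-dim / num-blocks) matrix with the blocks stacked vertically. The helper name ExpandBlockDiagonal and the use of row-major std::vector<float> are assumptions for illustration only and are not part of Kaldi.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Kaldi function): expands the compact storage,
// output_dim rows by (input_dim / num_blocks) columns, into the full
// block-diagonal matrix of size output_dim x input_dim (row-major).
std::vector<float> ExpandBlockDiagonal(const std::vector<float> &compact,
                                       int input_dim, int output_dim,
                                       int num_blocks) {
  assert(num_blocks >= 1 && input_dim % num_blocks == 0 &&
         output_dim % num_blocks == 0);
  const int cols_per_block = input_dim / num_blocks;
  const int rows_per_block = output_dim / num_blocks;
  assert(static_cast<int>(compact.size()) == output_dim * cols_per_block);
  std::vector<float> dense(static_cast<std::size_t>(output_dim) * input_dim,
                           0.0f);
  for (int r = 0; r < output_dim; ++r) {
    const int block = r / rows_per_block;  // diagonal block that row r is in
    for (int c = 0; c < cols_per_block; ++c) {
      // Shift the compact columns right by the block's column offset.
      dense[r * input_dim + block * cols_per_block + c] =
          compact[r * cols_per_block + c];
    }
  }
  return dense;
}
```

For input-dim=4, output-dim=2 and num-blocks=2, the compact matrix [1 2; 3 4] expands to the rows [1 2 0 0] and [0 0 3 4].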

Definition at line 505 of file nnet-simple-component.h.

Constructor & Destructor Documentation

◆ BlockAffineComponent() [1/3]

◆ BlockAffineComponent() [2/3]

BlockAffineComponent ( const BlockAffineComponent & other)
explicit

Definition at line 1662 of file nnet-simple-component.cc.

1662  BlockAffineComponent::BlockAffineComponent(const BlockAffineComponent &other) :
1663  UpdatableComponent(other),
1664  linear_params_(other.linear_params_),
1665  bias_params_(other.bias_params_),
1666  num_blocks_(other.num_blocks_) {}

◆ BlockAffineComponent() [3/3]

BlockAffineComponent ( const RepeatedAffineComponent & rac)
explicit

Definition at line 1668 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, RepeatedAffineComponent::bias_params_, CuMatrixBase< Real >::CopyFromMat(), CuVectorBase< Real >::CopyFromVec(), BlockAffineComponent::linear_params_, RepeatedAffineComponent::linear_params_, and BlockAffineComponent::num_blocks_.

1668  BlockAffineComponent::BlockAffineComponent(const RepeatedAffineComponent &rac) :
1669  UpdatableComponent(rac),
1670  linear_params_(rac.num_repeats_ * rac.linear_params_.NumRows(),
1671  rac.linear_params_.NumCols(), kUndefined),
1672  bias_params_(rac.num_repeats_ * rac.linear_params_.NumRows(), kUndefined),
1673  num_blocks_(rac.num_repeats_) {
1674  // copy rac's linear_params_ and bias_params_ to this.
1675  int32 num_rows_in_block = rac.linear_params_.NumRows();
1676  for(int32 block_counter = 0; block_counter < num_blocks_; block_counter++) {
1677  int32 row_offset = block_counter * num_rows_in_block;
1678  CuSubMatrix<BaseFloat> block = this->linear_params_.RowRange(row_offset,
1679  num_rows_in_block);
1680  block.CopyFromMat(rac.linear_params_);
1681  CuSubVector<BaseFloat> block_bias = this->bias_params_.Range(row_offset,
1682  num_rows_in_block);
1683  block_bias.CopyFromVec(rac.bias_params_);
1684  }
1685 }
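The copy loop above stacks num_repeats_ copies of the repeated component's single block and bias. A minimal sketch of the same idea on plain row-major vectors follows; the names TiledParams and TileBlock are hypothetical and only stand in for the CuSubMatrix and CuSubVector copies in the listing.

```cpp
#include <cassert>
#include <vector>

// Hypothetical container for the tiled result (not a Kaldi type).
struct TiledParams {
  std::vector<float> linear;  // (num_repeats * rows) x cols, row-major
  std::vector<float> bias;    // num_repeats * rows entries
};

// Stacks num_repeats copies of one block (rows x cols, row-major) on top
// of each other and repeats the block's bias vector the same number of times.
TiledParams TileBlock(const std::vector<float> &block_linear,
                      const std::vector<float> &block_bias,
                      int rows, int cols, int num_repeats) {
  assert(static_cast<int>(block_linear.size()) == rows * cols);
  assert(static_cast<int>(block_bias.size()) == rows);
  TiledParams out;
  for (int b = 0; b < num_repeats; ++b) {
    out.linear.insert(out.linear.end(), block_linear.begin(),
                      block_linear.end());
    out.bias.insert(out.bias.end(), block_bias.begin(), block_bias.end());
  }
  return out;
}
```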

Member Function Documentation

◆ Add()

void Add ( BaseFloat  alpha,
const Component & other
)
virtual

The behavior of this virtual function depends on the component it is called on. For an UpdatableComponent, it adds the parameters of another updatable component, times some constant, to the current parameters. For a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it relates to adding stats. Otherwise it will normally do nothing.

Reimplemented from Component.

Definition at line 1873 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, KALDI_ASSERT, and BlockAffineComponent::linear_params_.

1873  void BlockAffineComponent::Add(BaseFloat alpha, const Component &other_in) {
1874  const BlockAffineComponent *other =
1875  dynamic_cast<const BlockAffineComponent *>(&other_in);
1876  KALDI_ASSERT(other != NULL);
1877  linear_params_.AddMat(alpha, other->linear_params_);
1878  bias_params_.AddVec(alpha, other->bias_params_);
1879 }

◆ Backprop()

void Backprop ( const std::string &  debug_info,
const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_value,
const CuMatrixBase< BaseFloat > &  out_deriv,
void *  memo,
Component *  to_update,
CuMatrixBase< BaseFloat > *  in_deriv 
) const
virtual

Backprop function; depending on which of the arguments 'to_update' and 'in_deriv' are non-NULL, this can compute input-data derivatives and/or perform model update.

Parameters
[in] debug_info: The component name, to be printed out in any warning messages.
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in_value: The matrix that was given as input to the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsInput == 0.
[in] out_value: The matrix that was output from the Propagate function. Will be ignored (and may be empty) if Properties()&kBackpropNeedsOutput == 0.
[in] out_deriv: The derivative at the output of this component.
[in] memo: This will normally be NULL, but for component types that set the flag kUsesMemo, this will be the return value of the Propagate() function that corresponds to this Backprop() function. Ownership of any pointers is not transferred to the Backprop function; DeleteMemo() will be called to delete it.
[out] to_update: If model update is desired, the Component to be updated, else NULL. Does not have to be identical to this. If supplied, you can assume that to_update->Properties() & kUpdatableComponent is nonzero.
[out] in_deriv: The derivative at the input of this component, if needed (else NULL). If Properties()&kBackpropInPlace, may be the same matrix as out_deriv. If Properties()&kBackpropAdds, this is added to by the Backprop routine, else it is set. The component code chooses which mode to work in, based on convenience.

Implements Component.

Definition at line 1776 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, CuMatrixBase< Real >::ColRange(), kaldi::DeletePointers(), kaldi::kNoTrans, kaldi::kTrans, UpdatableComponent::learning_rate_, BlockAffineComponent::linear_params_, BlockAffineComponent::num_blocks_, and NVTX_RANGE.

1783  {
1784  NVTX_RANGE("BlockAffineComponent::Backprop");
1785  BlockAffineComponent *to_update = dynamic_cast<BlockAffineComponent*>(to_update_in);
1786 
1787  const int32 num_rows_in_block = linear_params_.NumRows() / num_blocks_;
1788  const int32 num_cols_in_block = linear_params_.NumCols();
1789 
1790  // Propagate the derivative back to the input.
1791  // add with coefficient 1.0 since property kBackpropAdds is true.
1792  // If we wanted to add with coefficient 0.0 we'd need to zero the
1793  // in_deriv, in case of infinities.
1794  if (in_deriv) {
1795  std::vector<CuSubMatrix<BaseFloat> *> in_deriv_batch, out_deriv_batch, linear_params_batch;
1796 
1797  for(int block_counter = 0; block_counter < num_blocks_; block_counter++) {
1798  CuSubMatrix<BaseFloat> *in_deriv_block =
1799  new CuSubMatrix<BaseFloat>(in_deriv->ColRange(block_counter * num_cols_in_block,
1800  num_cols_in_block));
1801  in_deriv_batch.push_back(in_deriv_block);
1802 
1803  CuSubMatrix<BaseFloat> *out_deriv_block =
1804  new CuSubMatrix<BaseFloat>(out_deriv.ColRange(block_counter * num_rows_in_block,
1805  num_rows_in_block));
1806  out_deriv_batch.push_back(out_deriv_block);
1807 
1808  CuSubMatrix<BaseFloat> *linear_params_block =
1809  new CuSubMatrix<BaseFloat>(linear_params_.RowRange(block_counter * num_rows_in_block,
1810  num_rows_in_block));
1811  linear_params_batch.push_back(linear_params_block);
1812  }
1813 
1814  AddMatMatBatched<BaseFloat>(1.0, in_deriv_batch, out_deriv_batch, kNoTrans,
1815  linear_params_batch, kNoTrans, 1.0);
1816 
1817  DeletePointers(&in_deriv_batch);
1818  DeletePointers(&out_deriv_batch);
1819  DeletePointers(&linear_params_batch);
1820  }
1821 
1822  if (to_update != NULL) {
1823 
1824  { // linear params update
1825 
1826  std::vector<CuSubMatrix<BaseFloat> *> in_value_batch,
1827  out_deriv_batch, linear_params_batch;
1828 
1829  for (int block_counter = 0; block_counter < num_blocks_; block_counter++) {
1830  CuSubMatrix<BaseFloat> *in_value_block =
1831  new CuSubMatrix<BaseFloat>(in_value.ColRange(block_counter * num_cols_in_block,
1832  num_cols_in_block));
1833  in_value_batch.push_back(in_value_block);
1834 
1835  CuSubMatrix<BaseFloat> *out_deriv_block =
1836  new CuSubMatrix<BaseFloat>(out_deriv.ColRange(block_counter * num_rows_in_block,
1837  num_rows_in_block));
1838  out_deriv_batch.push_back(out_deriv_block);
1839 
1840  CuSubMatrix<BaseFloat> *linear_params_block =
1841  new CuSubMatrix<BaseFloat>(to_update->linear_params_.RowRange(block_counter * num_rows_in_block,
1842  num_rows_in_block));
1843  linear_params_batch.push_back(linear_params_block);
1844  }
1845 
1846  AddMatMatBatched<BaseFloat>(to_update->learning_rate_,
1847  linear_params_batch,
1848  out_deriv_batch, kTrans,
1849  in_value_batch, kNoTrans, 1.0);
1850 
1851  DeletePointers(&in_value_batch);
1852  DeletePointers(&out_deriv_batch);
1853  DeletePointers(&linear_params_batch);
1854  } // end linear params update
1855 
1856  { // bias update
1857  to_update->bias_params_.AddRowSumMat(to_update->learning_rate_,
1858  out_deriv, 1.0);
1859  } // end bias update
1860  }
1861 }
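The three steps of the listing (input derivative, linear-parameter update, bias update) can be written out for a single input row. This is a sketch under stated assumptions, not the Kaldi implementation: BlockAffineBackward is a hypothetical name, matrices are row-major std::vector<float>, all three outputs are required (the real function skips a step when in_deriv or to_update is NULL), and the batched GPU calls are replaced by plain loops.

```cpp
#include <cassert>
#include <vector>

// Hypothetical single-row sketch of the backward pass.
// linear is the compact parameter matrix: output_dim x (input_dim / num_blocks).
void BlockAffineBackward(const std::vector<float> &in_value,
                         const std::vector<float> &out_deriv,
                         int num_blocks, float learning_rate,
                         std::vector<float> *linear,     // updated in place
                         std::vector<float> *bias,       // updated in place
                         std::vector<float> *in_deriv) { // added to
  const int input_dim = static_cast<int>(in_value.size());
  const int output_dim = static_cast<int>(out_deriv.size());
  assert(num_blocks >= 1 && input_dim % num_blocks == 0 &&
         output_dim % num_blocks == 0);
  const int cols = input_dim / num_blocks;   // columns per block
  const int rows = output_dim / num_blocks;  // rows per block
  assert(static_cast<int>(linear->size()) == output_dim * cols);
  assert(static_cast<int>(bias->size()) == output_dim);
  assert(static_cast<int>(in_deriv->size()) == input_dim);

  // Step 1: in_deriv += out_deriv * W within each block (uses the old W).
  for (int r = 0; r < output_dim; ++r)
    for (int c = 0; c < cols; ++c)
      (*in_deriv)[(r / rows) * cols + c] +=
          out_deriv[r] * (*linear)[r * cols + c];

  // Step 2: W += learning_rate * out_deriv^T * in_value within each block.
  for (int r = 0; r < output_dim; ++r)
    for (int c = 0; c < cols; ++c)
      (*linear)[r * cols + c] +=
          learning_rate * out_deriv[r] * in_value[(r / rows) * cols + c];

  // Step 3: bias += learning_rate * out_deriv (a row sum when batched).
  for (int r = 0; r < output_dim; ++r)
    (*bias)[r] += learning_rate * out_deriv[r];
}
```

As in the listing, the update adds the scaled derivative to the parameters rather than subtracting it.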

◆ Copy()

Component * Copy ( ) const
virtual

Copies component (deep copy).

Implements Component.

Definition at line 1687 of file nnet-simple-component.cc.

References BlockAffineComponent::BlockAffineComponent().

1687  Component* BlockAffineComponent::Copy() const {
1688  BlockAffineComponent *ans = new BlockAffineComponent(*this);
1689  return ans;
1690 }

◆ DotProduct()

BaseFloat DotProduct ( const UpdatableComponent & other) const
virtual

Computes dot-product between parameters of two instances of a Component.

Can be used for computing parameter-norm of an UpdatableComponent.

Implements UpdatableComponent.

Definition at line 1891 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, kaldi::kTrans, BlockAffineComponent::linear_params_, kaldi::TraceMatMat(), and kaldi::VecVec().

1891  BaseFloat BlockAffineComponent::DotProduct(const UpdatableComponent &other_in) const {
1892  const BlockAffineComponent *other =
1893  dynamic_cast<const BlockAffineComponent*>(&other_in);
1894  return TraceMatMat(linear_params_, other->linear_params_, kTrans) +
1895  VecVec(bias_params_, other->bias_params_);
1896 }
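For two matrices of the same shape, the trace of A times B-transposed is the sum of their element-wise products, so the result above is an ordinary dot product over all parameters. The sketch below shows this on plain vectors; ParamDotProduct is a hypothetical helper, not a Kaldi function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sum of element-wise products of the linear parameters
// plus the same sum for the biases. Calling it with both arguments equal
// gives the squared parameter norm.
float ParamDotProduct(const std::vector<float> &linear_a,
                      const std::vector<float> &bias_a,
                      const std::vector<float> &linear_b,
                      const std::vector<float> &bias_b) {
  assert(linear_a.size() == linear_b.size());
  assert(bias_a.size() == bias_b.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < linear_a.size(); ++i)
    sum += linear_a[i] * linear_b[i];
  for (std::size_t i = 0; i < bias_a.size(); ++i)
    sum += bias_a[i] * bias_b[i];
  return sum;
}
```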

◆ Info()

std::string Info ( ) const
virtual

Returns some text-form information about this component, for diagnostics.

Starts with the type of the component. E.g. "SigmoidComponent dim=900", although most components will have much more info.

Reimplemented from UpdatableComponent.

Definition at line 1692 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, UpdatableComponent::Info(), BlockAffineComponent::linear_params_, BlockAffineComponent::num_blocks_, and kaldi::nnet3::PrintParameterStats().

1692  std::string BlockAffineComponent::Info() const {
1693  std::ostringstream stream;
1694  stream << UpdatableComponent::Info()
1695  << ", num-blocks=" << num_blocks_;
1696  PrintParameterStats(stream, "linear-params", linear_params_);
1697  PrintParameterStats(stream, "bias", bias_params_, true);
1698  return stream.str();
1699 }

◆ Init()

void Init ( int32  input_dim,
int32  output_dim,
int32  num_blocks,
BaseFloat  param_stddev,
BaseFloat  bias_mean,
BaseFloat  bias_stddev 
)
private

Definition at line 1701 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, KALDI_ASSERT, BlockAffineComponent::linear_params_, and BlockAffineComponent::num_blocks_.

Referenced by BlockAffineComponent::InitFromConfig().

1704  {
1705  KALDI_ASSERT(input_dim > 0 && output_dim > 0 && num_blocks >= 1);
1706  KALDI_ASSERT(output_dim % num_blocks == 0 && input_dim % num_blocks == 0);
1707  const int32 num_columns_per_block = input_dim / num_blocks;
1708  linear_params_.Resize(output_dim, num_columns_per_block);
1709  bias_params_.Resize(output_dim);
1710  KALDI_ASSERT(param_stddev >= 0.0 && bias_stddev >= 0.0);
1711  linear_params_.SetRandn();
1712  linear_params_.Scale(param_stddev);
1713  bias_params_.SetRandn();
1714  bias_params_.Scale(bias_stddev);
1715  bias_params_.Add(bias_mean);
1716  num_blocks_ = num_blocks;
1717 }

◆ InitFromConfig()

void InitFromConfig ( ConfigLine * cfl)
virtual

Initialize, from a ConfigLine object.

Parameters
[in] cfl: A ConfigLine containing any parameters that are needed for initialization. For example: "dim=100 param-stddev=0.1"

Implements Component.

Definition at line 1719 of file nnet-simple-component.cc.

References ConfigLine::GetValue(), ConfigLine::HasUnusedValues(), BlockAffineComponent::Init(), UpdatableComponent::InitLearningRatesFromConfig(), KALDI_ERR, BlockAffineComponent::Type(), and ConfigLine::WholeLine().

1719  void BlockAffineComponent::InitFromConfig(ConfigLine *cfl) {
1720  int32 input_dim = -1, output_dim = -1, num_blocks = -1;
1721  if(!cfl->GetValue("input-dim", &input_dim) ||
1722  !cfl->GetValue("output-dim", &output_dim) ||
1723  !cfl->GetValue("num-blocks", &num_blocks))
1724  KALDI_ERR << "Invalid initializer for layer of type "
1725  << Type() << ": \"" << cfl->WholeLine() << "\"";
1726  InitLearningRatesFromConfig(cfl);
1727  BaseFloat param_stddev = 1.0 / std::sqrt(input_dim / num_blocks),
1728  bias_mean = 0.0, bias_stddev = 1.0;
1729  cfl->GetValue("param-stddev", &param_stddev);
1730  cfl->GetValue("bias-stddev", &bias_stddev);
1731  cfl->GetValue("bias-mean", &bias_mean);
1732 
1733  if (cfl->HasUnusedValues())
1734  KALDI_ERR << "Invalid initializer for layer of type "
1735  << Type() << ": \"" << cfl->WholeLine() << "\"";
1736 
1737  Init(input_dim, output_dim, num_blocks,
1738  param_stddev, bias_mean, bias_stddev);
1739 }
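When "param-stddev" is not given, the listing defaults it to one over the square root of the number of input columns per block ("bias-mean" defaults to 0.0 and "bias-stddev" to 1.0). The helper below is a hypothetical sketch of that default; it checks divisibility up front, whereas in the listing that check only happens later in Init().

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not a Kaldi function): default standard deviation of
// the linear parameters, 1 / sqrt(input_dim / num_blocks).
float DefaultParamStddev(int input_dim, int num_blocks) {
  assert(input_dim > 0 && num_blocks >= 1 && input_dim % num_blocks == 0);
  const int cols_per_block = input_dim / num_blocks;  // exact, by the assert
  return static_cast<float>(
      1.0 / std::sqrt(static_cast<double>(cols_per_block)));
}
```

For example, input-dim=400 with num-blocks=4 gives 100 columns per block and a default of about 0.1.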

◆ InputDim()

virtual int32 InputDim ( ) const
inline virtual

Returns input-dimension of this component.

Implements Component.

Definition at line 507 of file nnet-simple-component.h.

◆ NumParameters()

int32 NumParameters ( ) const
virtual

The following new virtual function returns the total dimension of the parameters in this class.

Reimplemented from UpdatableComponent.

Definition at line 1926 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, and BlockAffineComponent::linear_params_.

Referenced by BlockAffineComponent::UnVectorize(), and BlockAffineComponent::Vectorize().

1926  int32 BlockAffineComponent::NumParameters() const {
1927  return linear_params_.NumCols() * linear_params_.NumRows() + bias_params_.Dim();
1928 }

◆ operator=()

const BlockAffineComponent& operator= ( const BlockAffineComponent & other)
private

◆ OutputDim()

virtual int32 OutputDim ( ) const
inline virtual

Returns output-dimension of this component.

Implements Component.

Definition at line 508 of file nnet-simple-component.h.

References Component::Info(), and PnormComponent::InitFromConfig().

508 { return linear_params_.NumRows(); }

◆ PerturbParams()

void PerturbParams ( BaseFloat  stddev)
virtual

This function is to be used in testing.

It adds unit noise times "stddev" to the parameters of the component.

Implements UpdatableComponent.

Definition at line 1881 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, BlockAffineComponent::linear_params_, CuVectorBase< Real >::SetRandn(), and CuMatrixBase< Real >::SetRandn().

1881  void BlockAffineComponent::PerturbParams(BaseFloat stddev) {
1882  CuMatrix<BaseFloat> temp_linear_params(linear_params_);
1883  temp_linear_params.SetRandn();
1884  linear_params_.AddMat(stddev, temp_linear_params);
1885 
1886  CuVector<BaseFloat> temp_bias_params(bias_params_);
1887  temp_bias_params.SetRandn();
1888  bias_params_.AddVec(stddev, temp_bias_params);
1889 }
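The perturbation amounts to adding independent zero-mean, unit-variance Gaussian noise, scaled by stddev, to every parameter. The sketch below uses the standard <random> facilities in place of Kaldi's SetRandn(); the function name AddGaussianNoise and the explicit generator argument are assumptions for illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper: params[i] += stddev * N(0, 1) for every parameter.
void AddGaussianNoise(std::vector<float> *params, float stddev,
                      std::mt19937 *rng) {
  assert(params != nullptr && rng != nullptr && stddev >= 0.0f);
  std::normal_distribution<float> unit_normal(0.0f, 1.0f);
  for (float &p : *params) p += stddev * unit_normal(*rng);
}
```

With stddev equal to zero the parameters are left unchanged, which is a convenient sanity check in tests.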

◆ Propagate()

void * Propagate ( const ComponentPrecomputedIndexes *  indexes,
const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
) const
virtual

Propagate function.

Parameters
[in] indexes: A pointer to some information output by this class's PrecomputeIndexes function (will be NULL for simple components, i.e. those that don't do things like splicing).
[in] in: The input to this component. Num-columns == InputDim().
[out] out: The output of this component. Num-columns == OutputDim(). Note: output of this component will be added to the initial value of "out" if Properties()&kPropagateAdds != 0; otherwise the output will be set and the initial value ignored. Each Component chooses whether it is more convenient implementation-wise to add or set, and the calling code has to deal with it.
Returns
Normally returns NULL, but may return a non-NULL value for components which have the flag kUsesMemo set. This value will be passed into the corresponding Backprop routine.

Implements Component.

Definition at line 1741 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, CuMatrixBase< Real >::ColRange(), CuMatrixBase< Real >::CopyRowsFromVec(), kaldi::DeletePointers(), kaldi::kNoTrans, kaldi::kTrans, BlockAffineComponent::linear_params_, and BlockAffineComponent::num_blocks_.

1743  {
1744  out->CopyRowsFromVec(bias_params_);
1745  // block_dimension is both the number of columns, and the number of rows,
1746  // of a block.
1747  int32 num_rows_in_block = linear_params_.NumRows() / num_blocks_;
1748  int32 num_cols_in_block = linear_params_.NumCols();
1749  std::vector<CuSubMatrix<BaseFloat> *> in_batch, out_batch,
1750  linear_params_batch;
1751  for(int block_counter = 0; block_counter < num_blocks_; block_counter++) {
1752  CuSubMatrix<BaseFloat> *in_block =
1753  new CuSubMatrix<BaseFloat>(in.ColRange(block_counter * num_cols_in_block,
1754  num_cols_in_block));
1755  in_batch.push_back(in_block);
1756 
1757  CuSubMatrix<BaseFloat> *out_block =
1758  new CuSubMatrix<BaseFloat>(out->ColRange(block_counter * num_rows_in_block,
1759  num_rows_in_block));
1760  out_batch.push_back(out_block);
1761 
1762  CuSubMatrix<BaseFloat> *linear_params_block =
1763  new CuSubMatrix<BaseFloat>(linear_params_.RowRange(block_counter * num_rows_in_block,
1764  num_rows_in_block));
1765  linear_params_batch.push_back(linear_params_block);
1766  }
1767  AddMatMatBatched<BaseFloat>(1.0, out_batch, in_batch, kNoTrans,
1768  linear_params_batch, kTrans, 1.0);
1769 
1770  DeletePointers(&in_batch);
1771  DeletePointers(&out_batch);
1772  DeletePointers(&linear_params_batch);
1773  return NULL;
1774 }
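The forward pass above copies the bias into every output row and then adds, for each block, the matching slice of the input times that block's transposed parameters. A single-row sketch with plain loops follows; BlockAffineForward is a hypothetical name, and the loops stand in for the batched matrix multiply used in the listing.

```cpp
#include <cassert>
#include <vector>

// Hypothetical single-row sketch of the forward pass.
// linear is the compact parameter matrix: output_dim x (input_dim / num_blocks),
// row-major; bias has output_dim entries.
std::vector<float> BlockAffineForward(const std::vector<float> &in,
                                      const std::vector<float> &linear,
                                      const std::vector<float> &bias,
                                      int num_blocks) {
  const int input_dim = static_cast<int>(in.size());
  const int output_dim = static_cast<int>(bias.size());
  assert(num_blocks >= 1 && input_dim % num_blocks == 0 &&
         output_dim % num_blocks == 0);
  const int cols_per_block = input_dim / num_blocks;
  const int rows_per_block = output_dim / num_blocks;
  assert(static_cast<int>(linear.size()) == output_dim * cols_per_block);
  std::vector<float> out(bias);  // start from the bias, as the listing does
  for (int r = 0; r < output_dim; ++r) {
    const int block = r / rows_per_block;  // block that output r belongs to
    for (int c = 0; c < cols_per_block; ++c) {
      // Output r only sees the input slice of its own block.
      out[r] += linear[r * cols_per_block + c] *
                in[block * cols_per_block + c];
    }
  }
  return out;
}
```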

◆ Properties()

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
virtual

Read function (used after we know the type of the Component); accepts input that is missing the token that describes the component type, in case it has already been consumed.

Implements Component.

Definition at line 1898 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, kaldi::nnet3::ExpectToken(), UpdatableComponent::is_gradient_, BlockAffineComponent::linear_params_, BlockAffineComponent::num_blocks_, kaldi::PeekToken(), kaldi::ReadBasicType(), and UpdatableComponent::ReadUpdatableCommon().

1898  void BlockAffineComponent::Read(std::istream &is, bool binary) {
1899  ReadUpdatableCommon(is, binary); // read opening tag and learning rate.
1900  ExpectToken(is, binary, "<NumBlocks>");
1901  ReadBasicType(is, binary, &num_blocks_);
1902  ExpectToken(is, binary, "<LinearParams>");
1903  linear_params_.Read(is, binary);
1904  ExpectToken(is, binary, "<BiasParams>");
1905  bias_params_.Read(is, binary);
1906  if (PeekToken(is, binary) == 'I') {
1907  // for back compatibility; we don't write this here any
1908  // more as it's written and read in Write/ReadUpdatableCommon
1909  ExpectToken(is, binary, "<IsGradient>");
1910  ReadBasicType(is, binary, &is_gradient_);
1911  }
1912  ExpectToken(is, binary, "</BlockAffineComponent>");
1913 }

◆ Scale()

void Scale ( BaseFloat  scale)
virtual

The behavior of this virtual function depends on the component it is called on. For an UpdatableComponent, it scales the parameters by "scale". For a NonlinearComponent (or another component that stores stats, like BatchNormComponent), it relates to scaling activation stats, not parameters. Otherwise it will normally do nothing.

Reimplemented from Component.

Definition at line 1863 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, and BlockAffineComponent::linear_params_.

1863  void BlockAffineComponent::Scale(BaseFloat scale) {
1864  if (scale == 0.0) {
1865  linear_params_.SetZero();
1866  bias_params_.SetZero();
1867  } else {
1868  linear_params_.Scale(scale);
1869  bias_params_.Scale(scale);
1870  }
1871 }
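The listing treats a scale of exactly zero as a special case and sets the parameters to zero instead of multiplying. The source does not state why; one plausible reason, assumed here, is that multiplying an infinite or NaN parameter by zero yields NaN, whereas assignment always yields a clean zero. ScaleParams below is a hypothetical helper that shows the difference.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical helper mirroring the zero special case.
void ScaleParams(std::vector<float> *params, float scale) {
  assert(params != nullptr);
  if (scale == 0.0f) {
    // Assign zeros: 0 * inf and 0 * NaN would both give NaN.
    for (float &p : *params) p = 0.0f;
  } else {
    for (float &p : *params) p *= scale;
  }
}
```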

◆ Type()

virtual std::string Type ( ) const
inline virtual

Returns a string such as "SigmoidComponent", describing the type of the object.

Implements Component.

Definition at line 514 of file nnet-simple-component.h.

Referenced by BlockAffineComponent::InitFromConfig().

514 { return "BlockAffineComponent"; }

◆ UnVectorize()

void UnVectorize ( const VectorBase< BaseFloat > &  params)
virtual

Converts the parameters from vector form.

Reimplemented from UpdatableComponent.

Definition at line 1938 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, VectorBase< Real >::Dim(), KALDI_ASSERT, BlockAffineComponent::linear_params_, BlockAffineComponent::NumParameters(), and VectorBase< Real >::Range().

1938  void BlockAffineComponent::UnVectorize(const VectorBase<BaseFloat> &params) {
1939  KALDI_ASSERT(params.Dim() == this->NumParameters());
1940  int32 num_linear_params = linear_params_.NumCols() * linear_params_.NumRows();
1941  int32 num_bias_params = bias_params_.Dim();
1942  linear_params_.CopyRowsFromVec(params.Range(0, num_linear_params));
1943  bias_params_.CopyFromVec(params.Range(num_linear_params, num_bias_params));
1944 }

◆ Vectorize()

void Vectorize ( VectorBase< BaseFloat > *  params) const
virtual

Turns the parameters into vector form.

We put the vector form on the CPU, because in the kinds of situations where we do this, we'll tend to use too much memory for the GPU.

Reimplemented from UpdatableComponent.

Definition at line 1930 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, VectorBase< Real >::Dim(), KALDI_ASSERT, BlockAffineComponent::linear_params_, BlockAffineComponent::NumParameters(), and VectorBase< Real >::Range().

1930  void BlockAffineComponent::Vectorize(VectorBase<BaseFloat> *params) const {
1931  KALDI_ASSERT(params->Dim() == this->NumParameters());
1932  int32 num_linear_params = linear_params_.NumCols() * linear_params_.NumRows();
1933  int32 num_bias_params = bias_params_.Dim();
1934  params->Range(0, num_linear_params).CopyRowsFromMat(linear_params_);
1935  params->Range(num_linear_params, num_bias_params).CopyFromVec(bias_params_);
1936 }
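Vectorize() and UnVectorize() use the same layout: the linear parameters row by row, followed by the bias, for NumParameters() entries in total. The round trip can be sketched on plain vectors as follows; VectorizeParams and UnVectorizeParams are hypothetical names, and the linear parameters are assumed to be stored row-major already.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: linear parameters first, then the bias.
std::vector<float> VectorizeParams(const std::vector<float> &linear,
                                   const std::vector<float> &bias) {
  std::vector<float> params(linear);
  params.insert(params.end(), bias.begin(), bias.end());
  return params;
}

// Hypothetical inverse: linear and bias must already have their final sizes,
// just as the component's matrices do before UnVectorize() is called.
void UnVectorizeParams(const std::vector<float> &params,
                       std::vector<float> *linear,
                       std::vector<float> *bias) {
  assert(params.size() == linear->size() + bias->size());
  const std::ptrdiff_t num_linear =
      static_cast<std::ptrdiff_t>(linear->size());
  std::copy(params.begin(), params.begin() + num_linear, linear->begin());
  std::copy(params.begin() + num_linear, params.end(), bias->begin());
}
```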

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Implements Component.

Definition at line 1915 of file nnet-simple-component.cc.

References BlockAffineComponent::bias_params_, BlockAffineComponent::linear_params_, BlockAffineComponent::num_blocks_, kaldi::WriteBasicType(), kaldi::WriteToken(), and UpdatableComponent::WriteUpdatableCommon().

1915  void BlockAffineComponent::Write(std::ostream &os, bool binary) const {
1916  WriteUpdatableCommon(os, binary); // Write opening tag and learning rate
1917  WriteToken(os, binary, "<NumBlocks>");
1918  WriteBasicType(os, binary, num_blocks_);
1919  WriteToken(os, binary, "<LinearParams>");
1920  linear_params_.Write(os, binary);
1921  WriteToken(os, binary, "<BiasParams>");
1922  bias_params_.Write(os, binary);
1923  WriteToken(os, binary, "</BlockAffineComponent>");
1924 }

Member Data Documentation

◆ bias_params_

◆ linear_params_

◆ num_blocks_


The documentation for this class was generated from the following files:
nnet-simple-component.h
nnet-simple-component.cc