AffineComponentPreconditionedOnline Class Reference

Keywords: natural gradient descent, NG-SGD, naturalgradient. More...

#include <nnet-component.h>


Public Member Functions

virtual std::string Type () const
 
virtual void Read (std::istream &is, bool binary)
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
void Init (BaseFloat learning_rate, int32 input_dim, int32 output_dim, BaseFloat param_stddev, BaseFloat bias_stddev, int32 rank_in, int32 rank_out, int32 update_period, BaseFloat num_samples_history, BaseFloat alpha, BaseFloat max_change_per_sample)
 
void Init (BaseFloat learning_rate, int32 rank_in, int32 rank_out, int32 update_period, BaseFloat num_samples_history, BaseFloat alpha, BaseFloat max_change_per_sample, std::string matrix_filename)
 
virtual void Resize (int32 input_dim, int32 output_dim)
 
 AffineComponentPreconditionedOnline (const AffineComponent &orig, int32 rank_in, int32 rank_out, int32 update_period, BaseFloat eta, BaseFloat alpha)
 
virtual void InitFromString (std::string args)
 Initialize, typically from a line of a config file. More...
 
virtual std::string Info () const
 
virtual Component * Copy () const
 Copy component (deep copy). More...
 
 AffineComponentPreconditionedOnline ()
 
- Public Member Functions inherited from AffineComponent
 AffineComponent (const AffineComponent &other)
 
 AffineComponent (const CuMatrixBase< BaseFloat > &linear_params, const CuVectorBase< BaseFloat > &bias_params, BaseFloat learning_rate)
 
virtual int32 InputDim () const
 Get size of input vectors. More...
 
virtual int32 OutputDim () const
 Get size of output vectors. More...
 
void Init (BaseFloat learning_rate, int32 input_dim, int32 output_dim, BaseFloat param_stddev, BaseFloat bias_stddev)
 
void Init (BaseFloat learning_rate, std::string matrix_filename)
 
Component * CollapseWithNext (const AffineComponent &next) const
 
Component * CollapseWithNext (const FixedAffineComponent &next) const
 
Component * CollapseWithNext (const FixedScaleComponent &next) const
 
Component * CollapseWithPrevious (const FixedAffineComponent &prev) const
 
 AffineComponent ()
 
virtual bool BackpropNeedsInput () const
 
virtual bool BackpropNeedsOutput () const
 
virtual void Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Perform forward pass propagation Input->Output. More...
 
virtual void Scale (BaseFloat scale)
 This new virtual function scales the parameters by this amount. More...
 
virtual void Add (BaseFloat alpha, const UpdatableComponent &other)
 This new virtual function adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void Backprop (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, Component *to_update, CuMatrix< BaseFloat > *in_deriv) const
 Perform backward pass propagation of the derivative, and also either update the model (if to_update == this) or update another model or compute the model derivative (otherwise). More...
 
virtual void SetZero (bool treat_as_gradient)
 Set parameters to zero, and if treat_as_gradient is true, we'll be treating this as a gradient so set the learning rate to 1 and make any other changes necessary (there's a variable we have to set for the MixtureProbComponent). More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Here, "other" is a component of the same specific type. More...
 
virtual void PerturbParams (BaseFloat stddev)
 We introduce a new virtual function that only applies to class UpdatableComponent. More...
 
virtual void SetParams (const VectorBase< BaseFloat > &bias, const MatrixBase< BaseFloat > &linear)
 
const CuVector< BaseFloat > & BiasParams ()
 
const CuMatrix< BaseFloat > & LinearParams ()
 
virtual int32 GetParameterDim () const
 The following new virtual function returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
virtual void LimitRank (int32 dimension, AffineComponent **a, AffineComponent **b) const
 This function is for getting a low-rank approximation of this AffineComponent by two AffineComponents. More...
 
void Widen (int32 new_dimension, BaseFloat param_stddev, BaseFloat bias_stddev, std::vector< NonlinearComponent *> c2, AffineComponent *c3)
 This function is implemented in widen-nnet.cc. More...
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
void Init (BaseFloat learning_rate)
 
 UpdatableComponent (BaseFloat learning_rate)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
void SetLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent. More...
 
BaseFloat LearningRate () const
 Gets the learning rate of gradient descent. More...
 
- Public Member Functions inherited from Component
 Component ()
 
virtual int32 Index () const
 Returns the index in the sequence of layers in the neural net; intended only to be used in debugging information. More...
 
virtual void SetIndex (int32 index)
 
virtual std::vector< int32 > Context () const
 Return a vector describing the temporal context this component requires for each frame of output, as a sorted list. More...
 
void Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out) const
 A non-virtual propagate function that first resizes output if necessary. More...
 
virtual ~Component ()
 

Private Member Functions

 KALDI_DISALLOW_COPY_AND_ASSIGN (AffineComponentPreconditionedOnline)
 
BaseFloat GetScalingFactor (const CuVectorBase< BaseFloat > &in_products, BaseFloat gamma_prod, CuVectorBase< BaseFloat > *out_products)
 This function is only called if max_change_per_sample_ > 0. It returns a scaling factor alpha <= 1.0 (1.0 in the normal case) that enforces the "max-change" constraint. More...
 
void SetPreconditionerConfigs ()
 
virtual void Update (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 

Private Attributes

int32 rank_in_
 
int32 rank_out_
 
int32 update_period_
 
BaseFloat num_samples_history_
 
BaseFloat alpha_
 
OnlinePreconditioner preconditioner_in_
 
OnlinePreconditioner preconditioner_out_
 
BaseFloat max_change_per_sample_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream. More...
 
static Component * NewFromString (const std::string &initializer_line)
 Initialize the Component from one line that will contain first the type, e.g. More...
 
static Component * NewComponentOfType (const std::string &type)
 Return a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from AffineComponent
virtual void UpdateSimple (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
const AffineComponent & operator= (const AffineComponent &other)
 
- Protected Attributes inherited from AffineComponent
CuMatrix< BaseFloat > linear_params_
 
CuVector< BaseFloat > bias_params_
 
bool is_gradient_
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (0.0..0.01) More...
 

Detailed Description

Keywords: natural gradient descent, NG-SGD, naturalgradient.

For the top level of the natural gradient code, look at this class and at nnet-precondition-online.h. AffineComponentPreconditionedOnline is, like AffineComponentPreconditioned, a version of AffineComponent whose learning-rate matrix is not a multiple of the unit matrix. See nnet-precondition-online.h for a description of the technique.

Definition at line 997 of file nnet-component.h.

Constructor & Destructor Documentation

◆ AffineComponentPreconditionedOnline() [1/2]

AffineComponentPreconditionedOnline ( const AffineComponent &  orig,
int32  rank_in,
int32  rank_out,
int32  update_period,
BaseFloat  eta,
BaseFloat  alpha 
)

Definition at line 1728 of file nnet-component.cc.

References AffineComponentPreconditionedOnline::alpha_, AffineComponent::bias_params_, AffineComponent::is_gradient_, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, AffineComponentPreconditionedOnline::num_samples_history_, AffineComponentPreconditionedOnline::rank_in_, AffineComponentPreconditionedOnline::rank_out_, AffineComponentPreconditionedOnline::SetPreconditionerConfigs(), and AffineComponentPreconditionedOnline::update_period_.

1731  :
1732  max_change_per_sample_(0.1) {
1733  this->linear_params_ = orig.linear_params_;
1734  this->bias_params_ = orig.bias_params_;
1735  this->learning_rate_ = orig.learning_rate_;
1736  this->is_gradient_ = orig.is_gradient_;
1737  this->rank_in_ = rank_in;
1738  this->rank_out_ = rank_out;
1739  this->update_period_ = update_period;
1740  this->num_samples_history_ = num_samples_history;
1741  this->alpha_ = alpha;
1742  SetPreconditionerConfigs();
1743 }

◆ AffineComponentPreconditionedOnline() [2/2]

Member Function Documentation

◆ Copy()

Component * Copy ( ) const
virtual

Copy component (deep copy).

Reimplemented from AffineComponent.

Definition at line 1821 of file nnet-component.cc.

References AffineComponentPreconditionedOnline::AffineComponentPreconditionedOnline(), AffineComponentPreconditionedOnline::alpha_, AffineComponent::bias_params_, AffineComponent::is_gradient_, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, AffineComponentPreconditionedOnline::max_change_per_sample_, AffineComponentPreconditionedOnline::num_samples_history_, AffineComponentPreconditionedOnline::preconditioner_in_, AffineComponentPreconditionedOnline::preconditioner_out_, AffineComponentPreconditionedOnline::rank_in_, AffineComponentPreconditionedOnline::rank_out_, AffineComponentPreconditionedOnline::SetPreconditionerConfigs(), and AffineComponentPreconditionedOnline::update_period_.

1821  {
1822  AffineComponentPreconditionedOnline *ans = new AffineComponentPreconditionedOnline();
1823  ans->learning_rate_ = learning_rate_;
1824  ans->rank_in_ = rank_in_;
1825  ans->rank_out_ = rank_out_;
1826  ans->update_period_ = update_period_;
1827  ans->num_samples_history_ = num_samples_history_;
1828  ans->alpha_ = alpha_;
1829  ans->linear_params_ = linear_params_;
1830  ans->bias_params_ = bias_params_;
1831  ans->preconditioner_in_ = preconditioner_in_;
1832  ans->preconditioner_out_ = preconditioner_out_;
1833  ans->max_change_per_sample_ = max_change_per_sample_;
1834  ans->is_gradient_ = is_gradient_;
1835  ans->SetPreconditionerConfigs();
1836  return ans;
1837 }

◆ GetScalingFactor()

BaseFloat GetScalingFactor ( const CuVectorBase< BaseFloat > &  in_products,
BaseFloat  gamma_prod,
CuVectorBase< BaseFloat > *  out_products 
)
private

This function is only called if max_change_per_sample_ > 0. It returns a scaling factor alpha <= 1.0 (1.0 in the normal case) that enforces the "max-change" constraint.

"in_products" is the inner product with itself of each row of the matrix of preconditioned input features; "out_products" is the same for the output derivatives. gamma_prod is a product of two scalars that are output by the preconditioning code (for the input and output), which we will need to multiply into the learning rate. out_products is a pointer because we modify it in-place.

Definition at line 1841 of file nnet-component.cc.

References CuVectorBase< Real >::ApplyPow(), CuVectorBase< Real >::Dim(), Component::Index(), KALDI_ASSERT, KALDI_LOG, UpdatableComponent::learning_rate_, AffineComponentPreconditionedOnline::max_change_per_sample_, CuVectorBase< Real >::MulElements(), and CuVectorBase< Real >::Sum().

Referenced by AffineComponentPreconditionedOnline::Update().

1844  {
1845  static int scaling_factor_printed = 0;
1846  int32 minibatch_size = in_products.Dim();
1847 
1848  out_products->MulElements(in_products);
1849  out_products->ApplyPow(0.5);
1850  BaseFloat prod_sum = out_products->Sum();
1851  BaseFloat tot_change_norm = learning_rate_scale * learning_rate_ * prod_sum,
1852  max_change_norm = max_change_per_sample_ * minibatch_size;
1853  // tot_change_norm is the product of norms that we are trying to limit
1854  // to max_value_.
1855  KALDI_ASSERT(tot_change_norm - tot_change_norm == 0.0 && "NaN in backprop");
1856  KALDI_ASSERT(tot_change_norm >= 0.0);
1857  if (tot_change_norm <= max_change_norm) return 1.0;
1858  else {
1859  BaseFloat factor = max_change_norm / tot_change_norm;
1860  if (scaling_factor_printed < 10) {
1861  KALDI_LOG << "Limiting step size using scaling factor "
1862  << factor << ", for component index " << Index();
1863  scaling_factor_printed++;
1864  }
1865  return factor;
1866  }
1867 }
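
The max-change rule in the listing above can be tried out in isolation. The following is a minimal sketch, assuming plain std::vector in place of CuVectorBase; the function name MaxChangeScalingFactor and its parameter list are illustrative and not part of Kaldi. Unlike the member function, it takes out_products by value instead of modifying it in place.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical standalone version of the max-change rule (not Kaldi API).
// in_products[i] and out_products[i] are the squared norms of row i of the
// preconditioned input and of the preconditioned output derivative.
float MaxChangeScalingFactor(const std::vector<float> &in_products,
                             std::vector<float> out_products,
                             float learning_rate_scale,
                             float learning_rate,
                             float max_change_per_sample) {
  assert(in_products.size() == out_products.size());
  const std::size_t minibatch_size = in_products.size();
  float prod_sum = 0.0f;
  for (std::size_t i = 0; i < minibatch_size; ++i) {
    // Element-wise product followed by a square root, as MulElements()
    // and ApplyPow(0.5) do in the listing: this gives |in_i| * |out_i|.
    out_products[i] = std::sqrt(out_products[i] * in_products[i]);
    prod_sum += out_products[i];
  }
  const float tot_change_norm = learning_rate_scale * learning_rate * prod_sum;
  const float max_change_norm =
      max_change_per_sample * static_cast<float>(minibatch_size);
  assert(tot_change_norm >= 0.0f);  // also fails on NaN
  if (tot_change_norm <= max_change_norm) return 1.0f;
  return max_change_norm / tot_change_norm;
}
```

With two samples whose squared norms are 4 and 9, a learning rate of 0.5 and max_change_per_sample of 0.1, the total change norm is 6 against a limit of 0.2, so the step is scaled by 0.2 / 6. With a learning rate of 0.01 the total is 0.12, which is under the limit, and the factor is 1.0.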

◆ Info()

std::string Info ( ) const
virtual

Reimplemented from AffineComponent.

Definition at line 1798 of file nnet-component.cc.

References AffineComponentPreconditionedOnline::alpha_, AffineComponent::bias_params_, AffineComponent::InputDim(), kaldi::kTrans, UpdatableComponent::LearningRate(), AffineComponent::linear_params_, AffineComponentPreconditionedOnline::max_change_per_sample_, AffineComponentPreconditionedOnline::num_samples_history_, AffineComponent::OutputDim(), AffineComponentPreconditionedOnline::rank_in_, AffineComponentPreconditionedOnline::rank_out_, kaldi::TraceMatMat(), AffineComponentPreconditionedOnline::Type(), AffineComponentPreconditionedOnline::update_period_, and kaldi::VecVec().

1798  {
1799  std::stringstream stream;
1800  BaseFloat linear_params_size = static_cast<BaseFloat>(linear_params_.NumRows())
1801  * static_cast<BaseFloat>(linear_params_.NumCols());
1802  BaseFloat linear_stddev =
1803  std::sqrt(TraceMatMat(linear_params_, linear_params_, kTrans) /
1804  linear_params_size),
1805  bias_stddev = std::sqrt(VecVec(bias_params_, bias_params_) /
1806  bias_params_.Dim());
1807  stream << Type() << ", input-dim=" << InputDim()
1808  << ", output-dim=" << OutputDim()
1809  << ", linear-params-stddev=" << linear_stddev
1810  << ", bias-params-stddev=" << bias_stddev
1811  << ", learning-rate=" << LearningRate()
1812  << ", rank-in=" << rank_in_
1813  << ", rank-out=" << rank_out_
1814  << ", num_samples_history=" << num_samples_history_
1815  << ", update_period=" << update_period_
1816  << ", alpha=" << alpha_
1817  << ", max-change-per-sample=" << max_change_per_sample_;
1818  return stream.str();
1819 }
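
The "stddev" values reported by Info() are root-mean-square values about zero: the sum of squared parameters (which is what TraceMatMat of a matrix with its own transpose, or VecVec of a vector with itself, computes) divided by the element count, then square-rooted. A minimal sketch of that computation, assuming nested std::vector in place of CuMatrix; the Matrix alias and the name ParamRms are illustrative only:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // illustrative stand-in

// Root-mean-square of all entries of m. No mean is subtracted, which
// matches the formula used for linear-params-stddev in the listing.
float ParamRms(const Matrix &m) {
  assert(!m.empty() && !m[0].empty());
  float sum_sq = 0.0f;
  std::size_t count = 0;
  for (const std::vector<float> &row : m) {
    for (float v : row) {
      sum_sq += v * v;
      ++count;
    }
  }
  return std::sqrt(sum_sq / static_cast<float>(count));
}
```

For the 2 x 2 matrix with rows (3, 4) and (0, 0), the sum of squares is 25 over 4 elements, so the reported value would be 2.5.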

◆ Init() [1/2]

void Init ( BaseFloat  learning_rate,
int32  input_dim,
int32  output_dim,
BaseFloat  param_stddev,
BaseFloat  bias_stddev,
int32  rank_in,
int32  rank_out,
int32  update_period,
BaseFloat  num_samples_history,
BaseFloat  alpha,
BaseFloat  max_change_per_sample 
)

Definition at line 1745 of file nnet-component.cc.

References AffineComponentPreconditionedOnline::alpha_, AffineComponent::bias_params_, UpdatableComponent::Init(), KALDI_ASSERT, AffineComponent::linear_params_, AffineComponentPreconditionedOnline::max_change_per_sample_, AffineComponentPreconditionedOnline::num_samples_history_, AffineComponentPreconditionedOnline::rank_in_, AffineComponentPreconditionedOnline::rank_out_, AffineComponentPreconditionedOnline::SetPreconditionerConfigs(), and AffineComponentPreconditionedOnline::update_period_.

Referenced by SpliceComponent::InitFromString(), SpliceMaxComponent::InitFromString(), BlockAffineComponent::InitFromString(), BlockAffineComponentPreconditioned::InitFromString(), SumGroupComponent::InitFromString(), PermuteComponent::InitFromString(), DctComponent::InitFromString(), FixedLinearComponent::InitFromString(), FixedAffineComponent::InitFromString(), FixedScaleComponent::InitFromString(), FixedBiasComponent::InitFromString(), DropoutComponent::InitFromString(), AdditiveNoiseComponent::InitFromString(), SumGroupComponent::Read(), DctComponent::Read(), and kaldi::nnet2::UnitTestAffineComponentPreconditionedOnline().

1751  {
1752  UpdatableComponent::Init(learning_rate);
1753  linear_params_.Resize(output_dim, input_dim);
1754  bias_params_.Resize(output_dim);
1755  KALDI_ASSERT(output_dim > 0 && input_dim > 0 && param_stddev >= 0.0 &&
1756  bias_stddev >= 0.0);
1757  linear_params_.SetRandn(); // sets to random normally distributed noise.
1758  linear_params_.Scale(param_stddev);
1759  bias_params_.SetRandn();
1760  bias_params_.Scale(bias_stddev);
1761  rank_in_ = rank_in;
1762  rank_out_ = rank_out;
1763  update_period_ = update_period;
1764  num_samples_history_ = num_samples_history;
1765  alpha_ = alpha;
1766  SetPreconditionerConfigs();
1767  KALDI_ASSERT(max_change_per_sample >= 0.0);
1768  max_change_per_sample_ = max_change_per_sample;
1769 }

◆ Init() [2/2]

void Init ( BaseFloat  learning_rate,
int32  rank_in,
int32  rank_out,
int32  update_period,
BaseFloat  num_samples_history,
BaseFloat  alpha,
BaseFloat  max_change_per_sample,
std::string  matrix_filename 
)

Definition at line 1704 of file nnet-component.cc.

References AffineComponent::bias_params_, UpdatableComponent::Init(), KALDI_ASSERT, AffineComponent::linear_params_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), CuMatrixBase< Real >::Range(), and kaldi::ReadKaldiObject().

1708  {
1709  UpdatableComponent::Init(learning_rate);
1710  rank_in_ = rank_in;
1711  rank_out_ = rank_out;
1712  update_period_ = update_period;
1713  num_samples_history_ = num_samples_history;
1714  alpha_ = alpha;
1715  SetPreconditionerConfigs();
1716  KALDI_ASSERT(max_change_per_sample >= 0.0);
1717  max_change_per_sample_ = max_change_per_sample;
1718  CuMatrix<BaseFloat> mat;
1719  ReadKaldiObject(matrix_filename, &mat); // will abort on failure.
1720  KALDI_ASSERT(mat.NumCols() >= 2);
1721  int32 input_dim = mat.NumCols() - 1, output_dim = mat.NumRows();
1722  linear_params_.Resize(output_dim, input_dim);
1723  bias_params_.Resize(output_dim);
1724  linear_params_.CopyFromMat(mat.Range(0, output_dim, 0, input_dim));
1725  bias_params_.CopyColFromMat(mat, input_dim);
1726 }
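
The matrix read by this overload holds the linear parameters in its first NumCols() - 1 columns and the bias in its last column. A minimal sketch of that split, assuming nested std::vector in place of CuMatrix and CuVector; the names AffineParams and SplitAffineMatrix are illustrative and not part of Kaldi:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // illustrative stand-in

struct AffineParams {
  Matrix linear;            // output_dim x input_dim
  std::vector<float> bias;  // output_dim
};

// Splits [W | b] into W and b, following the Range() and CopyColFromMat()
// calls in the listing above.
AffineParams SplitAffineMatrix(const Matrix &mat) {
  assert(!mat.empty() && mat[0].size() >= 2);
  const std::size_t output_dim = mat.size();
  const std::size_t input_dim = mat[0].size() - 1;
  AffineParams p;
  p.linear.assign(output_dim, std::vector<float>(input_dim));
  p.bias.resize(output_dim);
  for (std::size_t r = 0; r < output_dim; ++r) {
    assert(mat[r].size() == input_dim + 1);  // all rows have equal length
    for (std::size_t c = 0; c < input_dim; ++c) p.linear[r][c] = mat[r][c];
    p.bias[r] = mat[r][input_dim];  // last column is the bias
  }
  return p;
}
```

A 2 x 3 input therefore yields a component with input dimension 2 and output dimension 2.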

◆ InitFromString()

void InitFromString ( std::string  args)
virtual

Initialize, typically from a line of a config file.

The "args" will contain any parameters that need to be passed to the Component, e.g. dimensions.

Reimplemented from AffineComponent.

Definition at line 1648 of file nnet-component.cc.

References AffineComponent::Init(), AffineComponent::InputDim(), KALDI_ASSERT, KALDI_ERR, UpdatableComponent::learning_rate_, AffineComponent::OutputDim(), and kaldi::nnet2::ParseFromString().

Referenced by kaldi::nnet2::UnitTestAffineComponentPreconditionedOnline().

1648  {
1649  std::string orig_args(args);
1650  bool ok = true;
1651  std::string matrix_filename;
1652  BaseFloat learning_rate = learning_rate_;
1653  BaseFloat num_samples_history = 2000.0, alpha = 4.0,
1654  max_change_per_sample = 0.1;
1655  int32 input_dim = -1, output_dim = -1, rank_in = 30, rank_out = 80,
1656  update_period = 1;
1657  ParseFromString("learning-rate", &args, &learning_rate); // optional.
1658  ParseFromString("num-samples-history", &args, &num_samples_history);
1659  ParseFromString("alpha", &args, &alpha);
1660  ParseFromString("max-change-per-sample", &args, &max_change_per_sample);
1661  ParseFromString("rank-in", &args, &rank_in);
1662  ParseFromString("rank-out", &args, &rank_out);
1663  ParseFromString("update-period", &args, &update_period);
1664 
1665  if (ParseFromString("matrix", &args, &matrix_filename)) {
1666  Init(learning_rate, rank_in, rank_out, update_period,
1667  num_samples_history, alpha, max_change_per_sample,
1668  matrix_filename);
1669  if (ParseFromString("input-dim", &args, &input_dim))
1670  KALDI_ASSERT(input_dim == InputDim() &&
1671  "input-dim mismatch vs. matrix.");
1672  if (ParseFromString("output-dim", &args, &output_dim))
1673  KALDI_ASSERT(output_dim == OutputDim() &&
1674  "output-dim mismatch vs. matrix.");
1675  } else {
1676  ok = ok && ParseFromString("input-dim", &args, &input_dim);
1677  ok = ok && ParseFromString("output-dim", &args, &output_dim);
1678  BaseFloat param_stddev = 1.0 / std::sqrt(input_dim),
1679  bias_stddev = 1.0;
1680  ParseFromString("param-stddev", &args, &param_stddev);
1681  ParseFromString("bias-stddev", &args, &bias_stddev);
1682  Init(learning_rate, input_dim, output_dim, param_stddev,
1683  bias_stddev, rank_in, rank_out, update_period,
1684  num_samples_history, alpha, max_change_per_sample);
1685  }
1686  if (!args.empty())
1687  KALDI_ERR << "Could not process these elements in initializer: "
1688  << args;
1689  if (!ok)
1690  KALDI_ERR << "Bad initializer " << orig_args;
1691 }
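
The listing relies on each ParseFromString call consuming the option it recognizes, so that anything left in args at the end can be reported as an error, while options that are absent keep their defaults (rank-in=30, rank-out=80, and so on). The sketch below imitates that behavior for integer options only. ParseIntOption is a simplified, hypothetical stand-in written for illustration; it is not the Kaldi function and makes no claim about how ParseFromString is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: looks for a "name=value" token in *args, stores the
// integer value in *param and removes the token from *args. Returns false
// and leaves *param untouched if the option is absent.
bool ParseIntOption(const std::string &name, std::string *args, int *param) {
  const std::string prefix = name + "=";
  std::istringstream iss(*args);
  std::vector<std::string> kept;
  std::string tok;
  bool found = false;
  while (iss >> tok) {
    if (!found && tok.compare(0, prefix.size(), prefix) == 0) {
      *param = std::stoi(tok.substr(prefix.size()));
      found = true;
    } else {
      kept.push_back(tok);
    }
  }
  // Rebuild the argument string from the tokens that were not consumed.
  std::string rest;
  for (std::size_t i = 0; i < kept.size(); ++i) {
    if (i > 0) rest += " ";
    rest += kept[i];
  }
  *args = rest;
  return found;
}
```

Given the made-up argument string "input-dim=40 rank-in=20 output-dim=100", parsing input-dim and rank-in leaves "output-dim=100" behind, and a lookup of rank-out fails and keeps its default of 80.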

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( AffineComponentPreconditionedOnline  )
private

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
virtual

Reimplemented from AffineComponent.

Definition at line 1608 of file nnet-component.cc.

References AffineComponent::bias_params_, kaldi::nnet2::ExpectOneOrTwoTokens(), kaldi::ExpectToken(), KALDI_ASSERT, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, kaldi::ReadBasicType(), kaldi::ReadToken(), and AffineComponent::Type().

1608  {
1609  std::ostringstream ostr_beg, ostr_end;
1610  ostr_beg << "<" << Type() << ">";
1611  ostr_end << "</" << Type() << ">";
1612  // might not see the "<AffineComponentPreconditionedOnline>" part because
1613  // of how ReadNew() works.
1614  ExpectOneOrTwoTokens(is, binary, ostr_beg.str(), "<LearningRate>");
1615  ReadBasicType(is, binary, &learning_rate_);
1616  ExpectToken(is, binary, "<LinearParams>");
1617  linear_params_.Read(is, binary);
1618  ExpectToken(is, binary, "<BiasParams>");
1619  bias_params_.Read(is, binary);
1620  std::string tok;
1621  ReadToken(is, binary, &tok);
1622  if (tok == "<Rank>") { // back-compatibility (temporary)
1623  ReadBasicType(is, binary, &rank_in_);
1624  rank_out_ = rank_in_;
1625  } else {
1626  KALDI_ASSERT(tok == "<RankIn>");
1627  ReadBasicType(is, binary, &rank_in_);
1628  ExpectToken(is, binary, "<RankOut>");
1629  ReadBasicType(is, binary, &rank_out_);
1630  }
1631  ReadToken(is, binary, &tok);
1632  if (tok == "<UpdatePeriod>") {
1633  ReadBasicType(is, binary, &update_period_);
1634  ExpectToken(is, binary, "<NumSamplesHistory>");
1635  } else {
1636  update_period_ = 1;
1637  KALDI_ASSERT(tok == "<NumSamplesHistory>");
1638  }
1639  ReadBasicType(is, binary, &num_samples_history_);
1640  ExpectToken(is, binary, "<Alpha>");
1641  ReadBasicType(is, binary, &alpha_);
1642  ExpectToken(is, binary, "<MaxChangePerSample>");
1643  ReadBasicType(is, binary, &max_change_per_sample_);
1644  ExpectToken(is, binary, ostr_end.str());
1646 }
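
Read() accepts two older layouts: a single <Rank> token that sets both ranks, and a stream without <UpdatePeriod>, in which case the update period defaults to 1. The following sketch reproduces only that branching for a whitespace-separated text stream. PreconditionerHeader and ReadHeader are illustrative names, and plain operator>> stands in for ReadToken, ExpectToken and ReadBasicType.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Illustrative subset of the fields handled by Read().
struct PreconditionerHeader {
  int rank_in;
  int rank_out;
  int update_period;
  float num_samples_history;
};

PreconditionerHeader ReadHeader(std::istream &is) {
  PreconditionerHeader h{};
  std::string tok;
  is >> tok;
  if (tok == "<Rank>") {  // old format: one rank for both sides
    is >> h.rank_in;
    h.rank_out = h.rank_in;
  } else {
    assert(tok == "<RankIn>");
    is >> h.rank_in;
    is >> tok;
    assert(tok == "<RankOut>");
    is >> h.rank_out;
  }
  is >> tok;
  if (tok == "<UpdatePeriod>") {
    is >> h.update_period;
    is >> tok;  // the next token must be <NumSamplesHistory>
  } else {
    h.update_period = 1;  // default when the token is absent
  }
  assert(tok == "<NumSamplesHistory>");
  is >> h.num_samples_history;
  return h;
}
```

An old-style stream such as "<Rank> 20 <NumSamplesHistory> 2000" yields rank_in = rank_out = 20 and update_period = 1, while a stream with <RankIn>, <RankOut> and <UpdatePeriod> sets each field explicitly.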

◆ Resize()

void Resize ( int32  input_dim,
int32  output_dim 
)
virtual

Reimplemented from AffineComponent.

Definition at line 1594 of file nnet-component.cc.

References AffineComponent::bias_params_, KALDI_ASSERT, and AffineComponent::linear_params_.

1595  {
1596  KALDI_ASSERT(input_dim > 1 && output_dim > 1);
1597  if (rank_in_ >= input_dim) rank_in_ = input_dim - 1;
1598  if (rank_out_ >= output_dim) rank_out_ = output_dim - 1;
1599  bias_params_.Resize(output_dim);
1600  linear_params_.Resize(output_dim, input_dim);
1601  OnlinePreconditioner temp;
1602  preconditioner_in_ = temp;
1603  preconditioner_out_ = temp;
1605 }

◆ SetPreconditionerConfigs()

void SetPreconditionerConfigs ( )
private

Definition at line 1693 of file nnet-component.cc.

Referenced by AffineComponentPreconditionedOnline::AffineComponentPreconditionedOnline(), AffineComponentPreconditionedOnline::Copy(), and AffineComponentPreconditionedOnline::Init().

1693  {
1702 }
void SetNumSamplesHistory(BaseFloat num_samples_history)

◆ Type()

◆ Update()

void Update ( const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv 
)
private virtual

Reimplemented from AffineComponent.

Definition at line 1869 of file nnet-component.cc.

References AffineComponent::bias_params_, CuVectorBase< Real >::CopyColFromMat(), AffineComponentPreconditionedOnline::GetScalingFactor(), kaldi::kNoTrans, kaldi::kTrans, kaldi::kUndefined, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, AffineComponentPreconditionedOnline::max_change_per_sample_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), OnlinePreconditioner::PreconditionDirections(), AffineComponentPreconditionedOnline::preconditioner_in_, AffineComponentPreconditionedOnline::preconditioner_out_, CuMatrixBase< Real >::Range(), and CuMatrix< Real >::Resize().

1871  {
1872  CuMatrix<BaseFloat> in_value_temp;
1873 
1874  in_value_temp.Resize(in_value.NumRows(),
1875  in_value.NumCols() + 1, kUndefined);
1876  in_value_temp.Range(0, in_value.NumRows(),
1877  0, in_value.NumCols()).CopyFromMat(in_value);
1878 
1879  // Add the 1.0 at the end of each row "in_value_temp"
1880  in_value_temp.Range(0, in_value.NumRows(),
1881  in_value.NumCols(), 1).Set(1.0);
1882 
1883  CuMatrix<BaseFloat> out_deriv_temp(out_deriv);
1884 
1885  CuMatrix<BaseFloat> row_products(2,
1886  in_value.NumRows());
1887  CuSubVector<BaseFloat> in_row_products(row_products, 0),
1888  out_row_products(row_products, 1);
1889 
1890  // These "scale" values will get multiplied into the learning rate (faster
1891  // than having the matrices scaled inside the preconditioning code).
1892  BaseFloat in_scale, out_scale;
1893 
1894  preconditioner_in_.PreconditionDirections(&in_value_temp, &in_row_products,
1895  &in_scale);
1896  preconditioner_out_.PreconditionDirections(&out_deriv_temp, &out_row_products,
1897  &out_scale);
1898 
1899  // "scale" is a scaling factor coming from the PreconditionDirections calls
1900  // (it's faster to have them output a scaling factor than to have them scale
1901  // their outputs).
1902  BaseFloat scale = in_scale * out_scale;
1903  BaseFloat minibatch_scale = 1.0;
1904 
1905  if (max_change_per_sample_ > 0.0)
1906  minibatch_scale = GetScalingFactor(in_row_products, scale,
1907  &out_row_products);
1908 
1909  CuSubMatrix<BaseFloat> in_value_precon_part(in_value_temp,
1910  0, in_value_temp.NumRows(),
1911  0, in_value_temp.NumCols() - 1);
1912  // this "precon_ones" is what happens to the vector of 1's representing
1913  // offsets, after multiplication by the preconditioner.
1914  CuVector<BaseFloat> precon_ones(in_value_temp.NumRows());
1915 
1916  precon_ones.CopyColFromMat(in_value_temp, in_value_temp.NumCols() - 1);
1917 
1918  BaseFloat local_lrate = scale * minibatch_scale * learning_rate_;
1919  bias_params_.AddMatVec(local_lrate, out_deriv_temp, kTrans,
1920  precon_ones, 1.0);
1921  linear_params_.AddMatMat(local_lrate, out_deriv_temp, kTrans,
1922  in_value_precon_part, kNoTrans, 1.0);
1923 }
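
The column of 1.0 values appended to in_value lets the bias be treated as one more input weight, so the bias and linear parameters pass through the same input preconditioner. To show the shape of the final two calls, the sketch below performs the same accumulation with the preconditioning step left out entirely, so the appended column stays at exactly 1.0. It assumes nested std::vector in place of the CUDA matrix types, and PlainAffineUpdate is an illustrative name rather than Kaldi API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;  // illustrative stand-in

// linear_params += lrate * out_deriv^T * in_value
// bias_params   += lrate * out_deriv^T * ones
// Rows of in_value and out_deriv are the samples of one minibatch.
void PlainAffineUpdate(const Matrix &in_value, const Matrix &out_deriv,
                       float local_lrate, Matrix *linear_params,
                       std::vector<float> *bias_params) {
  assert(in_value.size() == out_deriv.size());
  for (std::size_t t = 0; t < in_value.size(); ++t) {
    assert(out_deriv[t].size() == bias_params->size());
    for (std::size_t o = 0; o < out_deriv[t].size(); ++o) {
      const float d = local_lrate * out_deriv[t][o];
      assert(in_value[t].size() == (*linear_params)[o].size());
      for (std::size_t i = 0; i < in_value[t].size(); ++i)
        (*linear_params)[o][i] += d * in_value[t][i];
      (*bias_params)[o] += d * 1.0f;  // the appended column of ones
    }
  }
}
```

In the real Update(), the rows of both matrices are first transformed by PreconditionDirections(), and local_lrate additionally includes the two returned scale factors and the max-change factor.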

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Reimplemented from AffineComponent.

Definition at line 1772 of file nnet-component.cc.

References AffineComponentPreconditionedOnline::alpha_, AffineComponent::bias_params_, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, AffineComponentPreconditionedOnline::max_change_per_sample_, AffineComponentPreconditionedOnline::num_samples_history_, AffineComponentPreconditionedOnline::rank_in_, AffineComponentPreconditionedOnline::rank_out_, AffineComponentPreconditionedOnline::Type(), AffineComponentPreconditionedOnline::update_period_, kaldi::WriteBasicType(), and kaldi::WriteToken().

1772  {
1773  std::ostringstream ostr_beg, ostr_end;
1774  ostr_beg << "<" << Type() << ">"; // e.g. "<AffineComponent>"
1775  ostr_end << "</" << Type() << ">"; // e.g. "</AffineComponent>"
1776  WriteToken(os, binary, ostr_beg.str());
1777  WriteToken(os, binary, "<LearningRate>");
1778  WriteBasicType(os, binary, learning_rate_);
1779  WriteToken(os, binary, "<LinearParams>");
1780  linear_params_.Write(os, binary);
1781  WriteToken(os, binary, "<BiasParams>");
1782  bias_params_.Write(os, binary);
1783  WriteToken(os, binary, "<RankIn>");
1784  WriteBasicType(os, binary, rank_in_);
1785  WriteToken(os, binary, "<RankOut>");
1786  WriteBasicType(os, binary, rank_out_);
1787  WriteToken(os, binary, "<UpdatePeriod>");
1788  WriteBasicType(os, binary, update_period_);
1789  WriteToken(os, binary, "<NumSamplesHistory>");
1790  WriteBasicType(os, binary, num_samples_history_);
1791  WriteToken(os, binary, "<Alpha>");
1792  WriteBasicType(os, binary, alpha_);
1793  WriteToken(os, binary, "<MaxChangePerSample>");
1794  WriteBasicType(os, binary, max_change_per_sample_);
1795  WriteToken(os, binary, ostr_end.str());
1796 }

Member Data Documentation

◆ alpha_

◆ max_change_per_sample_

◆ num_samples_history_

◆ preconditioner_in_

◆ preconditioner_out_

◆ rank_in_

◆ rank_out_

◆ update_period_


The documentation for this class was generated from the following files: