AffineComponentPreconditioned Class Reference

#include <nnet-component.h>


Public Member Functions

virtual std::string Type () const
 
virtual void Read (std::istream &is, bool binary)
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
void Init (BaseFloat learning_rate, int32 input_dim, int32 output_dim, BaseFloat param_stddev, BaseFloat bias_stddev, BaseFloat alpha, BaseFloat max_change)
 
void Init (BaseFloat learning_rate, BaseFloat alpha, BaseFloat max_change, std::string matrix_filename)
 
virtual void InitFromString (std::string args)
 Initialize, typically from a line of a config file. More...
 
virtual std::string Info () const
 
virtual Component * Copy () const
 Copy component (deep copy). More...
 
 AffineComponentPreconditioned ()
 
void SetMaxChange (BaseFloat max_change)
 
- Public Member Functions inherited from AffineComponent
 AffineComponent (const AffineComponent &other)
 
 AffineComponent (const CuMatrixBase< BaseFloat > &linear_params, const CuVectorBase< BaseFloat > &bias_params, BaseFloat learning_rate)
 
virtual int32 InputDim () const
 Get size of input vectors. More...
 
virtual int32 OutputDim () const
 Get size of output vectors. More...
 
void Init (BaseFloat learning_rate, int32 input_dim, int32 output_dim, BaseFloat param_stddev, BaseFloat bias_stddev)
 
void Init (BaseFloat learning_rate, std::string matrix_filename)
 
virtual void Resize (int32 input_dim, int32 output_dim)
 
Component * CollapseWithNext (const AffineComponent &next) const
 
Component * CollapseWithNext (const FixedAffineComponent &next) const
 
Component * CollapseWithNext (const FixedScaleComponent &next) const
 
Component * CollapseWithPrevious (const FixedAffineComponent &prev) const
 
 AffineComponent ()
 
virtual bool BackpropNeedsInput () const
 
virtual bool BackpropNeedsOutput () const
 
virtual void Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Perform forward pass propagation Input->Output. More...
 
virtual void Scale (BaseFloat scale)
 This new virtual function scales the parameters by this amount. More...
 
virtual void Add (BaseFloat alpha, const UpdatableComponent &other)
 This new virtual function adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
virtual void Backprop (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, Component *to_update, CuMatrix< BaseFloat > *in_deriv) const
 Perform backward pass propagation of the derivative, and also either update the model (if to_update == this) or update another model or compute the model derivative (otherwise). More...
 
virtual void SetZero (bool treat_as_gradient)
 Set parameters to zero, and if treat_as_gradient is true, we'll be treating this as a gradient so set the learning rate to 1 and make any other changes necessary (there's a variable we have to set for the MixtureProbComponent). More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Here, "other" is a component of the same specific type. More...
 
virtual void PerturbParams (BaseFloat stddev)
 We introduce a new virtual function that only applies to class UpdatableComponent. More...
 
virtual void SetParams (const VectorBase< BaseFloat > &bias, const MatrixBase< BaseFloat > &linear)
 
const CuVector< BaseFloat > & BiasParams ()
 
const CuMatrix< BaseFloat > & LinearParams ()
 
virtual int32 GetParameterDim () const
 The following new virtual function returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
virtual void LimitRank (int32 dimension, AffineComponent **a, AffineComponent **b) const
 This function is for getting a low-rank approximation of this AffineComponent by two AffineComponents. More...
 
void Widen (int32 new_dimension, BaseFloat param_stddev, BaseFloat bias_stddev, std::vector< NonlinearComponent * > c2, AffineComponent *c3)
 This function is implemented in widen-nnet.cc. More...
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
void Init (BaseFloat learning_rate)
 
 UpdatableComponent (BaseFloat learning_rate)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
void SetLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent. More...
 
BaseFloat LearningRate () const
 Gets the learning rate of gradient descent. More...
 
- Public Member Functions inherited from Component
 Component ()
 
virtual int32 Index () const
 Returns the index in the sequence of layers in the neural net; intended only to be used in debugging information. More...
 
virtual void SetIndex (int32 index)
 
virtual std::vector< int32 > Context () const
 Return a vector describing the temporal context this component requires for each frame of output, as a sorted list. More...
 
void Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out) const
 A non-virtual propagate function that first resizes output if necessary. More...
 
virtual ~Component ()
 

Protected Member Functions

 KALDI_DISALLOW_COPY_AND_ASSIGN (AffineComponentPreconditioned)
 
BaseFloat GetScalingFactor (const CuMatrix< BaseFloat > &in_value_precon, const CuMatrix< BaseFloat > &out_deriv_precon)
 The following function is only called if max_change_ > 0. More...
 
virtual void Update (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
- Protected Member Functions inherited from AffineComponent
virtual void UpdateSimple (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
const AffineComponent & operator= (const AffineComponent &other)
 

Protected Attributes

BaseFloat alpha_
 
BaseFloat max_change_
 
- Protected Attributes inherited from AffineComponent
CuMatrix< BaseFloat > linear_params_
 
CuVector< BaseFloat > bias_params_
 
bool is_gradient_
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (0.0..0.01) More...
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream. More...
 
static Component * NewFromString (const std::string &initializer_line)
 Initialize the Component from one line that will contain first the type, e.g. More...
 
static Component * NewComponentOfType (const std::string &type)
 Return a new Component of the given type e.g. More...
 

Detailed Description

Definition at line 948 of file nnet-component.h.

Constructor & Destructor Documentation

Member Function Documentation

Component * Copy ( ) const
virtual

Copy component (deep copy).

Reimplemented from AffineComponent.

Definition at line 1500 of file nnet-component.cc.

References AffineComponentPreconditioned::AffineComponentPreconditioned(), AffineComponentPreconditioned::alpha_, AffineComponent::bias_params_, AffineComponent::is_gradient_, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, and AffineComponentPreconditioned::max_change_.

Component* AffineComponentPreconditioned::Copy() const {
  AffineComponentPreconditioned *ans = new AffineComponentPreconditioned();
  ans->learning_rate_ = learning_rate_;
  ans->linear_params_ = linear_params_;
  ans->bias_params_ = bias_params_;
  ans->alpha_ = alpha_;
  ans->max_change_ = max_change_;
  ans->is_gradient_ = is_gradient_;
  return ans;
}
BaseFloat GetScalingFactor ( const CuMatrix< BaseFloat > &  in_value_precon,
const CuMatrix< BaseFloat > &  out_deriv_precon 
)
protected

The following function is only called if max_change_ > 0.

It returns the greatest value alpha <= 1.0 such that alpha * learning_rate_ * S <= max_change_, where S is the sum, over the row index i shared by the two matrices, of the product of the l2 norms of row i of in_value_precon and row i of out_deriv_precon.

Definition at line 1512 of file nnet-component.cc.

References CuVectorBase< Real >::AddDiagMat2(), Component::Index(), KALDI_ASSERT, KALDI_LOG, kaldi::kNoTrans, UpdatableComponent::learning_rate_, AffineComponentPreconditioned::max_change_, CuMatrixBase< Real >::NumRows(), and kaldi::VecVec().

Referenced by AffineComponentPreconditioned::Update().

BaseFloat AffineComponentPreconditioned::GetScalingFactor(
    const CuMatrix<BaseFloat> &in_value_precon,
    const CuMatrix<BaseFloat> &out_deriv_precon) {
  static int scaling_factor_printed = 0;

  KALDI_ASSERT(in_value_precon.NumRows() == out_deriv_precon.NumRows());
  CuVector<BaseFloat> in_norm(in_value_precon.NumRows()),
      out_deriv_norm(in_value_precon.NumRows());
  in_norm.AddDiagMat2(1.0, in_value_precon, kNoTrans, 0.0);
  out_deriv_norm.AddDiagMat2(1.0, out_deriv_precon, kNoTrans, 0.0);
  // Get the actual l2 norms, not the squared l2 norm.
  in_norm.ApplyPow(0.5);
  out_deriv_norm.ApplyPow(0.5);
  BaseFloat sum = learning_rate_ * VecVec(in_norm, out_deriv_norm);
  // sum is the product of norms that we are trying to limit
  // to max_change_.
  KALDI_ASSERT(sum == sum && sum - sum == 0.0 &&
               "NaN in backprop");
  KALDI_ASSERT(sum >= 0.0);
  if (sum <= max_change_) return 1.0;
  else {
    BaseFloat ans = max_change_ / sum;
    if (scaling_factor_printed < 10) {
      KALDI_LOG << "Limiting step size to " << max_change_
                << " using scaling factor " << ans << ", for component index "
                << Index();
      scaling_factor_printed++;
    }
    return ans;
  }
}
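
The clipping rule in GetScalingFactor can be tried outside Kaldi with a small standalone sketch. It assumes plain `std::vector` rows in place of `CuMatrix<BaseFloat>`; the names `Matrix`, `RowNorm` and `ScalingFactorSketch` are hypothetical and are not part of the Kaldi API. Logging and the NaN checks are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major matrix as a vector of rows; a stand-in for CuMatrix<BaseFloat>.
using Matrix = std::vector<std::vector<float>>;

// l2 norm of one row.
static float RowNorm(const std::vector<float> &row) {
  float sum_sq = 0.0f;
  for (float x : row) sum_sq += x * x;
  return std::sqrt(sum_sq);
}

// Returns the largest factor <= 1.0 such that
//   factor * learning_rate * sum_i ||in_i|| * ||deriv_i|| <= max_change.
float ScalingFactorSketch(const Matrix &in_value, const Matrix &out_deriv,
                          float learning_rate, float max_change) {
  assert(in_value.size() == out_deriv.size());
  float sum = 0.0f;
  for (std::size_t i = 0; i < in_value.size(); ++i)
    sum += RowNorm(in_value[i]) * RowNorm(out_deriv[i]);
  sum *= learning_rate;
  if (sum <= max_change) return 1.0f;  // step is already small enough
  return max_change / sum;             // shrink the step to hit the limit
}
```

With one row pair of norms 5 and 2 and a learning rate of 0.5, the limited quantity is 5; a max_change of 2.5 therefore yields a factor of 0.5, while any max_change of 5 or more leaves the step unscaled.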
std::string Info ( ) const
virtual

Reimplemented from AffineComponent.

Definition at line 1481 of file nnet-component.cc.

References AffineComponentPreconditioned::alpha_, AffineComponent::bias_params_, CuVectorBase< Real >::Dim(), AffineComponent::InputDim(), kaldi::kTrans, UpdatableComponent::LearningRate(), AffineComponent::linear_params_, AffineComponentPreconditioned::max_change_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), AffineComponent::OutputDim(), kaldi::TraceMatMat(), AffineComponentPreconditioned::Type(), and kaldi::VecVec().

std::string AffineComponentPreconditioned::Info() const {
  std::stringstream stream;
  BaseFloat linear_params_size = static_cast<BaseFloat>(linear_params_.NumRows())
      * static_cast<BaseFloat>(linear_params_.NumCols());
  BaseFloat linear_stddev =
      std::sqrt(TraceMatMat(linear_params_, linear_params_, kTrans) /
                linear_params_size),
      bias_stddev = std::sqrt(VecVec(bias_params_, bias_params_) /
                              bias_params_.Dim());
  stream << Type() << ", input-dim=" << InputDim()
         << ", output-dim=" << OutputDim()
         << ", linear-params-stddev=" << linear_stddev
         << ", bias-params-stddev=" << bias_stddev
         << ", learning-rate=" << LearningRate()
         << ", alpha=" << alpha_
         << ", max-change=" << max_change_;
  return stream.str();
}
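
The "stddev" values that Info() reports are root-mean-square values of the parameters (no mean is subtracted). A minimal sketch of that statistic follows, assuming the linear-parameter value is computed like the bias one, as the square root of the sum of squares divided by the element count. `Matrix` and `MatrixRmsSketch` are hypothetical names, not Kaldi functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Root-mean-square of all entries: sqrt(trace(M M^T) / (rows * cols)).
float MatrixRmsSketch(const Matrix &m) {
  assert(!m.empty() && !m[0].empty());
  double sum_sq = 0.0;
  std::size_t count = 0;
  for (const auto &row : m) {
    for (float x : row) sum_sq += static_cast<double>(x) * x;
    count += row.size();
  }
  return static_cast<float>(std::sqrt(sum_sq / count));
}
```

For the 2 x 2 matrix with rows (3, 4) and (0, 0), the sum of squares is 25 over 4 entries, so the reported value would be 2.5.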
void Init ( BaseFloat  learning_rate,
int32  input_dim,
int32  output_dim,
BaseFloat  param_stddev,
BaseFloat  bias_stddev,
BaseFloat  alpha,
BaseFloat  max_change 
)

Definition at line 1442 of file nnet-component.cc.

References AffineComponentPreconditioned::alpha_, AffineComponent::bias_params_, UpdatableComponent::Init(), KALDI_ASSERT, AffineComponent::linear_params_, AffineComponentPreconditioned::max_change_, CuVector< Real >::Resize(), CuMatrix< Real >::Resize(), CuVectorBase< Real >::Scale(), CuMatrixBase< Real >::Scale(), CuVectorBase< Real >::SetRandn(), and CuMatrixBase< Real >::SetRandn().

Referenced by AffineComponentPreconditioned::InitFromString(), and kaldi::nnet2::UnitTestAffineComponentPreconditioned().

void AffineComponentPreconditioned::Init(
    BaseFloat learning_rate,
    int32 input_dim, int32 output_dim,
    BaseFloat param_stddev, BaseFloat bias_stddev,
    BaseFloat alpha, BaseFloat max_change) {
  UpdatableComponent::Init(learning_rate);
  KALDI_ASSERT(input_dim > 0 && output_dim > 0);
  linear_params_.Resize(output_dim, input_dim);
  bias_params_.Resize(output_dim);
  KALDI_ASSERT(output_dim > 0 && input_dim > 0 && param_stddev >= 0.0);
  linear_params_.SetRandn(); // sets to random normally distributed noise.
  linear_params_.Scale(param_stddev);
  bias_params_.SetRandn();
  bias_params_.Scale(bias_stddev);
  alpha_ = alpha;
  KALDI_ASSERT(alpha_ > 0.0);
  max_change_ = max_change; // Note: any value of max_change_ is valid, but
  // only values > 0.0 will actually activate the code.
}
void Init ( BaseFloat  learning_rate,
BaseFloat  alpha,
BaseFloat  max_change,
std::string  matrix_filename 
)

Definition at line 1426 of file nnet-component.cc.

References AffineComponentPreconditioned::alpha_, AffineComponent::bias_params_, CuVectorBase< Real >::CopyColFromMat(), CuMatrixBase< Real >::CopyFromMat(), UpdatableComponent::Init(), KALDI_ASSERT, AffineComponent::linear_params_, AffineComponentPreconditioned::max_change_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), CuMatrixBase< Real >::Range(), kaldi::ReadKaldiObject(), CuVector< Real >::Resize(), and CuMatrix< Real >::Resize().

void AffineComponentPreconditioned::Init(
    BaseFloat learning_rate, BaseFloat alpha,
    BaseFloat max_change, std::string matrix_filename) {
  UpdatableComponent::Init(learning_rate);
  alpha_ = alpha;
  max_change_ = max_change;
  CuMatrix<BaseFloat> mat;
  ReadKaldiObject(matrix_filename, &mat); // will abort on failure.
  KALDI_ASSERT(mat.NumCols() >= 2);
  int32 input_dim = mat.NumCols() - 1, output_dim = mat.NumRows();
  linear_params_.Resize(output_dim, input_dim);
  bias_params_.Resize(output_dim);
  linear_params_.CopyFromMat(mat.Range(0, output_dim, 0, input_dim));
  bias_params_.CopyColFromMat(mat, input_dim);
}
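
The matrix file read by this overload stores the bias as an extra last column: a matrix with output_dim rows and input_dim + 1 columns. The sketch below shows that split on plain `std::vector` data. `Matrix`, `AffineParamsSketch` and `SplitAffineMatrix` are hypothetical names for illustration and do not exist in Kaldi.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

struct AffineParamsSketch {
  Matrix linear;            // output_dim x input_dim
  std::vector<float> bias;  // output_dim
};

// Splits a matrix with (input_dim + 1) columns: the leading columns are the
// linear parameters and the last column is the bias.
AffineParamsSketch SplitAffineMatrix(const Matrix &mat) {
  assert(!mat.empty() && mat[0].size() >= 2);
  const std::size_t output_dim = mat.size();
  const std::size_t input_dim = mat[0].size() - 1;
  AffineParamsSketch params;
  params.linear.assign(output_dim, std::vector<float>(input_dim));
  params.bias.assign(output_dim, 0.0f);
  for (std::size_t r = 0; r < output_dim; ++r) {
    assert(mat[r].size() == input_dim + 1);
    for (std::size_t c = 0; c < input_dim; ++c)
      params.linear[r][c] = mat[r][c];
    params.bias[r] = mat[r][input_dim];
  }
  return params;
}
```

A 2 x 3 input therefore gives a 2 x 2 linear part and a bias vector of length 2, matching the `mat.NumCols() - 1` computation in the listing.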
void InitFromString ( std::string  args)
virtual

Initialize, typically from a line of a config file.

The "args" will contain any parameters that need to be passed to the Component, e.g. dimensions.

Reimplemented from AffineComponent.

Definition at line 1390 of file nnet-component.cc.

References AffineComponentPreconditioned::Init(), AffineComponent::InputDim(), KALDI_ASSERT, KALDI_ERR, UpdatableComponent::learning_rate_, AffineComponent::OutputDim(), and kaldi::nnet2::ParseFromString().

Referenced by kaldi::nnet2::UnitTestAffineComponentPreconditioned().

void AffineComponentPreconditioned::InitFromString(std::string args) {
  std::string orig_args(args);
  std::string matrix_filename;
  BaseFloat learning_rate = learning_rate_;
  BaseFloat alpha = 0.1, max_change = 0.0;
  int32 input_dim = -1, output_dim = -1;
  ParseFromString("learning-rate", &args, &learning_rate); // optional.
  ParseFromString("alpha", &args, &alpha);
  ParseFromString("max-change", &args, &max_change);

  if (ParseFromString("matrix", &args, &matrix_filename)) {
    Init(learning_rate, alpha, max_change, matrix_filename);
    if (ParseFromString("input-dim", &args, &input_dim))
      KALDI_ASSERT(input_dim == InputDim() &&
                   "input-dim mismatch vs. matrix.");
    if (ParseFromString("output-dim", &args, &output_dim))
      KALDI_ASSERT(output_dim == OutputDim() &&
                   "output-dim mismatch vs. matrix.");
  } else {
    bool ok = true;
    ok = ok && ParseFromString("input-dim", &args, &input_dim);
    ok = ok && ParseFromString("output-dim", &args, &output_dim);
    BaseFloat param_stddev = 1.0 / std::sqrt(input_dim),
        bias_stddev = 1.0;
    ParseFromString("param-stddev", &args, &param_stddev);
    ParseFromString("bias-stddev", &args, &bias_stddev);
    if (!ok)
      KALDI_ERR << "Bad initializer " << orig_args;
    Init(learning_rate, input_dim, output_dim, param_stddev,
         bias_stddev, alpha, max_change);
  }
  if (!args.empty())
    KALDI_ERR << "Could not process these elements in initializer: "
              << args;
}
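
InitFromString relies on each successful ParseFromString call consuming its "name=value" element, so that anything left in args at the end can be reported as unprocessed. The following is a rough standalone sketch of that consume-and-check pattern, assuming whitespace-separated elements. `TakeKeyValue` is a hypothetical helper and not the Kaldi implementation of ParseFromString.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the key=value parsing used by InitFromString.
// If an element "name=value" is present in *args, stores the value in *value
// and returns true; otherwise returns false. In both cases *args is rewritten
// as the remaining elements joined by single spaces.
bool TakeKeyValue(const std::string &name, std::string *args,
                  std::string *value) {
  std::istringstream iss(*args);
  std::vector<std::string> kept;
  std::string token;
  const std::string prefix = name + "=";
  bool found = false;
  while (iss >> token) {
    if (!found && token.compare(0, prefix.size(), prefix) == 0) {
      *value = token.substr(prefix.size());  // text after "name="
      found = true;
    } else {
      kept.push_back(token);  // not ours; leave it for later calls
    }
  }
  std::string rest;
  for (const std::string &t : kept) {
    if (!rest.empty()) rest += ' ';
    rest += t;
  }
  *args = rest;
  return found;
}
```

Starting from "input-dim=10 output-dim=20 alpha=0.1", taking "alpha", "input-dim" and "output-dim" in turn leaves an empty string, which corresponds to the `args.empty()` check at the end of the listing.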
KALDI_DISALLOW_COPY_AND_ASSIGN ( AffineComponentPreconditioned  )
protected
void Read ( std::istream &  is,
bool  binary 
)
virtual

Reimplemented from AffineComponent.

Definition at line 1360 of file nnet-component.cc.

References AffineComponentPreconditioned::alpha_, AffineComponent::bias_params_, kaldi::nnet2::ExpectOneOrTwoTokens(), kaldi::ExpectToken(), KALDI_ASSERT, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, AffineComponentPreconditioned::max_change_, CuVector< Real >::Read(), CuMatrix< Real >::Read(), kaldi::ReadBasicType(), kaldi::ReadToken(), and AffineComponentPreconditioned::Type().

void AffineComponentPreconditioned::Read(std::istream &is, bool binary) {
  std::ostringstream ostr_beg, ostr_end;
  ostr_beg << "<" << Type() << ">"; // e.g. "<AffineComponentPreconditioned>"
  ostr_end << "</" << Type() << ">"; // e.g. "</AffineComponentPreconditioned>"
  // might not see the "<AffineComponentPreconditioned>" part because
  // of how ReadNew() works.
  ExpectOneOrTwoTokens(is, binary, ostr_beg.str(), "<LearningRate>");
  ReadBasicType(is, binary, &learning_rate_);
  ExpectToken(is, binary, "<LinearParams>");
  linear_params_.Read(is, binary);
  ExpectToken(is, binary, "<BiasParams>");
  bias_params_.Read(is, binary);
  ExpectToken(is, binary, "<Alpha>");
  ReadBasicType(is, binary, &alpha_);
  // todo: remove back-compat code.  Will just be:
  // ExpectToken(is, binary, "<MaxChange>");
  // ReadBasicType(is, binary, &max_change_);
  // ExpectToken(is, binary, ostr_end);
  // [end of function]
  std::string tok;
  ReadToken(is, binary, &tok);
  if (tok == "<MaxChange>") {
    ReadBasicType(is, binary, &max_change_);
    ExpectToken(is, binary, ostr_end.str());
  } else {
    max_change_ = 0.0;
    KALDI_ASSERT(tok == ostr_end.str());
  }
}
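
The back-compatibility branch at the end of Read() accepts files written before the `<MaxChange>` field existed. The sketch below isolates that logic for a whitespace-separated text stream. It assumes a simplified text layout and uses a hypothetical function `ReadTailSketch`; it does not reproduce Kaldi's binary format or its token helpers.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads the tail of a text-format component:
//   "<Alpha> a [<MaxChange> m] </Tag>"
// Returns false if the stream does not match. When the older format omits
// <MaxChange>, *max_change is set to 0.0 (the limit is then inactive).
bool ReadTailSketch(std::istream &is, const std::string &end_tag,
                    float *alpha, float *max_change) {
  std::string tok;
  if (!(is >> tok) || tok != "<Alpha>") return false;
  if (!(is >> *alpha)) return false;
  if (!(is >> tok)) return false;
  if (tok == "<MaxChange>") {
    if (!(is >> *max_change)) return false;
    if (!(is >> tok)) return false;  // now expect the closing tag
  } else {
    *max_change = 0.0f;  // older files have no <MaxChange> field
  }
  return tok == end_tag;
}
```

Both "<Alpha> 0.5 <MaxChange> 2 </AffineComponentPreconditioned>" and the older "<Alpha> 0.25 </AffineComponentPreconditioned>" are accepted; the second leaves max_change at 0.0.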
void SetMaxChange ( BaseFloat  max_change)
inline

Definition at line 965 of file nnet-component.h.

References AffineComponentPreconditioned::max_change_.

Referenced by kaldi::nnet2::SetMaxChange().

void SetMaxChange(BaseFloat max_change) { max_change_ = max_change; }
virtual std::string Type ( ) const
inline virtual

Reimplemented from AffineComponent.

Definition at line 950 of file nnet-component.h.

Referenced by AffineComponentPreconditioned::Info(), AffineComponentPreconditioned::Read(), and AffineComponentPreconditioned::Write().

virtual std::string Type() const { return "AffineComponentPreconditioned"; }
void Update ( const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv 
)
protected virtual

Reimplemented from AffineComponent.

Definition at line 1544 of file nnet-component.cc.

References CuMatrixBase< Real >::AddMatMat(), CuVectorBase< Real >::AddMatVec(), AffineComponentPreconditioned::alpha_, AffineComponent::bias_params_, CuVectorBase< Real >::CopyColFromMat(), AffineComponentPreconditioned::GetScalingFactor(), kaldi::kNoTrans, kaldi::kTrans, kaldi::kUndefined, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, AffineComponentPreconditioned::max_change_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), kaldi::nnet2::PreconditionDirectionsAlphaRescaled(), CuMatrixBase< Real >::Range(), and CuMatrix< Real >::Resize().

void AffineComponentPreconditioned::Update(
    const CuMatrixBase<BaseFloat> &in_value,
    const CuMatrixBase<BaseFloat> &out_deriv) {
  CuMatrix<BaseFloat> in_value_temp;

  in_value_temp.Resize(in_value.NumRows(),
                       in_value.NumCols() + 1, kUndefined);
  in_value_temp.Range(0, in_value.NumRows(),
                      0, in_value.NumCols()).CopyFromMat(in_value);

  // Add the 1.0 at the end of each row "in_value_temp"
  in_value_temp.Range(0, in_value.NumRows(),
                      in_value.NumCols(), 1).Set(1.0);

  CuMatrix<BaseFloat> in_value_precon(in_value_temp.NumRows(),
                                      in_value_temp.NumCols(), kUndefined),
      out_deriv_precon(out_deriv.NumRows(),
                       out_deriv.NumCols(), kUndefined);
  // each row of in_value_precon will be that same row of
  // in_value, but multiplied by the inverse of a Fisher
  // matrix that has been estimated from all the other rows,
  // smoothed by some appropriate amount times the identity
  // matrix (this amount is proportional to \alpha).
  PreconditionDirectionsAlphaRescaled(in_value_temp, alpha_, &in_value_precon);
  PreconditionDirectionsAlphaRescaled(out_deriv, alpha_, &out_deriv_precon);

  BaseFloat minibatch_scale = 1.0;

  if (max_change_ > 0.0)
    minibatch_scale = GetScalingFactor(in_value_precon, out_deriv_precon);

  CuSubMatrix<BaseFloat> in_value_precon_part(in_value_precon,
                                              0, in_value_precon.NumRows(),
                                              0, in_value_precon.NumCols() - 1);
  // this "precon_ones" is what happens to the vector of 1's representing
  // offsets, after multiplication by the preconditioner.
  CuVector<BaseFloat> precon_ones(in_value_precon.NumRows());

  precon_ones.CopyColFromMat(in_value_precon, in_value_precon.NumCols() - 1);

  BaseFloat local_lrate = minibatch_scale * learning_rate_;
  bias_params_.AddMatVec(local_lrate, out_deriv_precon, kTrans,
                         precon_ones, 1.0);
  linear_params_.AddMatMat(local_lrate, out_deriv_precon, kTrans,
                           in_value_precon_part, kNoTrans, 1.0);
}
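
The last two calls in Update() add local_lrate times out_deriv_precon^T times the preconditioned inputs to the parameters, with the appended column of ones driving the bias. The sketch below shows that accumulation under a simplifying assumption: the preconditioner is taken to be the identity, so the inputs and derivatives are used unchanged and the ones column stays at 1.0. `Matrix` and `PlainAffineUpdateSketch` are hypothetical names. This is not a sketch of PreconditionDirectionsAlphaRescaled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// One update step with the preconditioner assumed to be the identity:
//   linear += lrate * out_deriv^T * in_value
//   bias   += lrate * out_deriv^T * ones
// in_value is (frames x input_dim), out_deriv is (frames x output_dim),
// linear is (output_dim x input_dim), bias has output_dim entries.
void PlainAffineUpdateSketch(const Matrix &in_value, const Matrix &out_deriv,
                             float lrate, Matrix *linear,
                             std::vector<float> *bias) {
  assert(in_value.size() == out_deriv.size());
  assert(bias->size() == linear->size());
  for (std::size_t t = 0; t < in_value.size(); ++t) {
    assert(out_deriv[t].size() == linear->size());
    for (std::size_t o = 0; o < linear->size(); ++o) {
      assert(in_value[t].size() == (*linear)[o].size());
      const float d = lrate * out_deriv[t][o];
      for (std::size_t i = 0; i < in_value[t].size(); ++i)
        (*linear)[o][i] += d * in_value[t][i];
      (*bias)[o] += d;  // the appended input column is 1.0
    }
  }
}
```

In the real component, lrate corresponds to local_lrate, that is, learning_rate_ multiplied by the factor from GetScalingFactor when max_change_ > 0.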
void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Reimplemented from AffineComponent.

Definition at line 1463 of file nnet-component.cc.

References AffineComponentPreconditioned::alpha_, AffineComponent::bias_params_, UpdatableComponent::learning_rate_, AffineComponent::linear_params_, AffineComponentPreconditioned::max_change_, AffineComponentPreconditioned::Type(), CuVector< Real >::Write(), CuMatrixBase< Real >::Write(), kaldi::WriteBasicType(), and kaldi::WriteToken().

void AffineComponentPreconditioned::Write(std::ostream &os, bool binary) const {
  std::ostringstream ostr_beg, ostr_end;
  ostr_beg << "<" << Type() << ">"; // e.g. "<AffineComponentPreconditioned>"
  ostr_end << "</" << Type() << ">"; // e.g. "</AffineComponentPreconditioned>"
  WriteToken(os, binary, ostr_beg.str());
  WriteToken(os, binary, "<LearningRate>");
  WriteBasicType(os, binary, learning_rate_);
  WriteToken(os, binary, "<LinearParams>");
  linear_params_.Write(os, binary);
  WriteToken(os, binary, "<BiasParams>");
  bias_params_.Write(os, binary);
  WriteToken(os, binary, "<Alpha>");
  WriteBasicType(os, binary, alpha_);
  WriteToken(os, binary, "<MaxChange>");
  WriteBasicType(os, binary, max_change_);
  WriteToken(os, binary, ostr_end.str());
}

Member Data Documentation


The documentation for this class was generated from the following files: