BlockAffineComponentPreconditioned Class Reference

#include <nnet-component.h>


Public Member Functions

void Init (BaseFloat learning_rate, int32 input_dim, int32 output_dim, BaseFloat param_stddev, BaseFloat bias_stddev, int32 num_blocks, BaseFloat alpha)
 
virtual void InitFromString (std::string args)
 Initialize, typically from a line of a config file. More...
 
 BlockAffineComponentPreconditioned ()
 
virtual std::string Type () const
 
virtual void SetZero (bool treat_as_gradient)
 Set parameters to zero, and if treat_as_gradient is true, we'll be treating this as a gradient so set the learning rate to 1 and make any other changes necessary (there's a variable we have to set for the MixtureProbComponent). More...
 
virtual void Read (std::istream &is, bool binary)
 
virtual void Write (std::ostream &os, bool binary) const
 Write component to stream. More...
 
virtual Component * Copy () const
 Copy component (deep copy). More...
 
- Public Member Functions inherited from BlockAffineComponent
virtual int32 InputDim () const
 Get size of input vectors. More...
 
virtual int32 OutputDim () const
 Get size of output vectors. More...
 
virtual int32 GetParameterDim () const
 The following new virtual function returns the total dimension of the parameters in this class. More...
 
virtual void Vectorize (VectorBase< BaseFloat > *params) const
 Turns the parameters into vector form. More...
 
virtual void UnVectorize (const VectorBase< BaseFloat > &params)
 Converts the parameters from vector form. More...
 
void Init (BaseFloat learning_rate, int32 input_dim, int32 output_dim, BaseFloat param_stddev, BaseFloat bias_stddev, int32 num_blocks)
 
 BlockAffineComponent ()
 
virtual bool BackpropNeedsInput () const
 
virtual bool BackpropNeedsOutput () const
 
virtual void Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const
 Perform forward pass propagation Input->Output. More...
 
virtual void Backprop (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, Component *to_update, CuMatrix< BaseFloat > *in_deriv) const
 Perform backward pass propagation of the derivative, and also either update the model (if to_update == this) or update another model or compute the model derivative (otherwise). More...
 
virtual BaseFloat DotProduct (const UpdatableComponent &other) const
 Here, "other" is a component of the same specific type. More...
 
virtual void PerturbParams (BaseFloat stddev)
 We introduce a new virtual function that only applies to class UpdatableComponent. More...
 
virtual void Scale (BaseFloat scale)
 This new virtual function scales the parameters by this amount. More...
 
virtual void Add (BaseFloat alpha, const UpdatableComponent &other)
 This new virtual function adds the parameters of another updatable component, times some constant, to the current parameters. More...
 
- Public Member Functions inherited from UpdatableComponent
 UpdatableComponent (const UpdatableComponent &other)
 
void Init (BaseFloat learning_rate)
 
 UpdatableComponent (BaseFloat learning_rate)
 
 UpdatableComponent ()
 
virtual ~UpdatableComponent ()
 
void SetLearningRate (BaseFloat lrate)
 Sets the learning rate of gradient descent. More...
 
BaseFloat LearningRate () const
 Gets the learning rate of gradient descent. More...
 
virtual std::string Info () const
 
- Public Member Functions inherited from Component
 Component ()
 
virtual int32 Index () const
 Returns the index in the sequence of layers in the neural net; intended only to be used in debugging information. More...
 
virtual void SetIndex (int32 index)
 
virtual std::vector< int32 > Context () const
 Return a vector describing the temporal context this component requires for each frame of output, as a sorted list. More...
 
void Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out) const
 A non-virtual propagate function that first resizes output if necessary. More...
 
virtual ~Component ()
 

Private Member Functions

 KALDI_DISALLOW_COPY_AND_ASSIGN (BlockAffineComponentPreconditioned)
 
virtual void Update (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 

Private Attributes

bool is_gradient_
 
BaseFloat alpha_
 

Additional Inherited Members

- Static Public Member Functions inherited from Component
static Component * ReadNew (std::istream &is, bool binary)
 Read component from stream. More...
 
static Component * NewFromString (const std::string &initializer_line)
 Initialize the Component from one line that will contain first the type, e.g. More...
 
static Component * NewComponentOfType (const std::string &type)
 Return a new Component of the given type e.g. More...
 
- Protected Member Functions inherited from BlockAffineComponent
virtual void UpdateSimple (const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_deriv)
 
- Protected Attributes inherited from BlockAffineComponent
CuMatrix< BaseFloat > linear_params_
 
CuVector< BaseFloat > bias_params_
 
int32 num_blocks_
 
- Protected Attributes inherited from UpdatableComponent
BaseFloat learning_rate_
 learning rate (0.0..0.01) More...
 

Detailed Description

Definition at line 1242 of file nnet-component.h.

Constructor & Destructor Documentation

BlockAffineComponentPreconditioned ( )
inline

Definition at line 1252 of file nnet-component.h.

Referenced by BlockAffineComponentPreconditioned::Copy().

BlockAffineComponentPreconditioned() { }  // use Init to really initialize.

Member Function Documentation

Component * Copy ( ) const
virtual

Copy component (deep copy).

Reimplemented from BlockAffineComponent.

Definition at line 2241 of file nnet-component.cc.

References BlockAffineComponentPreconditioned::alpha_, BlockAffineComponent::bias_params_, BlockAffineComponentPreconditioned::BlockAffineComponentPreconditioned(), BlockAffineComponentPreconditioned::is_gradient_, UpdatableComponent::learning_rate_, BlockAffineComponent::linear_params_, and BlockAffineComponent::num_blocks_.

Component* BlockAffineComponentPreconditioned::Copy() const {
  BlockAffineComponentPreconditioned *ans =
      new BlockAffineComponentPreconditioned();
  ans->learning_rate_ = learning_rate_;
  ans->linear_params_ = linear_params_;
  ans->bias_params_ = bias_params_;
  ans->num_blocks_ = num_blocks_;
  ans->alpha_ = alpha_;
  ans->is_gradient_ = is_gradient_;
  return ans;
}
void Init ( BaseFloat  learning_rate,
int32  input_dim,
int32  output_dim,
BaseFloat  param_stddev,
BaseFloat  bias_stddev,
int32  num_blocks,
BaseFloat  alpha 
)

Definition at line 2161 of file nnet-component.cc.

References BlockAffineComponentPreconditioned::alpha_, BlockAffineComponent::Init(), BlockAffineComponentPreconditioned::is_gradient_, and KALDI_ASSERT.

Referenced by BlockAffineComponentPreconditioned::InitFromString(), and kaldi::nnet2::UnitTestBlockAffineComponentPreconditioned().

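void BlockAffineComponentPreconditioned::Init(
    BaseFloat learning_rate, int32 input_dim, int32 output_dim,
    BaseFloat param_stddev, BaseFloat bias_stddev, int32 num_blocks,
    BaseFloat alpha) {
  BlockAffineComponent::Init(learning_rate, input_dim, output_dim,
                             param_stddev, bias_stddev, num_blocks);
  is_gradient_ = false;
  KALDI_ASSERT(alpha > 0.0);
  alpha_ = alpha;
}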
void InitFromString ( std::string  args)
virtual

Initialize, typically from a line of a config file.

The "args" will contain any parameters that need to be passed to the Component, e.g. dimensions.

Reimplemented from BlockAffineComponent.

Definition at line 2174 of file nnet-component.cc.

References BlockAffineComponentPreconditioned::Init(), KALDI_ERR, UpdatableComponent::learning_rate_, and kaldi::nnet2::ParseFromString().

Referenced by kaldi::nnet2::UnitTestBlockAffineComponentPreconditioned().

void BlockAffineComponentPreconditioned::InitFromString(std::string args) {
  std::string orig_args(args);
  bool ok = true;
  BaseFloat learning_rate = learning_rate_;
  BaseFloat alpha = 4.0;
  int32 input_dim = -1, output_dim = -1, num_blocks = 1;
  ParseFromString("learning-rate", &args, &learning_rate);  // optional.
  ParseFromString("alpha", &args, &alpha);
  ok = ok && ParseFromString("input-dim", &args, &input_dim);
  ok = ok && ParseFromString("output-dim", &args, &output_dim);
  ok = ok && ParseFromString("num-blocks", &args, &num_blocks);

  BaseFloat param_stddev = 1.0 / std::sqrt(input_dim),
      bias_stddev = 1.0;
  ParseFromString("param-stddev", &args, &param_stddev);
  ParseFromString("bias-stddev", &args, &bias_stddev);
  if (!args.empty())
    KALDI_ERR << "Could not process these elements in initializer: "
              << args;
  if (!ok)
    KALDI_ERR << "Bad initializer " << orig_args;
  Init(learning_rate, input_dim, output_dim,
       param_stddev, bias_stddev, num_blocks,
       alpha);
}
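
As a rough illustration of the initializer format handled above, the following self-contained sketch parses a `key=value` line using the same option names, the same required keys (`input-dim`, `output-dim`, `num-blocks`) and the same defaults for `alpha`, `bias-stddev` and `param-stddev`. `BlockAffineConfig` and `ParseConfig` are hypothetical names written for this example; they are not Kaldi's `ParseFromString`, whose tokenization details may differ, and the `learning_rate` default here is only a placeholder for the inherited `learning_rate_`.

```cpp
#include <cmath>
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for the parsed options; not a Kaldi type.
struct BlockAffineConfig {
  float learning_rate = 0.001f;  // placeholder for the inherited learning_rate_
  float alpha = 4.0f;
  int input_dim = -1, output_dim = -1, num_blocks = 1;
  float param_stddev = 0.0f, bias_stddev = 1.0f;
  bool ok = false;  // false if a required key is missing or anything is left over
};

BlockAffineConfig ParseConfig(const std::string &args) {
  // Split "key=value key=value ..." into a map (assumed tokenization).
  std::map<std::string, std::string> kv;
  bool malformed = false;
  std::istringstream iss(args);
  std::string tok;
  while (iss >> tok) {
    std::string::size_type eq = tok.find('=');
    if (eq == std::string::npos || eq + 1 == tok.size()) {
      malformed = true;
      continue;
    }
    kv[tok.substr(0, eq)] = tok.substr(eq + 1);
  }

  BlockAffineConfig c;
  // Each helper consumes the key if present and reports whether it was found.
  auto take_float = [&kv](const char *key, float *out) {
    auto it = kv.find(key);
    if (it == kv.end()) return false;
    *out = std::stof(it->second);
    kv.erase(it);
    return true;
  };
  auto take_int = [&kv](const char *key, int *out) {
    auto it = kv.find(key);
    if (it == kv.end()) return false;
    *out = std::stoi(it->second);
    kv.erase(it);
    return true;
  };

  take_float("learning-rate", &c.learning_rate);  // optional
  take_float("alpha", &c.alpha);                  // optional, default 4.0
  bool ok = true;
  ok = ok && take_int("input-dim", &c.input_dim);
  ok = ok && take_int("output-dim", &c.output_dim);
  ok = ok && take_int("num-blocks", &c.num_blocks);

  // Default parameter stddev is 1/sqrt(input_dim) unless overridden.
  if (c.input_dim > 0)
    c.param_stddev = 1.0f / std::sqrt(static_cast<float>(c.input_dim));
  take_float("param-stddev", &c.param_stddev);
  take_float("bias-stddev", &c.bias_stddev);

  // Leftover keys correspond to the "Could not process these elements" error.
  c.ok = ok && !malformed && kv.empty();
  return c;
}
```

With this sketch, a line such as `learning-rate=0.01 input-dim=16 output-dim=8 num-blocks=2` is accepted and leaves `alpha` at 4.0, while a line that omits `num-blocks` or contains an unrecognized key is rejected.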
KALDI_DISALLOW_COPY_AND_ASSIGN ( BlockAffineComponentPreconditioned  )
private
void Read ( std::istream &  is,
bool  binary 
)
virtual

Reimplemented from BlockAffineComponent.

Definition at line 2206 of file nnet-component.cc.

References BlockAffineComponentPreconditioned::alpha_, BlockAffineComponent::bias_params_, kaldi::nnet2::ExpectOneOrTwoTokens(), kaldi::ExpectToken(), BlockAffineComponentPreconditioned::is_gradient_, UpdatableComponent::learning_rate_, BlockAffineComponent::linear_params_, BlockAffineComponent::num_blocks_, CuVector< Real >::Read(), CuMatrix< Real >::Read(), and kaldi::ReadBasicType().

void BlockAffineComponentPreconditioned::Read(std::istream &is, bool binary) {
  ExpectOneOrTwoTokens(is, binary, "<BlockAffineComponentPreconditioned>",
                       "<LearningRate>");
  ReadBasicType(is, binary, &learning_rate_);
  ExpectToken(is, binary, "<NumBlocks>");
  ReadBasicType(is, binary, &num_blocks_);
  ExpectToken(is, binary, "<LinearParams>");
  linear_params_.Read(is, binary);
  ExpectToken(is, binary, "<BiasParams>");
  bias_params_.Read(is, binary);
  ExpectToken(is, binary, "<Alpha>");
  ReadBasicType(is, binary, &alpha_);
  ExpectToken(is, binary, "<IsGradient>");
  ReadBasicType(is, binary, &is_gradient_);
  ExpectToken(is, binary, "</BlockAffineComponentPreconditioned>");
}
void SetZero ( bool  treat_as_gradient)
virtual

Sets the parameters to zero. If treat_as_gradient is true, the component will be treated as a gradient, so the learning rate is set to 1 and any other necessary changes are made (there is a variable that has to be set for the MixtureProbComponent).

Reimplemented from BlockAffineComponent.

Definition at line 2200 of file nnet-component.cc.

References BlockAffineComponentPreconditioned::is_gradient_, and BlockAffineComponent::SetZero().

void BlockAffineComponentPreconditioned::SetZero(bool treat_as_gradient) {
  if (treat_as_gradient)
    is_gradient_ = true;
  BlockAffineComponent::SetZero(treat_as_gradient);
}
virtual std::string Type ( ) const
inline virtual

Reimplemented from BlockAffineComponent.

Definition at line 1253 of file nnet-component.h.

virtual std::string Type() const { return "BlockAffineComponentPreconditioned"; }
void Update ( const CuMatrixBase< BaseFloat > &  in_value,
const CuMatrixBase< BaseFloat > &  out_deriv 
)
private virtual

Reimplemented from BlockAffineComponent.

Definition at line 2253 of file nnet-component.cc.

References CuMatrixBase< Real >::AddMatMat(), BlockAffineComponentPreconditioned::alpha_, BlockAffineComponent::bias_params_, CuVectorBase< Real >::CopyColFromMat(), CuMatrixBase< Real >::CopyFromMat(), BlockAffineComponentPreconditioned::is_gradient_, kaldi::kNoTrans, kaldi::kTrans, kaldi::kUndefined, UpdatableComponent::learning_rate_, BlockAffineComponent::linear_params_, BlockAffineComponent::num_blocks_, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), kaldi::nnet2::PreconditionDirectionsAlphaRescaled(), CuVectorBase< Real >::Range(), and BlockAffineComponent::UpdateSimple().

void BlockAffineComponentPreconditioned::Update(
    const CuMatrixBase<BaseFloat> &in_value,
    const CuMatrixBase<BaseFloat> &out_deriv) {
  if (is_gradient_) {
    UpdateSimple(in_value, out_deriv);
    // does the baseline update with no preconditioning.
    return;
  }
  int32 input_block_dim = linear_params_.NumCols(),
      output_block_dim = linear_params_.NumRows() / num_blocks_,
      num_frames = in_value.NumRows();

  CuMatrix<BaseFloat> in_value_temp(num_frames, input_block_dim + 1, kUndefined),
      in_value_precon(num_frames, input_block_dim + 1, kUndefined);
  in_value_temp.Set(1.0);  // so last column will have value 1.0.
  CuSubMatrix<BaseFloat> in_value_temp_part(in_value_temp, 0, num_frames,
                                            0, input_block_dim);  // all but last 1.0
  CuSubMatrix<BaseFloat> in_value_precon_part(in_value_precon, 0, num_frames,
                                              0, input_block_dim);
  CuVector<BaseFloat> precon_ones(num_frames);
  CuMatrix<BaseFloat> out_deriv_precon(num_frames, output_block_dim, kUndefined);

  for (int32 b = 0; b < num_blocks_; b++) {
    CuSubMatrix<BaseFloat> in_value_block(in_value, 0, num_frames,
                                          b * input_block_dim,
                                          input_block_dim),
        out_deriv_block(out_deriv, 0, num_frames,
                        b * output_block_dim, output_block_dim),
        param_block(linear_params_,
                    b * output_block_dim, output_block_dim,
                    0, input_block_dim);
    in_value_temp_part.CopyFromMat(in_value_block);

    PreconditionDirectionsAlphaRescaled(in_value_temp, alpha_,
                                        &in_value_precon);
    PreconditionDirectionsAlphaRescaled(out_deriv_block, alpha_,
                                        &out_deriv_precon);

    // Update the parameters.
    param_block.AddMatMat(learning_rate_, out_deriv_precon, kTrans,
                          in_value_precon_part, kNoTrans, 1.0);
    precon_ones.CopyColFromMat(in_value_precon, input_block_dim);
    bias_params_.Range(b * output_block_dim, output_block_dim).
        AddMatVec(learning_rate_, out_deriv_precon, kTrans,
                  precon_ones, 1.0);
  }
}
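
The block indexing used in the loop above can be sketched without any Kaldi types. Going by the `NumCols()` and `NumRows() / num_blocks_` expressions in the listing, each block reads a contiguous range of input columns and owns a contiguous range of rows in the parameter matrix (and the matching range of the bias vector and output derivative). `BlockRange` and `BlockRanges` are hypothetical names used only for this illustration.

```cpp
#include <vector>

// Hypothetical description of the sub-ranges one block touches.
struct BlockRange {
  int in_col_offset;     // first column of in_value read by the block
  int in_cols;           // number of input columns per block
  int param_row_offset;  // first row of the parameter matrix owned by the block
  int param_rows;        // number of parameter rows (outputs) per block
};

// params_rows and params_cols are the dimensions of the full parameter matrix.
std::vector<BlockRange> BlockRanges(int params_rows, int params_cols,
                                    int num_blocks) {
  const int input_block_dim = params_cols;
  const int output_block_dim = params_rows / num_blocks;
  std::vector<BlockRange> ranges;
  for (int b = 0; b < num_blocks; ++b) {
    ranges.push_back({b * input_block_dim, input_block_dim,
                      b * output_block_dim, output_block_dim});
  }
  return ranges;
}
```

For a 6 x 4 parameter matrix split into 2 blocks, block 1 would read input columns 4 to 7 and update parameter rows 3 to 5.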
void PreconditionDirectionsAlphaRescaled(const CuMatrixBase< BaseFloat > &R, double alpha, CuMatrixBase< BaseFloat > *P)
This wrapper for PreconditionDirections computes lambda using $\lambda = \alpha/(N D)\,\mathrm{trace}(R^T R)$, and calls PreconditionDirections.
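
To make the scaling concrete, read the formula as lambda = alpha / (N D) * trace(R^T R). The trace of R^T R is the sum of the squared entries of R, so under the assumption that N and D are the row and column counts of R (this page does not define them), lambda is alpha times the mean squared entry. `RescaledLambda` below is a hypothetical helper that only evaluates this expression; it is not a Kaldi function and does not perform the preconditioning itself.

```cpp
#include <vector>

// Evaluates alpha / (N * D) * trace(R^T R) for a dense row-major matrix R,
// where N and D are taken to be the number of rows and columns of R.
double RescaledLambda(const std::vector<std::vector<double>> &R, double alpha) {
  if (R.empty() || R[0].empty()) return 0.0;
  double sum_sq = 0.0;  // equals trace(R^T R)
  for (const auto &row : R)
    for (double v : row) sum_sq += v * v;
  const double n = static_cast<double>(R.size());
  const double d = static_cast<double>(R[0].size());
  return alpha * sum_sq / (n * d);
}
```

For R = [[1, 2], [3, 4]] and alpha = 4, the sum of squares is 30 and N D = 4, so the result is 30.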
void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write component to stream.

Reimplemented from BlockAffineComponent.

Definition at line 2223 of file nnet-component.cc.

References BlockAffineComponentPreconditioned::alpha_, BlockAffineComponent::bias_params_, BlockAffineComponentPreconditioned::is_gradient_, UpdatableComponent::learning_rate_, BlockAffineComponent::linear_params_, BlockAffineComponent::num_blocks_, CuVector< Real >::Write(), CuMatrixBase< Real >::Write(), kaldi::WriteBasicType(), and kaldi::WriteToken().

void BlockAffineComponentPreconditioned::Write(std::ostream &os,
                                               bool binary) const {
  WriteToken(os, binary, "<BlockAffineComponentPreconditioned>");
  WriteToken(os, binary, "<LearningRate>");
  WriteBasicType(os, binary, learning_rate_);
  WriteToken(os, binary, "<NumBlocks>");
  WriteBasicType(os, binary, num_blocks_);
  WriteToken(os, binary, "<LinearParams>");
  linear_params_.Write(os, binary);
  WriteToken(os, binary, "<BiasParams>");
  bias_params_.Write(os, binary);
  WriteToken(os, binary, "<Alpha>");
  WriteBasicType(os, binary, alpha_);
  WriteToken(os, binary, "<IsGradient>");
  WriteBasicType(os, binary, is_gradient_);
  WriteToken(os, binary, "</BlockAffineComponentPreconditioned>");
}
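
The token sequence written above is the one `Read` expects, in the same order. A minimal text-mode sketch of that symmetry follows, using plain iostreams rather than Kaldi's `WriteToken`, `ExpectToken`, `WriteBasicType` and `ReadBasicType`. The names `Params`, `PutToken`, `Expect`, `WriteParams` and `ReadParams` and the `<Sketch>` tag are hypothetical, the matrix and vector members are omitted, and the on-disk encoding is not meant to match Kaldi's.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical scalar-only stand-in for the component's state.
struct Params {
  float learning_rate;
  int num_blocks;
  float alpha;
  bool is_gradient;
};

// Writes a token followed by a space (text mode only).
void PutToken(std::ostream &os, const std::string &token) {
  os << token << ' ';
}

// Reads one whitespace-delimited token and throws if it is not the expected one.
void Expect(std::istream &is, const std::string &token) {
  std::string got;
  if (!(is >> got) || got != token)
    throw std::runtime_error("expected " + token + ", got '" + got + "'");
}

void WriteParams(std::ostream &os, const Params &p) {
  PutToken(os, "<Sketch>");
  PutToken(os, "<LearningRate>");
  os << p.learning_rate << ' ';
  PutToken(os, "<NumBlocks>");
  os << p.num_blocks << ' ';
  PutToken(os, "<Alpha>");
  os << p.alpha << ' ';
  PutToken(os, "<IsGradient>");
  os << p.is_gradient << ' ';
  PutToken(os, "</Sketch>");
}

// Mirrors WriteParams: every token is checked before its value is read.
Params ReadParams(std::istream &is) {
  Params p{};
  Expect(is, "<Sketch>");
  Expect(is, "<LearningRate>");
  is >> p.learning_rate;
  Expect(is, "<NumBlocks>");
  is >> p.num_blocks;
  Expect(is, "<Alpha>");
  is >> p.alpha;
  Expect(is, "<IsGradient>");
  is >> p.is_gradient;
  Expect(is, "</Sketch>");
  return p;
}
```

Writing a `Params` value to a `std::stringstream` and reading it back yields the same fields, while a stream that starts with any other token makes `ReadParams` throw.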

Member Data Documentation


The documentation for this class was generated from the following files: