Class UpdatableComponent is a Component which has trainable parameters and contains some global parameters for stochastic gradient descent (learning rate, L2 regularization constant).
#include <nnet-component.h>
Public Member Functions | |
UpdatableComponent (const UpdatableComponent &other) | |
void | Init (BaseFloat learning_rate) |
UpdatableComponent (BaseFloat learning_rate) | |
virtual void | SetZero (bool treat_as_gradient)=0 |
Sets the parameters to zero. If treat_as_gradient is true, the component will be treated as a gradient, so the learning rate is set to 1 and any other necessary changes are made (there is a variable that has to be set for the MixtureProbComponent). | |
UpdatableComponent () | |
virtual | ~UpdatableComponent () |
virtual BaseFloat | DotProduct (const UpdatableComponent &other) const =0 |
Here, "other" is a component of the same specific type. | |
virtual void | PerturbParams (BaseFloat stddev)=0 |
We introduce a new virtual function that only applies to class UpdatableComponent. | |
virtual void | Scale (BaseFloat scale)=0 |
This new virtual function scales the parameters by this amount. | |
virtual void | Add (BaseFloat alpha, const UpdatableComponent &other)=0 |
This new virtual function adds the parameters of another updatable component, times some constant, to the current parameters. | |
void | SetLearningRate (BaseFloat lrate) |
Sets the learning rate of gradient descent. | |
BaseFloat | LearningRate () const |
Gets the learning rate of gradient descent. | |
virtual std::string | Info () const |
virtual int32 | GetParameterDim () const |
The following new virtual function returns the total dimension of the parameters in this class. | |
virtual void | Vectorize (VectorBase< BaseFloat > *params) const |
Turns the parameters into vector form. | |
virtual void | UnVectorize (const VectorBase< BaseFloat > &params) |
Converts the parameters from vector form. | |
Public Member Functions inherited from Component | |
Component () | |
virtual std::string | Type () const =0 |
virtual int32 | Index () const |
Returns the index in the sequence of layers in the neural net; intended only to be used in debugging information. | |
virtual void | SetIndex (int32 index) |
virtual void | InitFromString (std::string args)=0 |
Initialize, typically from a line of a config file. | |
virtual int32 | InputDim () const =0 |
Get size of input vectors. | |
virtual int32 | OutputDim () const =0 |
Get size of output vectors. | |
virtual std::vector< int32 > | Context () const |
Return a vector describing the temporal context this component requires for each frame of output, as a sorted list. | |
virtual void | Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out) const =0 |
Perform forward pass propagation Input->Output. | |
void | Propagate (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out) const |
A non-virtual propagate function that first resizes output if necessary. | |
virtual void | Backprop (const ChunkInfo &in_info, const ChunkInfo &out_info, const CuMatrixBase< BaseFloat > &in_value, const CuMatrixBase< BaseFloat > &out_value, const CuMatrixBase< BaseFloat > &out_deriv, Component *to_update, CuMatrix< BaseFloat > *in_deriv) const =0 |
Perform backward pass propagation of the derivative, and also either update the model (if to_update == this) or update another model or compute the model derivative (otherwise). | |
virtual bool | BackpropNeedsInput () const |
virtual bool | BackpropNeedsOutput () const |
virtual Component * | Copy () const =0 |
Copy component (deep copy). | |
virtual void | Read (std::istream &is, bool binary)=0 |
virtual void | Write (std::ostream &os, bool binary) const =0 |
Write component to stream. | |
virtual | ~Component () |
Protected Attributes | |
BaseFloat | learning_rate_ |
learning rate (0.0..0.01) | |
Private Member Functions | |
const UpdatableComponent & | operator= (const UpdatableComponent &other) |
Additional Inherited Members | |
Static Public Member Functions inherited from Component | |
static Component * | ReadNew (std::istream &is, bool binary) |
Read component from stream. | |
static Component * | NewFromString (const std::string &initializer_line) |
Initialize the Component from one line that will contain first the type. | |
static Component * | NewComponentOfType (const std::string &type) |
Return a new Component of the given type. | |
Class UpdatableComponent is a Component which has trainable parameters and contains some global parameters for stochastic gradient descent (learning rate, L2 regularization constant).
This is a base-class for Components with parameters.
Definition at line 279 of file nnet-component.h.
UpdatableComponent (const UpdatableComponent &other) |
inline |
Definition at line 281 of file nnet-component.h.
UpdatableComponent (BaseFloat learning_rate) |
inline |
Definition at line 287 of file nnet-component.h.
UpdatableComponent () |
inline |
Definition at line 297 of file nnet-component.h.
virtual ~UpdatableComponent () |
inline virtual |
Definition at line 299 of file nnet-component.h.
References kaldi::nnet3::DotProduct(), and kaldi::nnet3::PerturbParams().
virtual void Add (BaseFloat alpha, const UpdatableComponent &other) |
pure virtual |
This new virtual function adds the parameters of another updatable component, times some constant, to the current parameters.
Implemented in Convolutional1dComponent, BlockAffineComponent, and AffineComponent.
Referenced by Nnet::AddNnet(), and main().
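The contract described for Add() amounts to an in-place "this += alpha * other" over the parameters. The following is a minimal sketch of that idea only; `ToyParams` and its `values` member are hypothetical stand-ins that use a plain `std::vector<float>`, not the Kaldi classes or their storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an updatable component: a flat list of
// trainable parameters. This is not the Kaldi class.
struct ToyParams {
  std::vector<float> values;

  // Mirrors the documented contract of Add(): this += alpha * other.
  void Add(float alpha, const ToyParams &other) {
    // "other" is expected to have the same parameter layout.
    assert(values.size() == other.values.size());
    for (std::size_t i = 0; i < values.size(); ++i)
      values[i] += alpha * other.values[i];
  }
};
```

In this sketch "other" is left untouched, and a negative alpha subtracts the other component's parameters.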
virtual BaseFloat DotProduct (const UpdatableComponent &other) const |
pure virtual |
Here, "other" is a component of the same specific type.
This function computes the dot product of the parameters of the two components. It is used while automatically adjusting learning rates; typically, one of the two components will actually contain the gradient.
Implemented in Convolutional1dComponent, BlockAffineComponent, and AffineComponent.
Referenced by Nnet::ComponentDotProducts(), kaldi::nnet2::ComputeObjfAndGradient(), FastNnetCombiner::ComputeObjfAndGradient(), FisherComputationClass::operator()(), and kaldi::nnet2::UnitTestGenericComponentInternal().
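As an illustration of a dot product over parameters, the sketch below pairs a holder of model parameters with a holder of gradient values of the same layout. `ToyWeights` is a hypothetical type used for illustration; the real implementations operate on Kaldi matrix and vector classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: either model parameters or a gradient stored
// in the same layout. This is not the Kaldi class.
struct ToyWeights {
  std::vector<float> values;

  // Sum of element-wise products of the two parameter sets.
  float DotProduct(const ToyWeights &other) const {
    // Both operands must be of the same specific type, i.e. same layout.
    assert(values.size() == other.values.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < values.size(); ++i)
      sum += values[i] * other.values[i];
    return sum;
  }
};
```

The result is symmetric in the two operands, so it does not matter which of them holds the gradient.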
virtual int32 GetParameterDim () const |
inline virtual |
The following new virtual function returns the total dimension of the parameters in this class.
It is used, for example, in the L-BFGS update.
Reimplemented in Convolutional1dComponent, BlockAffineComponent, and AffineComponent.
Definition at line 333 of file nnet-component.h.
References KALDI_ASSERT.
Referenced by Nnet::GetParameterDim(), main(), Nnet::UnVectorize(), and Nnet::Vectorize().
virtual std::string Info () const |
virtual |
Reimplemented from Component.
Reimplemented in Convolutional1dComponent, AffineComponentPreconditionedOnline, AffineComponentPreconditioned, and AffineComponent.
Definition at line 312 of file nnet-component.cc.
References Component::InputDim(), Component::OutputDim(), and Component::Type().
void Init (BaseFloat learning_rate) |
inline |
Definition at line 284 of file nnet-component.h.
Referenced by AffineComponent::Init(), AffineComponentPreconditioned::Init(), AffineComponentPreconditionedOnline::Init(), BlockAffineComponent::Init(), and Convolutional1dComponent::Init().
BaseFloat LearningRate () const |
inline |
Gets the learning rate of gradient descent.
Definition at line 323 of file nnet-component.h.
Referenced by kaldi::nnet2::CombineNnetsA(), Nnet::GetLearningRates(), AffineComponent::Info(), AffineComponentPreconditioned::Info(), AffineComponentPreconditionedOnline::Info(), Convolutional1dComponent::Info(), and Nnet::ScaleLearningRates().
const UpdatableComponent & operator= (const UpdatableComponent &other) |
private |
virtual void PerturbParams (BaseFloat stddev) |
pure virtual |
We introduce a new virtual function that only applies to class UpdatableComponent.
This is used in testing.
Implemented in Convolutional1dComponent, BlockAffineComponent, and AffineComponent.
Referenced by main(), and kaldi::nnet2::UnitTestGenericComponentInternal().
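The description above gives only the stddev argument and the fact that the function is used in testing. A plausible reading, assumed here rather than taken from the source, is that each parameter receives zero-mean Gaussian noise with standard deviation stddev. `ToyNoisyParams` and the explicit `rng` argument are hypothetical additions for the sketch; the documented signature takes only stddev.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical stand-in for an updatable component. Not the Kaldi class.
struct ToyNoisyParams {
  std::vector<float> values;

  // Assumption: add independent zero-mean Gaussian noise with standard
  // deviation stddev to every parameter. The rng argument is an addition
  // made for this sketch so that the noise is reproducible.
  void PerturbParams(float stddev, std::mt19937 &rng) {
    assert(stddev >= 0.0f);
    if (stddev == 0.0f) return;  // std::normal_distribution needs stddev > 0
    std::normal_distribution<float> noise(0.0f, stddev);
    for (float &v : values) v += noise(rng);
  }
};
```

Passing the generator explicitly keeps a test reproducible: two generators seeded identically perturb identical parameters in the same way.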
virtual void Scale (BaseFloat scale) |
pure virtual |
This new virtual function scales the parameters by this amount.
Implemented in Convolutional1dComponent, BlockAffineComponent, and AffineComponent.
Referenced by Nnet::AddNnet(), main(), NnetRescaler::RescaleComponent(), Nnet::Scale(), and Nnet::ScaleComponents().
void SetLearningRate (BaseFloat lrate) |
inline |
Sets the learning rate of gradient descent.
Definition at line 321 of file nnet-component.h.
Referenced by kaldi::nnet2::CombineNnetsA(), Nnet::ScaleLearningRates(), Nnet::SetLearningRates(), AffineComponent::SetZero(), BlockAffineComponent::SetZero(), and Convolutional1dComponent::SetZero().
virtual void SetZero (bool treat_as_gradient) |
pure virtual |
Sets the parameters to zero. If treat_as_gradient is true, the component will be treated as a gradient, so the learning rate is set to 1 and any other necessary changes are made (there is a variable that has to be set for the MixtureProbComponent).
Implemented in Convolutional1dComponent, BlockAffineComponentPreconditioned, BlockAffineComponent, and AffineComponent.
Referenced by main(), and Nnet::SetZero().
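The behaviour described for SetZero() can be sketched as follows: the parameters are always zeroed, and the learning rate is reset to 1 only when the object is about to serve as a gradient accumulator. `ToyUpdatable` and its `Params()` accessor are hypothetical; the sketch leaves out any component-specific changes such as the MixtureProbComponent variable mentioned above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for an updatable component. Not the Kaldi class.
class ToyUpdatable {
 public:
  ToyUpdatable(std::vector<float> params, float learning_rate)
      : params_(std::move(params)), learning_rate_(learning_rate) {
    assert(learning_rate_ >= 0.0f);
  }

  void SetLearningRate(float lrate) { learning_rate_ = lrate; }
  float LearningRate() const { return learning_rate_; }

  // Zero all parameters; when the object will hold a gradient, also set
  // the learning rate to 1 as the description states.
  void SetZero(bool treat_as_gradient) {
    if (treat_as_gradient) SetLearningRate(1.0f);
    for (float &p : params_) p = 0.0f;
  }

  const std::vector<float> &Params() const { return params_; }

 private:
  std::vector<float> params_;
  float learning_rate_;
};
```

With treat_as_gradient set to false the learning rate is left as it was, so only the parameters change.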
virtual void UnVectorize (const VectorBase< BaseFloat > &params) |
inline virtual |
Converts the parameters from vector form.
Reimplemented in BlockAffineComponent, and AffineComponent.
Definition at line 340 of file nnet-component.h.
References KALDI_ASSERT.
Referenced by Nnet::UnVectorize().
virtual void Vectorize (VectorBase< BaseFloat > *params) const |
inline virtual |
Turns the parameters into vector form.
We put the vector form on the CPU, because in the kinds of situations where we do this, we'll tend to use too much memory for the GPU.
Reimplemented in BlockAffineComponent, and AffineComponent.
Definition at line 338 of file nnet-component.h.
References KALDI_ASSERT.
Referenced by main(), and Nnet::Vectorize().
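GetParameterDim(), Vectorize() and UnVectorize() together describe a round trip between a component's own parameter storage and one flat vector. The sketch below shows that round trip for a hypothetical `ToyFlatAffine` with a weight block followed by a bias block; the ordering of the two blocks is an assumption made for illustration, and `std::vector<float>` stands in for `VectorBase< BaseFloat >`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a component with weights and biases.
// Not the Kaldi class; the flat layout (weights, then biases) is assumed.
struct ToyFlatAffine {
  std::vector<float> linear;  // weight matrix stored row by row
  std::vector<float> bias;

  // Total number of trainable parameters.
  std::size_t GetParameterDim() const { return linear.size() + bias.size(); }

  // Copy all parameters into a caller-allocated flat vector.
  void Vectorize(std::vector<float> *params) const {
    assert(params != nullptr && params->size() == GetParameterDim());
    auto next = std::copy(linear.begin(), linear.end(), params->begin());
    std::copy(bias.begin(), bias.end(), next);
  }

  // Inverse of Vectorize(): read the parameters back from a flat vector.
  void UnVectorize(const std::vector<float> &params) {
    assert(params.size() == GetParameterDim());
    const auto split =
        params.begin() + static_cast<std::ptrdiff_t>(linear.size());
    std::copy(params.begin(), split, linear.begin());
    std::copy(split, params.end(), bias.begin());
  }
};
```

The caller sizes the flat vector with GetParameterDim() before calling Vectorize(), which is why both functions check the dimension first.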
BaseFloat learning_rate_ |
protected |
learning rate (0.0..0.01)
Definition at line 345 of file nnet-component.h.
Referenced by AffineComponentPreconditionedOnline::AffineComponentPreconditionedOnline(), AffineComponent::Copy(), AffineComponentPreconditioned::Copy(), AffineComponentPreconditionedOnline::Copy(), BlockAffineComponent::Copy(), BlockAffineComponentPreconditioned::Copy(), Convolutional1dComponent::Copy(), AffineComponentPreconditioned::GetScalingFactor(), AffineComponentPreconditionedOnline::GetScalingFactor(), AffineComponent::InitFromString(), AffineComponentPreconditioned::InitFromString(), AffineComponentPreconditionedOnline::InitFromString(), BlockAffineComponent::InitFromString(), BlockAffineComponentPreconditioned::InitFromString(), Convolutional1dComponent::InitFromString(), AffineComponent::Read(), AffineComponentPreconditioned::Read(), AffineComponentPreconditionedOnline::Read(), BlockAffineComponent::Read(), BlockAffineComponentPreconditioned::Read(), Convolutional1dComponent::Read(), AffineComponentPreconditioned::Update(), AffineComponentPreconditionedOnline::Update(), BlockAffineComponentPreconditioned::Update(), Convolutional1dComponent::Update(), AffineComponent::UpdateSimple(), BlockAffineComponent::UpdateSimple(), AffineComponent::Write(), AffineComponentPreconditioned::Write(), AffineComponentPreconditionedOnline::Write(), BlockAffineComponent::Write(), BlockAffineComponentPreconditioned::Write(), and Convolutional1dComponent::Write().