MaxPooling2DComponent Class Reference

MaxPooling2DComponent: the input/output matrices are split into submatrices of width 'pool_stride_'.

#include <nnet-max-pooling-2d-component.h>


Public Member Functions

 MaxPooling2DComponent (int32 dim_in, int32 dim_out)
 
 ~MaxPooling2DComponent ()
 
Component * Copy () const
 Copy component (deep copy).

ComponentType GetType () const
 Get type identification of the component.

void InitData (std::istream &is)
 Virtual interface for initialization and I/O.

void ReadData (std::istream &is, bool binary)
 Reads the component content.

void WriteData (std::ostream &os, bool binary) const
 Writes the component content.

void PropagateFnc (const CuMatrixBase< BaseFloat > &in, CuMatrixBase< BaseFloat > *out)
 Abstract interface for propagation/backpropagation.

void BackpropagateFnc (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrixBase< BaseFloat > *in_diff)
 Backward pass transformation (to be implemented by descendant classes).
 
- Public Member Functions inherited from Component
 Component (int32 input_dim, int32 output_dim)
 Generic interface of a component.

virtual ~Component ()

virtual bool IsUpdatable () const
 Check if component has 'Updatable' interface (trainable components).

virtual bool IsMultistream () const
 Check if component has 'Recurrent' interface (trainable and recurrent).

int32 InputDim () const
 Get the dimension of the input.

int32 OutputDim () const
 Get the dimension of the output.

void Propagate (const CuMatrixBase< BaseFloat > &in, CuMatrix< BaseFloat > *out)
 Perform forward-pass propagation 'in' -> 'out'.

void Backpropagate (const CuMatrixBase< BaseFloat > &in, const CuMatrixBase< BaseFloat > &out, const CuMatrixBase< BaseFloat > &out_diff, CuMatrix< BaseFloat > *in_diff)
 Perform backward-pass propagation 'out_diff' -> 'in_diff'.

void Write (std::ostream &os, bool binary) const
 Write the component to a stream.

virtual std::string Info () const
 Print some additional info (after <ComponentName> and the dims).

virtual std::string InfoGradient () const
 Print some additional info about gradient (after <...> and dims).
 

Private Attributes

int32 fmap_x_len_
 
int32 fmap_y_len_
 
int32 pool_x_len_
 
int32 pool_y_len_
 
int32 pool_x_step_
 
int32 pool_y_step_
 

Additional Inherited Members

- Public Types inherited from Component
enum  ComponentType {
  kUnknown = 0x0, kUpdatableComponent = 0x0100, kAffineTransform, kLinearTransform,
  kConvolutionalComponent, kConvolutional2DComponent, kLstmProjected, kBlstmProjected,
  kRecurrentComponent, kActivationFunction = 0x0200, kSoftmax, kHiddenSoftmax,
  kBlockSoftmax, kSigmoid, kTanh, kParametricRelu,
  kDropout, kLengthNormComponent, kTranform = 0x0400, kRbm,
  kSplice, kCopy, kTranspose, kBlockLinearity,
  kAddShift, kRescale, kKlHmm = 0x0800, kSentenceAveragingComponent,
  kSimpleSentenceAveragingComponent, kAveragePoolingComponent, kAveragePooling2DComponent, kMaxPoolingComponent,
  kMaxPooling2DComponent, kFramePoolingComponent, kParallelComponent, kMultiBasisComponent
}
 Component type identification mechanism.

- Static Public Member Functions inherited from Component
static const char * TypeToMarker (ComponentType t)
 Converts component type to marker.

static ComponentType MarkerToType (const std::string &s)
 Converts marker to component type (case insensitive).

static Component * Init (const std::string &conf_line)
 Initialize component from a line in config file.

static Component * Read (std::istream &is, bool binary)
 Read the component from a stream (static method).

- Static Public Attributes inherited from Component
static const struct key_value kMarkerMap []
 The table with pairs of Component types and markers (defined in nnet-component.cc).

- Protected Attributes inherited from Component
int32 input_dim_
 Data members.

int32 output_dim_
 Dimension of the output of the Component.
 

Detailed Description

MaxPooling2DComponent: the input/output matrices are split into submatrices of width 'pool_stride_'.

Pooling is done over the third axis of the set of 2D matrices. Overlapping pools are supported; overlaps occur when the step is smaller than the pool size (pool_step_ < pool_size_; for this class, pool_x_step_ < pool_x_len_ or pool_y_step_ < pool_y_len_).

Definition at line 41 of file nnet-max-pooling-2d-component.h.
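
As an illustration of the overlap condition, the sketch below lists the pool start offsets along one axis using the same loop bounds as PropagateFnc and BackpropagateFnc. PoolStarts and PoolsOverlap are hypothetical helper names, not part of Kaldi.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Kaldi): start offsets of the pooling
// windows along one axis, with the loop bounds used by PropagateFnc.
std::vector<int> PoolStarts(int fmap_len, int pool_len, int pool_step) {
  assert(pool_len > 0 && pool_step > 0 && fmap_len >= pool_len);
  std::vector<int> starts;
  for (int m = 0; m < fmap_len - pool_len + 1; m += pool_step) {
    starts.push_back(m);
  }
  return starts;
}

// Consecutive windows share elements when the step is smaller than the
// window length.
bool PoolsOverlap(int pool_len, int pool_step) {
  return pool_step < pool_len;
}
```

For example, a feature map of length 7 with pool length 3 and step 2 gives windows starting at 0, 2 and 4, and neighbouring windows share one element.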

Constructor & Destructor Documentation

~MaxPooling2DComponent ( )
inline

Definition at line 50 of file nnet-max-pooling-2d-component.h.

{ }

Member Function Documentation

void BackpropagateFnc ( const CuMatrixBase< BaseFloat > &  in,
const CuMatrixBase< BaseFloat > &  out,
const CuMatrixBase< BaseFloat > &  out_diff,
CuMatrixBase< BaseFloat > *  in_diff 
)
inline virtual

Backward pass transformation (to be implemented by descendant classes).

Implements Component.

Definition at line 153 of file nnet-max-pooling-2d-component.h.

References CuMatrixBase< Real >::ColRange(), CuMatrixBase< Real >::EqualElementMask(), MaxPooling2DComponent::fmap_x_len_, MaxPooling2DComponent::fmap_y_len_, Component::input_dim_, KALDI_ASSERT, MaxPooling2DComponent::pool_x_len_, MaxPooling2DComponent::pool_x_step_, MaxPooling2DComponent::pool_y_len_, MaxPooling2DComponent::pool_y_step_, and CuMatrixBase< Real >::SetZero().

{
  // useful dims
  int32 num_input_fmaps = input_dim_ / (fmap_x_len_ * fmap_y_len_);
  int32 inp_fmap_size = fmap_x_len_ * fmap_y_len_;

  //
  // here we note how many diff matrices are summed for each input patch,
  std::vector<int32> patch_summands(inp_fmap_size, 0);
  // this metainfo will be used to divide diff of patches
  // used in more than one pool.
  //

  in_diff->SetZero();  // reset

  int out_fmap_cnt = 0;
  for (int32 m = 0; m < fmap_x_len_-pool_x_len_+1; m = m+pool_x_step_) {
    for (int32 n = 0; n < fmap_y_len_-pool_y_len_+1; n = n+pool_y_step_) {
      int32 st = 0;
      st = (m*fmap_y_len_+n)*num_input_fmaps;

      for (int32 i = 0; i < pool_x_len_; i++) {
        for (int32 j = 0; j < pool_y_len_; j++) {
          int32 c = 0;
          c = st + i * (num_input_fmaps * fmap_y_len_)
                 + j * num_input_fmaps;
          //
          CuSubMatrix<BaseFloat> in_p(in.ColRange(c, num_input_fmaps));
          CuSubMatrix<BaseFloat> out_p(
            out.ColRange(out_fmap_cnt*num_input_fmaps, num_input_fmaps)
          );
          //

          CuSubMatrix<BaseFloat> tgt(in_diff->ColRange(c, num_input_fmaps));
          CuMatrix<BaseFloat> src(
            out_diff.ColRange(out_fmap_cnt*num_input_fmaps, num_input_fmaps)
          );

          CuMatrix<BaseFloat> mask;
          in_p.EqualElementMask(out_p, &mask);
          src.MulElements(mask);
          tgt.AddMat(1.0, src);

          patch_summands[c/num_input_fmaps] += 1;
        }
      }
      out_fmap_cnt++;
    }
  }

  // divide diff by #summands (compensate for patches used in more pools),
  for (int i = 0; i < fmap_x_len_; i++) {
    for (int32 j = 0; j < fmap_y_len_; j++) {
      int32 c = i * fmap_y_len_ + j;
      CuSubMatrix<BaseFloat> tgt(in_diff->ColRange(c * num_input_fmaps, num_input_fmaps));
      KALDI_ASSERT(patch_summands[c] > 0);  // patch at least in one pool
      tgt.Scale(1.0 / patch_summands[c]);
    }
  }
}
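
To make the masking and the division by the number of summands easier to follow, here is a minimal single-row sketch of the backward pass in plain C++. It replaces CuMatrix column ranges with index arithmetic on std::vector<float>; Pool2DConfig and MaxPool2DBackwardRow are hypothetical names used only for illustration, and the sketch is not the Kaldi implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-C++ stand-in for the component's hyperparameters.
struct Pool2DConfig {
  int fmap_x_len, fmap_y_len;
  int pool_x_len, pool_y_len;
  int pool_x_step, pool_y_step;
};

// Backward pass for one row (one frame). Column layout, as in the listing:
// column = (x * fmap_y_len + y) * num_fmaps + fmap_index.
std::vector<float> MaxPool2DBackwardRow(const std::vector<float>& in,
                                        const std::vector<float>& out,
                                        const std::vector<float>& out_diff,
                                        const Pool2DConfig& c) {
  const int fmap_size = c.fmap_x_len * c.fmap_y_len;
  assert(fmap_size > 0 && in.size() % static_cast<std::size_t>(fmap_size) == 0);
  assert(out.size() == out_diff.size());
  const int num_fmaps = static_cast<int>(in.size()) / fmap_size;

  std::vector<float> in_diff(in.size(), 0.0f);
  std::vector<int> patch_summands(fmap_size, 0);  // pools covering each position

  int out_pos = 0;
  for (int m = 0; m < c.fmap_x_len - c.pool_x_len + 1; m += c.pool_x_step) {
    for (int n = 0; n < c.fmap_y_len - c.pool_y_len + 1; n += c.pool_y_step) {
      const int st = (m * c.fmap_y_len + n) * num_fmaps;
      const int o = out_pos * num_fmaps;
      for (int i = 0; i < c.pool_x_len; ++i) {
        for (int j = 0; j < c.pool_y_len; ++j) {
          const int col = st + (i * c.fmap_y_len + j) * num_fmaps;
          for (int f = 0; f < num_fmaps; ++f) {
            // Mask: the gradient flows only to inputs equal to the pool maximum.
            if (in[col + f] == out[o + f]) in_diff[col + f] += out_diff[o + f];
          }
          patch_summands[col / num_fmaps] += 1;
        }
      }
      ++out_pos;
    }
  }

  // Divide by the number of pools each position took part in.
  for (int p = 0; p < fmap_size; ++p) {
    assert(patch_summands[p] > 0);  // every position is in at least one pool
    for (int f = 0; f < num_fmaps; ++f) {
      in_diff[p * num_fmaps + f] *= 1.0f / patch_summands[p];
    }
  }
  return in_diff;
}
```

With a single 3x3 feature map holding 1..9, 2x2 pools and step 1, the centre element is covered by four pools, so a unit gradient arriving there is scaled to 0.25.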
Component* Copy ( ) const
inline virtual

Copy component (deep copy).

Implements Component.

Definition at line 53 of file nnet-max-pooling-2d-component.h.

References MaxPooling2DComponent::MaxPooling2DComponent().

{ return new MaxPooling2DComponent(*this); }
ComponentType GetType ( ) const
inline virtual

Get type identification of the component.

Implements Component.

Definition at line 54 of file nnet-max-pooling-2d-component.h.

References Component::kMaxPooling2DComponent.

void InitData ( std::istream &  is)
inline virtual

Virtual interface for initialization and I/O.

Initialize internal data of a component.

Reimplemented from Component.

Definition at line 56 of file nnet-max-pooling-2d-component.h.

References MaxPooling2DComponent::fmap_x_len_, MaxPooling2DComponent::fmap_y_len_, KALDI_ASSERT, KALDI_ERR, MaxPooling2DComponent::pool_x_len_, MaxPooling2DComponent::pool_x_step_, MaxPooling2DComponent::pool_y_len_, MaxPooling2DComponent::pool_y_step_, kaldi::ReadBasicType(), and kaldi::ReadToken().

{
  // parse config
  std::string token;
  while (is >> std::ws, !is.eof()) {
    ReadToken(is, false, &token);
    if (token == "<FmapXLen>") ReadBasicType(is, false, &fmap_x_len_);
    else if (token == "<FmapYLen>") ReadBasicType(is, false, &fmap_y_len_);
    else if (token == "<PoolXLen>") ReadBasicType(is, false, &pool_x_len_);
    else if (token == "<PoolYLen>") ReadBasicType(is, false, &pool_y_len_);
    else if (token == "<PoolXStep>") ReadBasicType(is, false, &pool_x_step_);
    else if (token == "<PoolYStep>") ReadBasicType(is, false, &pool_y_step_);
    else KALDI_ERR << "Unknown token " << token << ", a typo in config?"
                   << " (FmapXLen|FmapYLen|PoolXLen|PoolYLen|PoolXStep|PoolYStep)";
  }
  // check
  // ... (source lines 71-73 are not shown in this listing)
}
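
The token/value loop above can be sketched with standard streams only. This is an illustration under stated assumptions: it uses std::istringstream in place of Kaldi's ReadToken and ReadBasicType, throws std::runtime_error where the listing uses KALDI_ERR, and the names Pool2DConfig and ParsePool2DConfig are hypothetical.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical plain-C++ stand-in for the component's hyperparameters.
struct Pool2DConfig {
  int fmap_x_len = 0, fmap_y_len = 0;
  int pool_x_len = 0, pool_y_len = 0;
  int pool_x_step = 0, pool_y_step = 0;
};

// Parses "<Token> value" pairs in any order, as the InitData loop does.
Pool2DConfig ParsePool2DConfig(const std::string& line) {
  Pool2DConfig c;
  const std::map<std::string, int*> fields = {
      {"<FmapXLen>", &c.fmap_x_len},   {"<FmapYLen>", &c.fmap_y_len},
      {"<PoolXLen>", &c.pool_x_len},   {"<PoolYLen>", &c.pool_y_len},
      {"<PoolXStep>", &c.pool_x_step}, {"<PoolYStep>", &c.pool_y_step}};
  std::istringstream is(line);
  std::string token;
  while (is >> token) {
    const auto it = fields.find(token);
    if (it == fields.end()) {
      throw std::runtime_error("Unknown token " + token + ", a typo in config?");
    }
    if (!(is >> *it->second)) {
      throw std::runtime_error("Missing or invalid value for " + token);
    }
  }
  return c;
}
```

A line such as `<FmapXLen> 8 <FmapYLen> 4 <PoolXLen> 2 <PoolYLen> 2 <PoolXStep> 2 <PoolYStep> 2` fills all six fields; an unrecognised token raises an error, mirroring the KALDI_ERR branch.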
void PropagateFnc ( const CuMatrixBase< BaseFloat > &  in,
CuMatrixBase< BaseFloat > *  out 
)
inline virtual

Abstract interface for propagation/backpropagation.

Forward pass transformation (to be implemented by descendant classes).

Implements Component.

Definition at line 127 of file nnet-max-pooling-2d-component.h.

References CuMatrixBase< Real >::ColRange(), MaxPooling2DComponent::fmap_x_len_, MaxPooling2DComponent::fmap_y_len_, Component::input_dim_, MaxPooling2DComponent::pool_x_len_, MaxPooling2DComponent::pool_x_step_, MaxPooling2DComponent::pool_y_len_, MaxPooling2DComponent::pool_y_step_, and CuMatrixBase< Real >::Set().

{
  // useful dims
  int32 num_input_fmaps = input_dim_ / (fmap_x_len_ * fmap_y_len_);
  int out_fmap_cnt = 0;
  for (int32 m = 0; m < fmap_x_len_-pool_x_len_+1; m = m+pool_x_step_) {
    for (int32 n = 0; n < fmap_y_len_-pool_y_len_+1; n = n+pool_y_step_) {
      int32 st = 0;
      st = (m * fmap_y_len_ + n) * num_input_fmaps;
      CuSubMatrix<BaseFloat> pool(
        out->ColRange(out_fmap_cnt * num_input_fmaps, num_input_fmaps)
      );
      pool.Set(-1e20);  // reset (large neg value)
      for (int32 i = 0; i < pool_x_len_; i++) {
        for (int32 j = 0; j < pool_y_len_; j++) {
          int32 c = 0;
          c = st + i * (num_input_fmaps * fmap_y_len_)
                 + j * num_input_fmaps;
          pool.Max(in.ColRange(c, num_input_fmaps));
        }
      }
      out_fmap_cnt++;
    }
  }
}
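
The column indexing in the forward pass may be easier to see on a single row. The sketch below is a plain-C++ approximation that works on std::vector<float> instead of CuMatrix column ranges; Pool2DConfig and MaxPool2DForwardRow are hypothetical names, and the code is meant as an illustration rather than the Kaldi implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain-C++ stand-in for the component's hyperparameters.
struct Pool2DConfig {
  int fmap_x_len, fmap_y_len;
  int pool_x_len, pool_y_len;
  int pool_x_step, pool_y_step;
};

// Max-pools one input row (one frame). Column layout, as in the listing:
// column = (x * fmap_y_len + y) * num_fmaps + fmap_index.
std::vector<float> MaxPool2DForwardRow(const std::vector<float>& in,
                                       const Pool2DConfig& c) {
  const int fmap_size = c.fmap_x_len * c.fmap_y_len;
  assert(fmap_size > 0 && in.size() % static_cast<std::size_t>(fmap_size) == 0);
  const int num_fmaps = static_cast<int>(in.size()) / fmap_size;

  std::vector<float> out;
  for (int m = 0; m < c.fmap_x_len - c.pool_x_len + 1; m += c.pool_x_step) {
    for (int n = 0; n < c.fmap_y_len - c.pool_y_len + 1; n += c.pool_y_step) {
      const int st = (m * c.fmap_y_len + n) * num_fmaps;  // first column of the pool
      std::vector<float> pool(num_fmaps, -1e20f);         // reset (large negative value)
      for (int i = 0; i < c.pool_x_len; ++i) {
        for (int j = 0; j < c.pool_y_len; ++j) {
          const int col = st + (i * c.fmap_y_len + j) * num_fmaps;
          for (int f = 0; f < num_fmaps; ++f) {
            pool[f] = std::max(pool[f], in[col + f]);
          }
        }
      }
      // Append the pooled values as the next output position.
      out.insert(out.end(), pool.begin(), pool.end());
    }
  }
  return out;
}
```

For one 3x3 feature map holding 1..9 in x-major order, 2x2 pools with step 1 yield the four maxima 5, 6, 8 and 9.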
void ReadData ( std::istream &  is,
bool  binary 
)
inline virtual

Reads the component content.

Reimplemented from Component.

Definition at line 76 of file nnet-max-pooling-2d-component.h.

References kaldi::ExpectToken(), MaxPooling2DComponent::fmap_x_len_, MaxPooling2DComponent::fmap_y_len_, Component::input_dim_, KALDI_ASSERT, KALDI_LOG, Component::output_dim_, MaxPooling2DComponent::pool_x_len_, MaxPooling2DComponent::pool_x_step_, MaxPooling2DComponent::pool_y_len_, MaxPooling2DComponent::pool_y_step_, and kaldi::ReadBasicType().

{
  // pooling hyperparameters
  ExpectToken(is, binary, "<FmapXLen>");
  ReadBasicType(is, binary, &fmap_x_len_);
  ExpectToken(is, binary, "<FmapYLen>");
  ReadBasicType(is, binary, &fmap_y_len_);
  ExpectToken(is, binary, "<PoolXLen>");
  ReadBasicType(is, binary, &pool_x_len_);
  ExpectToken(is, binary, "<PoolYLen>");
  ReadBasicType(is, binary, &pool_y_len_);
  ExpectToken(is, binary, "<PoolXStep>");
  ReadBasicType(is, binary, &pool_x_step_);
  ExpectToken(is, binary, "<PoolYStep>");
  ReadBasicType(is, binary, &pool_y_step_);

  //
  // Sanity checks:
  //
  // input sanity checks
  // input_dim_ should be multiple of (fmap_x_len_ * fmap_y_len_)
  // ... (source line 96 is not shown in this listing)
  int32 num_input_fmaps = input_dim_ / (fmap_x_len_ * fmap_y_len_);
  KALDI_LOG << "num_fmaps " << num_input_fmaps;
  // check if step is in sync with fmap_len and filt_len
  // ... (source lines 100-101 are not shown in this listing)
  int32 out_fmap_x_len = (fmap_x_len_ - pool_x_len_)/pool_x_step_ + 1;
  int32 out_fmap_y_len = (fmap_y_len_ - pool_y_len_)/pool_y_step_ + 1;
  // int32 out_fmap_size = out_fmap_x_len*out_fmap_y_len;
  // output sanity checks
  KALDI_ASSERT(output_dim_ % (out_fmap_x_len * out_fmap_y_len) == 0);
  int32 num_output_fmaps = output_dim_ / (out_fmap_x_len * out_fmap_y_len);
  KALDI_ASSERT(num_input_fmaps == num_output_fmaps);
}
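
The dimension arithmetic behind these sanity checks can be worked through with a small helper that computes the output dimension implied by an input dimension and the pooling hyperparameters. ExpectedOutputDim and Pool2DConfig are hypothetical names. The "step in sync" test, (fmap_len - pool_len) % pool_step == 0, is an assumed reading of the comment in the listing, since the corresponding source lines are not shown.

```cpp
#include <stdexcept>

// Hypothetical plain-C++ stand-in for the component's hyperparameters.
struct Pool2DConfig {
  int fmap_x_len, fmap_y_len;
  int pool_x_len, pool_y_len;
  int pool_x_step, pool_y_step;
};

// Returns the output dimension that matches input_dim, i.e. the value for
// which num_input_fmaps == num_output_fmaps holds in ReadData.
int ExpectedOutputDim(int input_dim, const Pool2DConfig& c) {
  const int fmap_size = c.fmap_x_len * c.fmap_y_len;
  if (fmap_size <= 0 || input_dim % fmap_size != 0) {
    throw std::invalid_argument(
        "input_dim must be a multiple of fmap_x_len * fmap_y_len");
  }
  if (c.pool_x_len <= 0 || c.pool_y_len <= 0 ||
      c.pool_x_step <= 0 || c.pool_y_step <= 0 ||
      c.pool_x_len > c.fmap_x_len || c.pool_y_len > c.fmap_y_len) {
    throw std::invalid_argument("pool length/step out of range");
  }
  // Assumption: "in sync" means the last pool ends exactly at the map edge.
  if ((c.fmap_x_len - c.pool_x_len) % c.pool_x_step != 0 ||
      (c.fmap_y_len - c.pool_y_len) % c.pool_y_step != 0) {
    throw std::invalid_argument("pool step is not in sync with fmap/pool length");
  }
  const int num_fmaps = input_dim / fmap_size;
  const int out_fmap_x_len = (c.fmap_x_len - c.pool_x_len) / c.pool_x_step + 1;
  const int out_fmap_y_len = (c.fmap_y_len - c.pool_y_len) / c.pool_y_step + 1;
  return num_fmaps * out_fmap_x_len * out_fmap_y_len;
}
```

For instance, 16 feature maps of size 8x4 give an input dimension of 512; with 2x2 pools and step 2 in both directions the output maps are 4x2, so the matching output dimension is 16 * 4 * 2 = 128.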
void WriteData ( std::ostream &  os,
bool  binary 
) const
inline virtual

Writes the component content.

Reimplemented from Component.

Definition at line 111 of file nnet-max-pooling-2d-component.h.

References MaxPooling2DComponent::fmap_x_len_, MaxPooling2DComponent::fmap_y_len_, MaxPooling2DComponent::pool_x_len_, MaxPooling2DComponent::pool_x_step_, MaxPooling2DComponent::pool_y_len_, MaxPooling2DComponent::pool_y_step_, kaldi::WriteBasicType(), and kaldi::WriteToken().

{
  // pooling hyperparameters
  WriteToken(os, binary, "<FmapXLen>");
  WriteBasicType(os, binary, fmap_x_len_);
  WriteToken(os, binary, "<FmapYLen>");
  WriteBasicType(os, binary, fmap_y_len_);
  WriteToken(os, binary, "<PoolXLen>");
  WriteBasicType(os, binary, pool_x_len_);
  WriteToken(os, binary, "<PoolYLen>");
  WriteBasicType(os, binary, pool_y_len_);
  WriteToken(os, binary, "<PoolXStep>");
  WriteBasicType(os, binary, pool_x_step_);
  WriteToken(os, binary, "<PoolYStep>");
  WriteBasicType(os, binary, pool_y_step_);
}
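
For orientation, the token order written above can be mimicked in text form with standard streams. This is only a sketch: WritePool2DText and Pool2DConfig are hypothetical names, binary mode is not covered, and the exact whitespace produced by Kaldi's WriteToken and WriteBasicType is assumed here to be a single space after each item.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical plain-C++ stand-in for the component's hyperparameters.
struct Pool2DConfig {
  int fmap_x_len, fmap_y_len;
  int pool_x_len, pool_y_len;
  int pool_x_step, pool_y_step;
};

// Writes the hyperparameters as "<Token> value " pairs, in the same order
// as WriteData (text form only; spacing is an assumption).
void WritePool2DText(std::ostream& os, const Pool2DConfig& c) {
  os << "<FmapXLen> " << c.fmap_x_len << ' '
     << "<FmapYLen> " << c.fmap_y_len << ' '
     << "<PoolXLen> " << c.pool_x_len << ' '
     << "<PoolYLen> " << c.pool_y_len << ' '
     << "<PoolXStep> " << c.pool_x_step << ' '
     << "<PoolYStep> " << c.pool_y_step << ' ';
}

// Convenience wrapper returning the serialized text.
std::string Pool2DToText(const Pool2DConfig& c) {
  std::ostringstream os;
  WritePool2DText(os, c);
  return os.str();
}
```

ReadData expects the tokens in exactly this order, which is why the write order matters even though InitData accepts them in any order.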

Member Data Documentation


The documentation for this class was generated from the following file: