ConvolutionComputation Struct Reference

This struct represents the structure of a convolution computation.

#include <convolution.h>


Classes

struct  ConvolutionStep
 

Public Member Functions

void Write (std::ostream &os, bool binary) const
 
void Read (std::istream &is, bool binary)
 
void ComputeDerived ()
 
void Check () const
 

Public Attributes

int32 num_filters_in
 
int32 num_filters_out
 
int32 height_in
 
int32 height_out
 
int32 num_t_in
 
int32 num_t_out
 
int32 num_images
 
int32 temp_rows
 
int32 temp_cols
 
std::vector< ConvolutionStep > steps
 

Detailed Description

This struct represents the structure of a convolution computation.

This is used inside the PrecomputedIndexes object for the TimeHeightConvolutionComponent (it depends on the inputs and outputs as well as the layer).

CAUTION: this struct describes the problem after certain transformations have been applied, so height_in may not be the "real" height of the input image (it may be a multiple thereof), and num_t_in may not be the "real" number of distinct time-steps on the input of the computation (it may be a divisor thereof). ConvolutionComputation contains the info needed to actually perform the computation.

Definition at line 252 of file convolution.h.

Member Function Documentation

void Check ( ) const

Definition at line 349 of file convolution.cc.

References ConvolutionComputation::ConvolutionStep::backward_columns, ConvolutionComputation::ConvolutionStep::columns, ConvolutionComputation::ConvolutionStep::columns_are_contiguous, CuArray< T >::CopyToVec(), CuArray< T >::Dim(), ConvolutionComputation::ConvolutionStep::first_column, ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, ConvolutionComputation::height_out, ConvolutionComputation::ConvolutionStep::input_time_shift, KALDI_ASSERT, ConvolutionComputation::num_filters_in, ConvolutionComputation::num_filters_out, ConvolutionComputation::num_images, ConvolutionComputation::num_t_in, ConvolutionComputation::num_t_out, ConvolutionComputation::ConvolutionStep::params_start_col, ConvolutionComputation::steps, ConvolutionComputation::temp_cols, and ConvolutionComputation::temp_rows.

Referenced by ConvolutionComputation::Read(), and kaldi::nnet3::time_height_convolution::UnPadModelHeight().

349  void ConvolutionComputation::Check() const {
351                 height_in > 0 && height_out > 0);
353                 num_t_out > 0 && num_images > 0);
354    KALDI_ASSERT((temp_rows == 0 && temp_cols == 0) ||
356                  temp_cols > 0));
358    bool temp_mat_required = false;
359    int32 num_steps = steps.size();
360    int32 num_extra_input_times = num_t_in - num_t_out,
361        input_cols = num_filters_in * height_in,
362        smallest_time_shift = 1000,
363        largest_time_shift = 0;
364    // check 'steps'
365    for (int32 s = 0; s < num_steps; s++) {
366      const ConvolutionStep &step = steps[s];
367      KALDI_ASSERT(step.input_time_shift >= 0 &&
368                   step.input_time_shift <= num_extra_input_times);
369      if (step.input_time_shift < smallest_time_shift)
370        smallest_time_shift = step.input_time_shift;
371      if (step.input_time_shift > largest_time_shift)
372        largest_time_shift = step.input_time_shift;
373      KALDI_ASSERT(step.params_start_col >= 0 &&
374                   step.params_start_col % num_filters_in == 0);
375      if (s != 0) {
376        KALDI_ASSERT(step.input_time_shift != steps[s-1].input_time_shift);
377      }
378      std::vector<int32> columns;
379      step.columns.CopyToVec(&columns);
380      KALDI_ASSERT(step.first_column == columns[0]);
381      KALDI_ASSERT(step.columns.Dim() == step.height_map.size() * num_filters_in);
382      bool all_negative = true;
383      int32 temp_height = step.height_map.size();
384      bool contiguous = true;
385      for (int32 i = 0; i < temp_height; i++) {
386        int32 h = step.height_map[i];
387        KALDI_ASSERT(h >= -1 && h < height_in);
388        if (i > 0 && step.height_map[i-1] != h-1)
389          contiguous = false;
390        if (h == -1) {
391          contiguous = false;
392          for (int32 f = 0; f < num_filters_in; f++) {
393            KALDI_ASSERT(columns[i * num_filters_in + f] == -1);
394          }
395        } else {
396          all_negative = false;
397          for (int32 f = 0; f < num_filters_in; f++) {
398            KALDI_ASSERT(columns[i * num_filters_in + f] ==
399                         h * num_filters_in + f);
400          }
401        }
402      }
403      KALDI_ASSERT(contiguous == step.columns_are_contiguous);
404      if (!contiguous || columns.size() != input_cols) {
405        // we would need the temporary matrix. Make sure the
406        // temporary matrix is big enough.
407        temp_mat_required = true;
408        KALDI_ASSERT(columns.size() <= temp_cols);
409      }
410      KALDI_ASSERT(!all_negative);
411
412      std::vector<int32> columns_reconstructed(columns.size(), -1);
413      // reconstruct 'columns' from backward_columns as a way to
414      // check that backward_columns is correct.
415      // they are reverse-direction maps, but we may need
416      // step.backward_columns.size() > 1 because of elements
417      // in the input that are duplicated in the temp matrix.
418      for (size_t k = 0; k < step.backward_columns.size(); k++) {
419        std::vector<int32> backward_columns;
420        step.backward_columns[k].CopyToVec(&backward_columns);
421        KALDI_ASSERT(int32(backward_columns.size()) ==
422                     num_filters_in * height_in);
423        for (int32 l = 0; l < num_filters_in * height_in; l++) {
424          int32 c = backward_columns[l];
425          KALDI_ASSERT(c < int32(columns.size()));
426          if (c != -1) {
427            KALDI_ASSERT(columns_reconstructed[c] == -1);
428            columns_reconstructed[c] = l;
429          }
430        }
431      }
432      KALDI_ASSERT(columns_reconstructed == columns);
433    }
434    // check that all rows of the input were used.
435    KALDI_ASSERT(smallest_time_shift == 0 &&
436                 largest_time_shift == num_extra_input_times);
437
438    // check that the temp matrix is only allocated if it is required.
439    KALDI_ASSERT((temp_cols != 0) == temp_mat_required);
440  }
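
The per-step conditions that Check() asserts can be illustrated without Kaldi. The following is a minimal stand-alone sketch, not Kaldi code: `IsContiguous`, `NeedsTempMatrix` and `TimeShiftsCoverInput` are hypothetical helper names, plain `std::vector<int32_t>` stands in for the Kaldi types, and the helpers return `bool` instead of asserting. It assumes the height-map entries have already been validated to lie in [-1, height_in).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true if no entry is -1 (padding) and every entry
// is exactly one more than its predecessor, as in the listing's loop.
bool IsContiguous(const std::vector<int32_t> &height_map) {
  for (std::size_t i = 0; i < height_map.size(); i++) {
    if (height_map[i] == -1) return false;
    if (i > 0 && height_map[i - 1] != height_map[i] - 1) return false;
  }
  return true;
}

// Hypothetical helper: a step can skip the temporary matrix only if its
// columns are contiguous and cover every input column.
bool NeedsTempMatrix(const std::vector<int32_t> &height_map,
                     int32_t num_filters_in, int32_t height_in) {
  assert(num_filters_in > 0 && height_in > 0);
  std::size_t num_columns = height_map.size() * num_filters_in;
  std::size_t input_cols = static_cast<std::size_t>(num_filters_in) * height_in;
  return !IsContiguous(height_map) || num_columns != input_cols;
}

// Hypothetical helper: time shifts must lie in [0, num_t_in - num_t_out],
// differ between consecutive steps, and reach both ends of that range.
bool TimeShiftsCoverInput(const std::vector<int32_t> &shifts,
                          int32_t num_t_in, int32_t num_t_out) {
  if (shifts.empty()) return false;
  int32_t extra = num_t_in - num_t_out;
  int32_t smallest = shifts[0], largest = shifts[0];
  for (std::size_t s = 0; s < shifts.size(); s++) {
    if (shifts[s] < 0 || shifts[s] > extra) return false;
    if (s > 0 && shifts[s] == shifts[s - 1]) return false;
    smallest = std::min(smallest, shifts[s]);
    largest = std::max(largest, shifts[s]);
  }
  return smallest == 0 && largest == extra;
}
```

For example, with `num_filters_in = 4` and `height_in = 3`, the height map `{0, 1, 2}` needs no temporary matrix, while `{-1, 0, 1}` (one padding row) does.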
void ComputeDerived ( )

Definition at line 1271 of file convolution.cc.

References ConvolutionComputation::ConvolutionStep::backward_columns, ConvolutionComputation::ConvolutionStep::columns, ConvolutionComputation::ConvolutionStep::columns_are_contiguous, CuArray< T >::CopyFromVec(), ConvolutionComputation::ConvolutionStep::first_column, ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, KALDI_ASSERT, ConvolutionComputation::num_filters_in, kaldi::nnet3::time_height_convolution::ReverseColumnMapping(), ConvolutionComputation::steps, ConvolutionComputation::temp_cols, and kaldi::nnet3::time_height_convolution::VectorIsContiguous().

Referenced by ConvolutionComputation::Read(), and kaldi::nnet3::time_height_convolution::UnPadModelHeight().

1271  void ConvolutionComputation::ComputeDerived() {
1272    KALDI_ASSERT(!steps.empty());
1273
1274    int32 input_dim = height_in * num_filters_in;
1275
1276    int32 largest_required_temp_cols = 0;
1277    for (std::vector<ConvolutionStep>::iterator iter = steps.begin();
1278         iter != steps.end(); ++iter) {
1279      ConvolutionStep &step = *iter;
1280      std::vector<int32> columns;
1281      int32 temp_height = step.height_map.size();
1282      columns.resize(temp_height * num_filters_in);
1283      for (int32 h = 0; h < temp_height; h++) {
1284        KALDI_ASSERT(step.height_map[h] >= -1 && step.height_map[h] < height_in);
1285        if (step.height_map[h] != -1) {
1286          for (int32 f = 0; f < num_filters_in; f++)
1287            columns[h * num_filters_in + f] = step.height_map[h] * num_filters_in + f;
1288        } else {
1289          for (int32 f = 0; f < num_filters_in; f++)
1290            columns[h * num_filters_in + f] = -1;
1291        }
1292      }
1293      step.columns.CopyFromVec(columns);
1294      std::vector<std::vector<int32> > backward_columns;
1295      ReverseColumnMapping(columns, input_dim, &backward_columns);
1296      step.backward_columns.resize(backward_columns.size());
1297      for (size_t i = 0; i < backward_columns.size(); i++)
1298        step.backward_columns[i].CopyFromVec(backward_columns[i]);
1299
1300      // we could replace height_map with columns in the line below and get the
1301      // same answer, but it would be a little slower.
1302      step.columns_are_contiguous =
1303          (step.height_map[0] != -1 && VectorIsContiguous(step.height_map));
1304      step.first_column = columns[0];
1305
1306
1307      bool need_temp_matrix =
1308          !(step.columns_are_contiguous && step.height_map[0] == 0 &&
1309            step.height_map.size() == height_in);
1310      if (need_temp_matrix) {
1311        largest_required_temp_cols = std::max<int32>(
1312            largest_required_temp_cols, static_cast<int32>(columns.size()));
1313      }
1314    }
1315    KALDI_ASSERT(temp_cols == largest_required_temp_cols);
1316  }
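
ComputeDerived() expands each step's height_map into a column index list and then builds the reverse mapping from input columns back to temporary-matrix columns. The sketch below illustrates both ideas in isolation. `ExpandHeightMap` and `ReverseMappingSketch` are hypothetical names, and the second function is only an assumption about what a reverse mapping with duplicated input columns could look like, inferred from how Check() reconstructs `columns` from `backward_columns`. It is not Kaldi's ReverseColumnMapping().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands one entry per height row into
// num_filters_in column indexes; a row of -1 yields -1 for every filter.
std::vector<int32_t> ExpandHeightMap(const std::vector<int32_t> &height_map,
                                     int32_t num_filters_in) {
  std::vector<int32_t> columns(height_map.size() * num_filters_in);
  for (std::size_t h = 0; h < height_map.size(); h++)
    for (int32_t f = 0; f < num_filters_in; f++)
      columns[h * num_filters_in + f] =
          (height_map[h] == -1) ? -1 : height_map[h] * num_filters_in + f;
  return columns;
}

// Hypothetical sketch: for each input column l, records which positions c
// satisfy columns[c] == l. The k-th occurrence of l goes into result[k][l],
// so more than one vector is produced only when an input column is
// duplicated. Unused slots hold -1.
std::vector<std::vector<int32_t>> ReverseMappingSketch(
    const std::vector<int32_t> &columns, int32_t input_dim) {
  std::vector<std::vector<int32_t>> backward;
  std::vector<std::size_t> uses(input_dim, 0);  // occurrences seen so far
  for (std::size_t c = 0; c < columns.size(); c++) {
    int32_t l = columns[c];
    if (l == -1) continue;  // padding column, nothing to map back
    assert(l >= 0 && l < input_dim);
    std::size_t k = uses[l]++;
    if (k == backward.size())
      backward.push_back(std::vector<int32_t>(input_dim, -1));
    backward[k][l] = static_cast<int32_t>(c);
  }
  return backward;
}
```

With `num_filters_in = 2`, the height map `{-1, 0, 1}` expands to columns `{-1, -1, 0, 1, 2, 3}`, and a single reverse vector `{2, 3, 4, 5}` suffices because no input column appears twice.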
void Read (std::istream &is, bool binary)

Definition at line 315 of file convolution.cc.

References ConvolutionComputation::Check(), ConvolutionComputation::ComputeDerived(), kaldi::nnet3::ExpectOneOrTwoTokens(), kaldi::nnet3::ExpectToken(), ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, ConvolutionComputation::height_out, ConvolutionComputation::ConvolutionStep::input_time_shift, ConvolutionComputation::num_filters_in, ConvolutionComputation::num_filters_out, ConvolutionComputation::num_images, ConvolutionComputation::num_t_in, ConvolutionComputation::num_t_out, ConvolutionComputation::ConvolutionStep::params_start_col, kaldi::ReadBasicType(), kaldi::ReadIntegerVector(), ConvolutionComputation::steps, ConvolutionComputation::temp_cols, and ConvolutionComputation::temp_rows.

Referenced by kaldi::nnet3::time_height_convolution::TestComputationIo().

315  void ConvolutionComputation::Read(std::istream &is, bool binary) {
316    ExpectOneOrTwoTokens(is, binary, "<ConvComputation>", "<NumFiltersInOut>");
317    ReadBasicType(is, binary, &num_filters_in);
318    ReadBasicType(is, binary, &num_filters_out);
319    ExpectToken(is, binary, "<HeightInOut>");
320    ReadBasicType(is, binary, &height_in);
321    ReadBasicType(is, binary, &height_out);
322    ExpectToken(is, binary, "<NumTInOut>");
323    ReadBasicType(is, binary, &num_t_in);
324    ReadBasicType(is, binary, &num_t_out);
325    ExpectToken(is, binary, "<NumImages>");
326    ReadBasicType(is, binary, &num_images);
327    ExpectToken(is, binary, "<TempRowsCols>");
328    ReadBasicType(is, binary, &temp_rows);
329    ReadBasicType(is, binary, &temp_cols);
330    int32 num_steps;
331    ExpectToken(is, binary, "<NumSteps>");
332    ReadBasicType(is, binary, &num_steps);
333    steps.resize(num_steps);
334    for (int32 s = 0; s < num_steps; s++) {
335      ConvolutionStep &step = steps[s];
336      ExpectToken(is, binary, "<TimeShift>");
337      ReadBasicType(is, binary, &step.input_time_shift);
338      ExpectToken(is, binary, "<ParamsStartCol>");
339      ReadBasicType(is, binary, &step.params_start_col);
340      ExpectToken(is, binary, "<HeightMap>");
341      ReadIntegerVector(is, binary, &step.height_map);
342    }
343    ExpectToken(is, binary, "</ConvComputation>");
344    ComputeDerived();
345    Check();
346  }
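
Read() begins by calling ExpectOneOrTwoTokens() with "<ConvComputation>" and "<NumFiltersInOut>". The sketch below shows one plausible text-mode reading of that call, assuming it accepts either both tokens in order or the second token alone. `ExpectOneOrTwoTokensSketch` is a hypothetical stand-in that handles whitespace-separated text only. It is not Kaldi's implementation and ignores binary mode.

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical text-mode stand-in: consumes "token1 token2" or just
// "token2" from the head of the stream, and throws on anything else.
void ExpectOneOrTwoTokensSketch(std::istream &is, const std::string &token1,
                                const std::string &token2) {
  std::string tok;
  if (!(is >> tok)) throw std::runtime_error("unexpected end of input");
  if (tok == token1) {
    // The optional opening token was present; the second must follow.
    if (!(is >> tok)) throw std::runtime_error("unexpected end of input");
  }
  if (tok != token2)
    throw std::runtime_error("expected " + token2 + ", got " + tok);
}
```

Under this assumption, a stream starting with `<ConvComputation> <NumFiltersInOut> 3` and one starting with `<NumFiltersInOut> 3` both leave the reader positioned at the value `3`.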
void Write (std::ostream &os, bool binary) const

Definition at line 283 of file convolution.cc.

References ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, ConvolutionComputation::height_out, ConvolutionComputation::ConvolutionStep::input_time_shift, ConvolutionComputation::num_filters_in, ConvolutionComputation::num_filters_out, ConvolutionComputation::num_images, ConvolutionComputation::num_t_in, ConvolutionComputation::num_t_out, ConvolutionComputation::ConvolutionStep::params_start_col, ConvolutionComputation::steps, ConvolutionComputation::temp_cols, ConvolutionComputation::temp_rows, kaldi::WriteBasicType(), kaldi::WriteIntegerVector(), and kaldi::WriteToken().

Referenced by kaldi::nnet3::time_height_convolution::TestComputationIo().

283  void ConvolutionComputation::Write(std::ostream &os, bool binary) const {
284    WriteToken(os, binary, "<ConvComputation>");
285    WriteToken(os, binary, "<NumFiltersInOut>");
286    WriteBasicType(os, binary, num_filters_in);
287    WriteBasicType(os, binary, num_filters_out);
288    WriteToken(os, binary, "<HeightInOut>");
289    WriteBasicType(os, binary, height_in);
290    WriteBasicType(os, binary, height_out);
291    WriteToken(os, binary, "<NumTInOut>");
292    WriteBasicType(os, binary, num_t_in);
293    WriteBasicType(os, binary, num_t_out);
294    WriteToken(os, binary, "<NumImages>");
295    WriteBasicType(os, binary, num_images);
296    WriteToken(os, binary, "<TempRowsCols>");
297    WriteBasicType(os, binary, temp_rows);
298    WriteBasicType(os, binary, temp_cols);
299    int32 num_steps = steps.size();
300    WriteToken(os, binary, "<NumSteps>");
301    WriteBasicType(os, binary, num_steps);
302    for (int32 s = 0; s < num_steps; s++) {
303      const ConvolutionStep &step = steps[s];
304      WriteToken(os, binary, "<TimeShift>");
305      WriteBasicType(os, binary, step.input_time_shift);
306      WriteToken(os, binary, "<ParamsStartCol>");
307      WriteBasicType(os, binary, step.params_start_col);
308      WriteToken(os, binary, "<HeightMap>");
309      WriteIntegerVector(os, binary, step.height_map);
310    }
311    WriteToken(os, binary, "</ConvComputation>");
312  }
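
Write() stores only input_time_shift, params_start_col and height_map for each step; the remaining ConvolutionStep members are recomputed by ComputeDerived() when the object is read back. The following sketch round-trips one step through a string stream using the same token names. `StepSketch`, `WriteStep`, `ReadStep` and `ExpectTokenSketch` are hypothetical, and the whitespace-separated layout with `[ ... ]` around the vector is an assumption made for illustration. It is not Kaldi's I/O code and does not cover binary mode.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical mirror of the three serialized ConvolutionStep members.
struct StepSketch {
  int32_t input_time_shift = 0;
  int32_t params_start_col = 0;
  std::vector<int32_t> height_map;
};

// Hypothetical helper: reads one token and throws if it is not `expected`.
void ExpectTokenSketch(std::istream &is, const std::string &expected) {
  std::string tok;
  if (!(is >> tok) || tok != expected)
    throw std::runtime_error("expected token " + expected);
}

// Writes the step in the same token order as the listing (text mode only).
void WriteStep(std::ostream &os, const StepSketch &step) {
  os << "<TimeShift> " << step.input_time_shift << ' '
     << "<ParamsStartCol> " << step.params_start_col << ' '
     << "<HeightMap> [ ";
  for (int32_t h : step.height_map) os << h << ' ';
  os << "] ";
}

// Reads back what WriteStep produced.
StepSketch ReadStep(std::istream &is) {
  StepSketch step;
  ExpectTokenSketch(is, "<TimeShift>");
  if (!(is >> step.input_time_shift))
    throw std::runtime_error("bad <TimeShift> value");
  ExpectTokenSketch(is, "<ParamsStartCol>");
  if (!(is >> step.params_start_col))
    throw std::runtime_error("bad <ParamsStartCol> value");
  ExpectTokenSketch(is, "<HeightMap>");
  ExpectTokenSketch(is, "[");
  std::string tok;
  while (is >> tok && tok != "]")
    step.height_map.push_back(static_cast<int32_t>(std::stoi(tok)));
  if (tok != "]") throw std::runtime_error("unterminated <HeightMap>");
  return step;
}
```

Writing a step to a `std::stringstream` with `WriteStep` and passing the same stream to `ReadStep` should return an equal step, which mirrors the kind of check the TestComputationIo() function listed under "Referenced by" is presumably there to perform.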

Member Data Documentation


The documentation for this struct was generated from the following files: