ConvolutionComputation Struct Reference

This struct represents the structure of a convolution computation.

#include <convolution.h>

Classes

struct  ConvolutionStep
 

Public Member Functions

void Write (std::ostream &os, bool binary) const
 
void Read (std::istream &is, bool binary)
 
void ComputeDerived ()
 
void Check () const
 

Public Attributes

int32 num_filters_in
 
int32 num_filters_out
 
int32 height_in
 
int32 height_out
 
int32 num_t_in
 
int32 num_t_out
 
int32 num_images
 
int32 temp_rows
 
int32 temp_cols
 
std::vector< ConvolutionStep > steps
 

Detailed Description

This struct represents the structure of a convolution computation.

This is used inside the PrecomputedIndexes object for the TimeHeightConvolutionComponent (it depends on the inputs and outputs as well as the layer).

CAUTION: this describes the problem after certain transformations, so height_in may not always be the "real" height of the input image (it may be a multiple thereof), and num_t_in may not always be the "real" number of distinct time-steps on the input of the computation (it may be a divisor thereof). ConvolutionComputation contains the info needed to actually perform the computation.

Definition at line 252 of file convolution.h.

Member Function Documentation

void Check () const

Definition at line 348 of file convolution.cc.

References ConvolutionComputation::ConvolutionStep::backward_columns, ConvolutionComputation::ConvolutionStep::columns, ConvolutionComputation::ConvolutionStep::columns_are_contiguous, CuArray< T >::CopyToVec(), CuArray< T >::Dim(), ConvolutionComputation::ConvolutionStep::first_column, ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, ConvolutionComputation::height_out, rnnlm::i, ConvolutionComputation::ConvolutionStep::input_time_shift, KALDI_ASSERT, ConvolutionComputation::num_filters_in, ConvolutionComputation::num_filters_out, ConvolutionComputation::num_images, ConvolutionComputation::num_t_in, ConvolutionComputation::num_t_out, ConvolutionComputation::ConvolutionStep::params_start_col, ConvolutionComputation::steps, ConvolutionComputation::temp_cols, and ConvolutionComputation::temp_rows.

Referenced by ConvolutionComputation::Read(), and kaldi::nnet3::time_height_convolution::UnPadModelHeight().

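348  void ConvolutionComputation::Check() const {
349    KALDI_ASSERT(num_filters_in > 0 && num_filters_out > 0 &&
350                 height_in > 0 && height_out > 0);
351    KALDI_ASSERT(num_t_in >= num_t_out &&
352                 num_t_out > 0 && num_images > 0);
353    KALDI_ASSERT((temp_rows == 0 && temp_cols == 0) ||
354                 (temp_rows <= num_t_out * num_images &&
355                  temp_cols > 0));
356    KALDI_ASSERT(temp_rows % num_images == 0);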
357    bool temp_mat_required = false;
358    int32 num_steps = steps.size();
359    int32 num_extra_input_times = num_t_in - num_t_out,
360        input_cols = num_filters_in * height_in,
361        smallest_time_shift = 1000,
362        largest_time_shift = 0;
363    // check 'steps'
364    for (int32 s = 0; s < num_steps; s++) {
365      const ConvolutionStep &step = steps[s];
366      KALDI_ASSERT(step.input_time_shift >= 0 &&
367                   step.input_time_shift <= num_extra_input_times);
368      if (step.input_time_shift < smallest_time_shift)
369        smallest_time_shift = step.input_time_shift;
370      if (step.input_time_shift > largest_time_shift)
371        largest_time_shift = step.input_time_shift;
372      KALDI_ASSERT(step.params_start_col >= 0 &&
373                   step.params_start_col % num_filters_in == 0);
374      if (s != 0) {
375        KALDI_ASSERT(step.input_time_shift != steps[s-1].input_time_shift);
376      }
377      std::vector<int32> columns;
378      step.columns.CopyToVec(&columns);
379      KALDI_ASSERT(step.first_column == columns[0]);
380      KALDI_ASSERT(step.columns.Dim() == step.height_map.size() * num_filters_in);
381      bool all_negative = true;
382      int32 temp_height = step.height_map.size();
383      bool contiguous = true;
384      for (int32 i = 0; i < temp_height; i++) {
385        int32 h = step.height_map[i];
386        KALDI_ASSERT(h >= -1 && h < height_in);
387        if (i > 0 && step.height_map[i-1] != h-1)
388          contiguous = false;
389        if (h == -1) {
390          contiguous = false;
391          for (int32 f = 0; f < num_filters_in; f++) {
392            KALDI_ASSERT(columns[i * num_filters_in + f] == -1);
393          }
394        } else {
395          all_negative = false;
396          for (int32 f = 0; f < num_filters_in; f++) {
397            KALDI_ASSERT(columns[i * num_filters_in + f] ==
398                         h * num_filters_in + f);
399          }
400        }
401      }
402      KALDI_ASSERT(contiguous == step.columns_are_contiguous);
403      if (!contiguous || columns.size() != input_cols) {
404        // we would need the temporary matrix. Make sure the
405        // temporary matrix is big enough.
406        temp_mat_required = true;
407        KALDI_ASSERT(columns.size() <= temp_cols);
408      }
409      KALDI_ASSERT(!all_negative);
410
411      std::vector<int32> columns_reconstructed(columns.size(), -1);
412      // reconstruct 'columns' from backward_columns as a way to
413      // check that backward_columns is correct.
414      // they are reverse-direction maps, but we may need
415      // step.backward_columns.size() > 1 because of elements
416      // in the input that are duplicated in the temp matrix.
417      for (size_t k = 0; k < step.backward_columns.size(); k++) {
418        std::vector<int32> backward_columns;
419        step.backward_columns[k].CopyToVec(&backward_columns);
420        KALDI_ASSERT(int32(backward_columns.size()) ==
421                     num_filters_in * height_in);
422        for (int32 l = 0; l < num_filters_in * height_in; l++) {
423          int32 c = backward_columns[l];
424          KALDI_ASSERT(c < int32(columns.size()));
425          if (c != -1) {
426            KALDI_ASSERT(columns_reconstructed[c] == -1);
427            columns_reconstructed[c] = l;
428          }
429        }
430      }
431      KALDI_ASSERT(columns_reconstructed == columns);
432    }
433    // check that all rows of the input were used.
434    KALDI_ASSERT(smallest_time_shift == 0 &&
435                 largest_time_shift == num_extra_input_times);
436
437    // check that the temp matrix is only allocated if it is required.
438    KALDI_ASSERT((temp_cols != 0) == temp_mat_required);
439  }
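
The backward_columns check near the end of Check() can be illustrated in isolation. The following is a minimal sketch and not Kaldi code: it uses plain std::vector in place of CuArray, and the names BuildBackwardColumns and BackwardColumnsMatch are hypothetical. BuildBackwardColumns shows one plausible way to build the reverse maps (the actual ReverseColumnMapping() may be implemented differently); BackwardColumnsMatch mirrors the reconstruction loop in the listing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (an assumption, not Kaldi's ReverseColumnMapping):
// columns[c] is the input column copied into temp column c, or -1 for padding.
// Returns reverse maps indexed by input column; several maps are needed when
// one input column is copied to more than one temp column.
std::vector<std::vector<int32_t>> BuildBackwardColumns(
    const std::vector<int32_t> &columns, int32_t input_dim) {
  // temp_cols_for[l] lists every temp column that copies input column l.
  std::vector<std::vector<int32_t>> temp_cols_for(input_dim);
  size_t max_copies = 0;
  for (size_t c = 0; c < columns.size(); c++) {
    if (columns[c] == -1) continue;  // padding column, no source
    assert(columns[c] >= 0 && columns[c] < input_dim);
    temp_cols_for[columns[c]].push_back(static_cast<int32_t>(c));
    max_copies = std::max(max_copies, temp_cols_for[columns[c]].size());
  }
  std::vector<std::vector<int32_t>> backward(
      max_copies, std::vector<int32_t>(input_dim, -1));
  for (int32_t l = 0; l < input_dim; l++)
    for (size_t k = 0; k < temp_cols_for[l].size(); k++)
      backward[k][l] = temp_cols_for[l][k];
  return backward;
}

// Mirrors the reconstruction loop in Check(): rebuilds the forward map from
// the reverse maps and compares it with 'columns'.
bool BackwardColumnsMatch(const std::vector<int32_t> &columns,
                          const std::vector<std::vector<int32_t>> &backward) {
  std::vector<int32_t> reconstructed(columns.size(), -1);
  for (const std::vector<int32_t> &map : backward) {
    for (size_t l = 0; l < map.size(); l++) {
      int32_t c = map[l];
      if (c == -1) continue;
      if (c < 0 || c >= static_cast<int32_t>(columns.size())) return false;
      if (reconstructed[c] != -1) return false;  // temp column claimed twice
      reconstructed[c] = static_cast<int32_t>(l);
    }
  }
  return reconstructed == columns;
}
```

For example, with columns = {-1, 0, 1, 0} and input_dim = 2, input column 0 is copied twice, so two reverse maps are needed, which is the situation the comment in Check() describes.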

void ComputeDerived ()

Definition at line 1270 of file convolution.cc.

References ConvolutionComputation::ConvolutionStep::backward_columns, ConvolutionComputation::ConvolutionStep::columns, ConvolutionComputation::ConvolutionStep::columns_are_contiguous, CuArray< T >::CopyFromVec(), ConvolutionComputation::ConvolutionStep::first_column, ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, rnnlm::i, KALDI_ASSERT, ConvolutionComputation::num_filters_in, kaldi::nnet3::time_height_convolution::ReverseColumnMapping(), ConvolutionComputation::steps, ConvolutionComputation::temp_cols, and kaldi::nnet3::time_height_convolution::VectorIsContiguous().

Referenced by ConvolutionComputation::Read(), and kaldi::nnet3::time_height_convolution::UnPadModelHeight().

1270  void ConvolutionComputation::ComputeDerived() {
1271    KALDI_ASSERT(!steps.empty());
1272
1273    int32 input_dim = height_in * num_filters_in;
1274
1275    int32 largest_required_temp_cols = 0;
1276    for (std::vector<ConvolutionStep>::iterator iter = steps.begin();
1277         iter != steps.end(); ++iter) {
1278      ConvolutionStep &step = *iter;
1279      std::vector<int32> columns;
1280      int32 temp_height = step.height_map.size();
1281      columns.resize(temp_height * num_filters_in);
1282      for (int32 h = 0; h < temp_height; h++) {
1283        KALDI_ASSERT(step.height_map[h] >= -1 && step.height_map[h] < height_in);
1284        if (step.height_map[h] != -1) {
1285          for (int32 f = 0; f < num_filters_in; f++)
1286            columns[h * num_filters_in + f] = step.height_map[h] * num_filters_in + f;
1287        } else {
1288          for (int32 f = 0; f < num_filters_in; f++)
1289            columns[h * num_filters_in + f] = -1;
1290        }
1291      }
1292      step.columns.CopyFromVec(columns);
1293      std::vector<std::vector<int32> > backward_columns;
1294      ReverseColumnMapping(columns, input_dim, &backward_columns);
1295      step.backward_columns.resize(backward_columns.size());
1296      for (size_t i = 0; i < backward_columns.size(); i++)
1297        step.backward_columns[i].CopyFromVec(backward_columns[i]);
1298
1299      // we could replace height_map with columns in the line below and get the
1300      // same answer, but it would be a little slower.
1301      step.columns_are_contiguous =
1302          (step.height_map[0] != -1 && VectorIsContiguous(step.height_map));
1303      step.first_column = columns[0];
1304
1305
1306      bool need_temp_matrix =
1307          !(step.columns_are_contiguous && step.height_map[0] == 0 &&
1308            step.height_map.size() == height_in);
1309      if (need_temp_matrix) {
1310        largest_required_temp_cols = std::max<int32>(
1311            largest_required_temp_cols, static_cast<int32>(columns.size()));
1312      }
1313    }
1314    KALDI_ASSERT(temp_cols == largest_required_temp_cols);
1315  }

void Read (std::istream &is, bool binary)

Definition at line 314 of file convolution.cc.

References ConvolutionComputation::Check(), ConvolutionComputation::ComputeDerived(), kaldi::nnet3::ExpectOneOrTwoTokens(), kaldi::nnet3::ExpectToken(), ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, ConvolutionComputation::height_out, ConvolutionComputation::ConvolutionStep::input_time_shift, ConvolutionComputation::num_filters_in, ConvolutionComputation::num_filters_out, ConvolutionComputation::num_images, ConvolutionComputation::num_t_in, ConvolutionComputation::num_t_out, ConvolutionComputation::ConvolutionStep::params_start_col, kaldi::ReadBasicType(), kaldi::ReadIntegerVector(), ConvolutionComputation::steps, ConvolutionComputation::temp_cols, and ConvolutionComputation::temp_rows.

Referenced by kaldi::nnet3::time_height_convolution::TestComputationIo().

314  void ConvolutionComputation::Read(std::istream &is, bool binary) {
315    ExpectOneOrTwoTokens(is, binary, "<ConvComputation>", "<NumFiltersInOut>");
316    ReadBasicType(is, binary, &num_filters_in);
317    ReadBasicType(is, binary, &num_filters_out);
318    ExpectToken(is, binary, "<HeightInOut>");
319    ReadBasicType(is, binary, &height_in);
320    ReadBasicType(is, binary, &height_out);
321    ExpectToken(is, binary, "<NumTInOut>");
322    ReadBasicType(is, binary, &num_t_in);
323    ReadBasicType(is, binary, &num_t_out);
324    ExpectToken(is, binary, "<NumImages>");
325    ReadBasicType(is, binary, &num_images);
326    ExpectToken(is, binary, "<TempRowsCols>");
327    ReadBasicType(is, binary, &temp_rows);
328    ReadBasicType(is, binary, &temp_cols);
329    int32 num_steps;
330    ExpectToken(is, binary, "<NumSteps>");
331    ReadBasicType(is, binary, &num_steps);
332    steps.resize(num_steps);
333    for (int32 s = 0; s < num_steps; s++) {
334      ConvolutionStep &step = steps[s];
335      ExpectToken(is, binary, "<TimeShift>");
336      ReadBasicType(is, binary, &step.input_time_shift);
337      ExpectToken(is, binary, "<ParamsStartCol>");
338      ReadBasicType(is, binary, &step.params_start_col);
339      ExpectToken(is, binary, "<HeightMap>");
340      ReadIntegerVector(is, binary, &step.height_map);
341    }
342    ExpectToken(is, binary, "</ConvComputation>");
343    ComputeDerived();
344    Check();
345  }

void Write (std::ostream &os, bool binary) const

Definition at line 282 of file convolution.cc.

References ConvolutionComputation::height_in, ConvolutionComputation::ConvolutionStep::height_map, ConvolutionComputation::height_out, ConvolutionComputation::ConvolutionStep::input_time_shift, ConvolutionComputation::num_filters_in, ConvolutionComputation::num_filters_out, ConvolutionComputation::num_images, ConvolutionComputation::num_t_in, ConvolutionComputation::num_t_out, ConvolutionComputation::ConvolutionStep::params_start_col, ConvolutionComputation::steps, ConvolutionComputation::temp_cols, ConvolutionComputation::temp_rows, kaldi::WriteBasicType(), kaldi::WriteIntegerVector(), and kaldi::WriteToken().

Referenced by kaldi::nnet3::time_height_convolution::TestComputationIo().

282  void ConvolutionComputation::Write(std::ostream &os, bool binary) const {
283    WriteToken(os, binary, "<ConvComputation>");
284    WriteToken(os, binary, "<NumFiltersInOut>");
285    WriteBasicType(os, binary, num_filters_in);
286    WriteBasicType(os, binary, num_filters_out);
287    WriteToken(os, binary, "<HeightInOut>");
288    WriteBasicType(os, binary, height_in);
289    WriteBasicType(os, binary, height_out);
290    WriteToken(os, binary, "<NumTInOut>");
291    WriteBasicType(os, binary, num_t_in);
292    WriteBasicType(os, binary, num_t_out);
293    WriteToken(os, binary, "<NumImages>");
294    WriteBasicType(os, binary, num_images);
295    WriteToken(os, binary, "<TempRowsCols>");
296    WriteBasicType(os, binary, temp_rows);
297    WriteBasicType(os, binary, temp_cols);
298    int32 num_steps = steps.size();
299    WriteToken(os, binary, "<NumSteps>");
300    WriteBasicType(os, binary, num_steps);
301    for (int32 s = 0; s < num_steps; s++) {
302      const ConvolutionStep &step = steps[s];
303      WriteToken(os, binary, "<TimeShift>");
304      WriteBasicType(os, binary, step.input_time_shift);
305      WriteToken(os, binary, "<ParamsStartCol>");
306      WriteBasicType(os, binary, step.params_start_col);
307      WriteToken(os, binary, "<HeightMap>");
308      WriteIntegerVector(os, binary, step.height_map);
309    }
310    WriteToken(os, binary, "</ConvComputation>");
311  }
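
Write() and Read() are symmetric: each field is preceded by a token, and the reader expects the same tokens in the same order. The sketch below illustrates that pattern for the per-step fields only, using nothing but standard streams. It is an assumption-laden illustration and not Kaldi's I/O layer: StepSketch, ExpectTokenSketch, WriteStepSketch and ReadStepSketch are hypothetical names, only a text mode is modelled, and the size-prefixed vector layout is chosen here for simplicity and is not claimed to match the format produced by WriteIntegerVector().

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the serialized fields of one ConvolutionStep.
struct StepSketch {
  int32_t input_time_shift = 0;
  int32_t params_start_col = 0;
  std::vector<int32_t> height_map;
};

// Reads one whitespace-delimited token and throws if it is not the expected
// one (text mode only; this is not Kaldi's ExpectToken).
void ExpectTokenSketch(std::istream &is, const std::string &expected) {
  std::string token;
  if (!(is >> token) || token != expected)
    throw std::runtime_error("expected token " + expected);
}

// Writes each field after its token, in the same order as Write() above.
void WriteStepSketch(std::ostream &os, const StepSketch &step) {
  os << "<TimeShift> " << step.input_time_shift << ' '
     << "<ParamsStartCol> " << step.params_start_col << ' '
     << "<HeightMap> " << step.height_map.size();
  for (int32_t h : step.height_map) os << ' ' << h;
  os << ' ';
}

// Reads the fields back in the same order, checking every token.
StepSketch ReadStepSketch(std::istream &is) {
  StepSketch step;
  ExpectTokenSketch(is, "<TimeShift>");
  is >> step.input_time_shift;
  ExpectTokenSketch(is, "<ParamsStartCol>");
  is >> step.params_start_col;
  ExpectTokenSketch(is, "<HeightMap>");
  size_t size = 0;
  is >> size;
  if (!is) throw std::runtime_error("malformed step");
  step.height_map.resize(size);
  for (int32_t &h : step.height_map) is >> h;
  if (!is) throw std::runtime_error("malformed step");
  return step;
}
```

A round trip through a std::stringstream (write a step, then read it back) should return the same three fields. Note that the real Read() additionally calls ComputeDerived() and Check(), so derived members such as columns and backward_columns are recomputed rather than stored.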

Member Data Documentation


The documentation for this struct was generated from the following files: