MatrixRandomizer Class Reference

Shuffles rows of a matrix according to the indices in the mask.

#include <nnet-randomizer.h>

Public Member Functions

 MatrixRandomizer ()
 
 MatrixRandomizer (const NnetDataRandomizerOptions &conf)
 
void Init (const NnetDataRandomizerOptions &conf)
 Set the randomizer parameters (size).

void AddData (const CuMatrixBase< BaseFloat > &m)
 Add data to the randomization buffer.

bool IsFull ()
 Returns true when the buffer is full.

int32 NumFrames ()
 Number of frames stored inside the randomizer.

void Randomize (const std::vector< int32 > &mask)
 Randomize the matrix row order using the mask.

bool Done ()
 Returns true if there is not enough data left for another mini-batch (after the current one).

void Next ()
 Sets the cursor to the next mini-batch.

const CuMatrixBase< BaseFloat > & Value ()
 Returns a matrix window holding the next mini-batch.
 

Private Attributes

CuMatrix< BaseFloat > data_

CuMatrix< BaseFloat > data_aux_

CuMatrix< BaseFloat > minibatch_

int32 data_begin_
 A cursor pointing to the row where the next mini-batch begins.

int32 data_end_
 A cursor pointing to the row after the end of the data.
 
NnetDataRandomizerOptions conf_
 

Detailed Description

Shuffles rows of a matrix according to the indices in the mask.

Definition at line 87 of file nnet-randomizer.h.

Constructor & Destructor Documentation

MatrixRandomizer ( )
inline

Definition at line 89 of file nnet-randomizer.h.

89   MatrixRandomizer() :
90     data_begin_(0),
91     data_end_(0)
92   { }

MatrixRandomizer ( const NnetDataRandomizerOptions & conf)
inline explicit

Definition at line 94 of file nnet-randomizer.h.

References MatrixRandomizer::Init().

94   explicit MatrixRandomizer(const NnetDataRandomizerOptions &conf) :
95     data_begin_(0),
96     data_end_(0)
97   {
98     Init(conf);
99   }

Member Function Documentation

void AddData ( const CuMatrixBase< BaseFloat > &  m)

Add data to randomization buffer.

Definition at line 47 of file nnet-randomizer.cc.

References MatrixRandomizer::conf_, MatrixRandomizer::data_, MatrixRandomizer::data_begin_, MatrixRandomizer::data_end_, KALDI_ASSERT, CuMatrixBase< Real >::NumCols(), CuMatrixBase< Real >::NumRows(), NnetDataRandomizerOptions::randomizer_size, CuMatrix< Real >::Resize(), and CuMatrixBase< Real >::RowRange().

Referenced by main(), and UnitTestMatrixRandomizer().

47 void MatrixRandomizer::AddData(const CuMatrixBase<BaseFloat> &m) {
48   // pre-allocate before 1st use
49   if (data_.NumCols() == 0) {
51   }
52   // optionally put previous left-over to front
53   if (data_begin_ > 0) {
54     KALDI_ASSERT(data_begin_ <= data_end_);  // sanity check,
55     int32 leftover = data_end_ - data_begin_;
56     KALDI_ASSERT(leftover < data_begin_);  // no overlap,
57     if (leftover > 0) {
58       data_.RowRange(0, leftover).CopyFromMat(data_.RowRange(data_begin_, leftover));
59     }
60     data_begin_ = 0;
61     data_end_ = leftover;
62     // set zero to the rest of the buffer,
63     data_.RowRange(leftover, data_.NumRows() - leftover).SetZero();
64   }
65   // extend the buffer if necessary,
66   if (data_.NumRows() < data_end_ + m.NumRows()) {
67     CuMatrix<BaseFloat> data_aux(data_);
68     // Add extra 1000 rows, so we don't reallocate soon:
69     data_.Resize(data_end_ + m.NumRows() + 1000, data_.NumCols());
70     data_.RowRange(0, data_aux.NumRows()).CopyFromMat(data_aux);
71   }
72   // copy the data
73   data_.RowRange(data_end_, m.NumRows()).CopyFromMat(m);
74   data_end_ += m.NumRows();
75 }
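
The three steps of AddData() (move the left-over rows to the front, grow the buffer if needed, append the new rows) can be sketched without any Kaldi dependency. The block below is an illustrative model only: ToyBuffer and ToyAddData are hypothetical names, each matrix row is reduced to a single float, and the pre-allocation step at the top of the listing is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the randomization buffer (hypothetical, not a Kaldi type).
struct ToyBuffer {
  std::vector<float> rows;  // plays the role of data_, one float per row
  std::size_t begin = 0;    // plays the role of data_begin_
  std::size_t end = 0;      // plays the role of data_end_
};

// Mirrors the listing: compact the left-over, grow if needed, append.
inline void ToyAddData(ToyBuffer &buf, const std::vector<float> &m) {
  // 1. Move rows not yet consumed by a mini-batch to the front.
  if (buf.begin > 0) {
    assert(buf.begin <= buf.end);
    const std::size_t leftover = buf.end - buf.begin;
    assert(leftover < buf.begin);  // source and destination do not overlap
    for (std::size_t i = 0; i < leftover; ++i) {
      buf.rows[i] = buf.rows[buf.begin + i];
    }
    // Zero the remainder of the buffer.
    for (std::size_t i = leftover; i < buf.rows.size(); ++i) {
      buf.rows[i] = 0.0f;
    }
    buf.begin = 0;
    buf.end = leftover;
  }
  // 2. Grow when the incoming rows do not fit; 1000 rows of slack
  //    postpone the next reallocation, as in the listing.
  if (buf.rows.size() < buf.end + m.size()) {
    buf.rows.resize(buf.end + m.size() + 1000);
  }
  // 3. Append the new rows after the current end of data.
  for (std::size_t i = 0; i < m.size(); ++i) {
    buf.rows[buf.end + i] = m[i];
  }
  buf.end += m.size();
}
```

In this model, a call made after four of five rows were consumed leaves the single left-over row at index 0 and the new rows directly behind it.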
bool Done ( )
inline

Returns true if there is not enough data left for another mini-batch (after the current one).

Definition at line 123 of file nnet-randomizer.h.

References MatrixRandomizer::conf_, MatrixRandomizer::data_begin_, MatrixRandomizer::data_end_, and NnetDataRandomizerOptions::minibatch_size.

Referenced by main(), and UnitTestMatrixRandomizer().

123  {
125  }

void Init ( const NnetDataRandomizerOptions & conf)
inline

Set the randomizer parameters (size).

Definition at line 102 of file nnet-randomizer.h.

References MatrixRandomizer::conf_.

Referenced by MatrixRandomizer::MatrixRandomizer(), and UnitTestMatrixRandomizer().

102   void Init(const NnetDataRandomizerOptions &conf) {
103     conf_ = conf;
104   }
bool IsFull ( )
inline

Returns true when the buffer is full.

Definition at line 110 of file nnet-randomizer.h.

References MatrixRandomizer::conf_, MatrixRandomizer::data_begin_, MatrixRandomizer::data_end_, and NnetDataRandomizerOptions::randomizer_size.

Referenced by main(), and UnitTestMatrixRandomizer().

110   bool IsFull() {
111     return ((data_begin_ == 0) && (data_end_ > conf_.randomizer_size ));
112   }
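
Two details of the predicate are easy to miss: the comparison is strict, and the result is false whenever the mini-batch cursor has moved away from row 0. A minimal sketch, with the members passed in as plain arguments (ToyIsFull is a hypothetical name, not part of Kaldi):

```cpp
#include <cassert>

// Same predicate as the listing, with the members passed in explicitly.
inline bool ToyIsFull(int data_begin, int data_end, int randomizer_size) {
  return (data_begin == 0) && (data_end > randomizer_size);
}
```

With randomizer_size = 100, a buffer holding exactly 100 frames is not yet reported as full, while 101 frames are; a buffer whose cursor has advanced is never reported as full, however many frames it holds.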
void Next ( )

Sets cursor to next mini-batch.

Definition at line 96 of file nnet-randomizer.cc.

References MatrixRandomizer::conf_, MatrixRandomizer::data_begin_, and NnetDataRandomizerOptions::minibatch_size.

Referenced by main(), and UnitTestMatrixRandomizer().

96  {
98 }
int32 NumFrames ( )
inline

Number of frames stored inside the Randomizer.

Definition at line 115 of file nnet-randomizer.h.

References MatrixRandomizer::data_end_.

Referenced by main(), and UnitTestMatrixRandomizer().

115   int32 NumFrames() {
116     return data_end_;
117   }
void Randomize ( const std::vector< int32 > &  mask)

Randomize matrix row-order using mask.

Definition at line 77 of file nnet-randomizer.cc.

References CuArray< T >::CopyFromVec(), MatrixRandomizer::data_, MatrixRandomizer::data_aux_, MatrixRandomizer::data_begin_, MatrixRandomizer::data_end_, KALDI_ASSERT, and kaldi::cu::Randomize().

Referenced by main(), and UnitTestMatrixRandomizer().

77 void MatrixRandomizer::Randomize(const std::vector<int32> &mask) {
80   KALDI_ASSERT(data_end_ == mask.size());
81   // Copy to auxiliary buffer for unshuffled data
82   data_aux_ = data_;
83   // Put the mask to GPU
84   CuArray<int32> mask_in_gpu(mask.size());
85   mask_in_gpu.CopyFromVec(mask);
86   // Randomize the data, mask is used to index rows in source matrix:
87   // (Here the vector 'mask_in_gpu' is typically shorter than number
88   // of rows in 'data_aux_', because the buffer 'data_aux_'
89   // is larger than capacity 'randomizer_size'.
90   // The extra rows in 'data_aux_' do not contain speech frames and
91   // are not copied from 'data_aux_', the extra rows in 'data_' are
92   // unchanged by cu::Randomize.)
93   cu::Randomize(data_aux_, mask_in_gpu, &data_);
94 }
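
The comment in the listing describes a row gather: row i of the target takes the source row named by mask[i], and target rows beyond the length of the mask are left alone. The sketch below models that behaviour on the CPU with standard containers. ToyMatrix and ToyRandomize are hypothetical names, and the code is an illustration of the described semantics, not the cu::Randomize implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using ToyMatrix = std::vector<std::vector<float>>;  // hypothetical stand-in

// tgt[i] = src[mask[i]] for i < mask.size(); later rows of tgt are untouched.
inline void ToyRandomize(const ToyMatrix &src, const std::vector<int> &mask,
                         ToyMatrix *tgt) {
  assert(mask.size() <= src.size());
  assert(mask.size() <= tgt->size());
  for (std::size_t i = 0; i < mask.size(); ++i) {
    const std::size_t from = static_cast<std::size_t>(mask[i]);
    assert(from < src.size());  // each index must name a valid source row
    (*tgt)[i] = src[from];
  }
}
```

Reading from a separate source copy, as the listing does with data_aux_, matters here: gathering in place would overwrite rows that later mask entries still need to read.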
const CuMatrixBase< BaseFloat > & Value ( )

Returns matrix-window with next mini-batch.

Definition at line 100 of file nnet-randomizer.cc.

References MatrixRandomizer::conf_, CuMatrixBase< Real >::CopyFromMat(), MatrixRandomizer::data_, MatrixRandomizer::data_begin_, MatrixRandomizer::data_end_, KALDI_ASSERT, kaldi::kUndefined, MatrixRandomizer::minibatch_, NnetDataRandomizerOptions::minibatch_size, CuMatrixBase< Real >::NumCols(), CuMatrix< Real >::Resize(), and CuMatrixBase< Real >::RowRange().

Referenced by main(), and UnitTestMatrixRandomizer().

100  {
101  // make sure we have data for next minibatch,
103  // prepare the mini-batch buffer,
106  return minibatch_;
107 }
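
The listings for Done(), Next() and Value() on this page are incomplete, so the cursor arithmetic has to be inferred from the brief descriptions and the References lists (data_begin_, data_end_, minibatch_size). The following sketch is written under that assumption and may differ from the actual Kaldi code. ToyCursor is a hypothetical type, and each row is again reduced to a single float.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the mini-batch cursor; not Kaldi code.
struct ToyCursor {
  std::vector<float> rows;  // shuffled buffer, one float per row
  int begin = 0;            // plays the role of data_begin_
  int end = 0;              // plays the role of data_end_
  int minibatch_size = 1;   // plays the role of conf_.minibatch_size

  // True when fewer than minibatch_size rows remain.
  bool Done() const { return end - begin < minibatch_size; }

  // Advance the cursor by one mini-batch.
  void Next() { begin += minibatch_size; }

  // Copy of the window [begin, begin + minibatch_size).
  std::vector<float> Value() const {
    assert(end - begin >= minibatch_size);
    return std::vector<float>(rows.begin() + begin,
                              rows.begin() + begin + minibatch_size);
  }
};
```

A caller would typically drain the buffer with a loop of the form `for (; !c.Done(); c.Next()) { use(c.Value()); }`. Rows that do not fill a whole mini-batch stay in the buffer, which is the left-over that AddData() moves to the front on its next call.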

Member Data Documentation

CuMatrix<BaseFloat> data_aux_
private

Definition at line 135 of file nnet-randomizer.h.

Referenced by MatrixRandomizer::Randomize().

int32 data_begin_
private

A cursor pointing to the row where the next mini-batch begins.

Definition at line 139 of file nnet-randomizer.h.

Referenced by MatrixRandomizer::AddData(), MatrixRandomizer::Done(), MatrixRandomizer::IsFull(), MatrixRandomizer::Next(), MatrixRandomizer::Randomize(), and MatrixRandomizer::Value().

int32 data_end_
private

A cursor pointing to the row after the end of the data.

CuMatrix<BaseFloat> minibatch_
private

Definition at line 136 of file nnet-randomizer.h.

Referenced by MatrixRandomizer::Value().


The documentation for this class was generated from the following files:

nnet-randomizer.h
nnet-randomizer.cc