ChunkInfo Class Reference

ChunkInfo is a class whose purpose is to describe the structure of matrices holding features.

#include <nnet-component.h>

Public Member Functions

 ChunkInfo ()
 
 ChunkInfo (int32 feat_dim, int32 num_chunks, int32 first_offset, int32 last_offset)
 
 ChunkInfo (int32 feat_dim, int32 num_chunks, const std::vector< int32 > offsets)
 
int32 GetIndex (int32 offset) const
 
int32 GetOffset (int32 index) const
 
void MakeOffsetsContiguous ()
 
int32 ChunkSize () const
 
int32 NumChunks () const
 
int32 NumRows () const
 Returns the number of rows that we expect the feature matrix to have.
 
int32 NumCols () const
 Returns the number of columns that we expect the feature matrix to have.
 
void CheckSize (const CuMatrixBase< BaseFloat > &mat) const
 Checks that the matrix has the size we expect, and dies if not.
 
void Check () const
 Checks that the data in the ChunkInfo is valid, and dies if not.
 

Private Attributes

int32 feat_dim_
 
int32 num_chunks_
 
int32 first_offset_
 
int32 last_offset_
 
std::vector< int32 > offsets_
 

Detailed Description

ChunkInfo is a class whose purpose is to describe the structure of matrices holding features.

This is mostly useful at training time. The main reason for this class is to support efficient training of networks whose splicing components splice in a non-contiguous way, e.g. frames -5, 0 and 5. We also have in mind future extensibility to convnets, which might have similar issues. The class describes the structure of a minibatch of features, or of a single contiguous block of features.

Examples are as follows; offsets is empty unless mentioned.

When decoding, at input to the network:
  feat_dim = 13, num_chunks = 1, first_offset = 0, last_offset = 691
and in the middle of the network (assuming splicing is +-7):
  feat_dim = 1024, num_chunks = 1, first_offset = 7, last_offset = 684

When training, at input to the network:
  feat_dim = 13, num_chunks = 512, first_offset = 0, last_offset = 14
and in the middle of the network:
  feat_dim = 1024, num_chunks = 512, first_offset = 7, last_offset = 7

The only situation where offsets would be nonempty is when we splice with gaps. For example, suppose that at the network input we splice +-2 frames (contiguous) and somewhere in the middle we splice frames {-5, 0, 5}. Then we would have the following while training.

At input to the network:
  feat_dim = 13, num_chunks = 512, first_offset = 0, last_offset = 14
After the first hidden layer:
  feat_dim = 1024, num_chunks = 512, first_offset = 2, last_offset = 12, offsets = {2, 10, 12}
At the output of the last hidden layer (after the {-5, 0, 5} splice):
  feat_dim = 1024, num_chunks = 512, first_offset = 7, last_offset = 7

The decoding setup would still look fairly normal, so we do not give an example.
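The row counts implied by these examples can be checked with a small standalone sketch. `ChunkLayout` below is a hypothetical stand-in, not Kaldi's ChunkInfo. Its formulas (one row per frame per chunk, one column per feature dimension) are assumptions that agree with the examples and with ChunkSize() being NumRows() / num_chunks_; the bodies of ChunkInfo::NumRows() and ChunkInfo::NumCols() are not reproduced on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for illustration only; this is not Kaldi's ChunkInfo.
struct ChunkLayout {
  int32_t feat_dim;
  int32_t num_chunks;
  int32_t first_offset;
  int32_t last_offset;
  std::vector<int32_t> offsets;  // empty means the frames are contiguous

  // Frames held per chunk. Assumption: an explicit offsets list takes
  // precedence; otherwise every frame in [first_offset, last_offset] is kept.
  int32_t ChunkSize() const {
    assert(last_offset >= first_offset);
    return offsets.empty() ? last_offset - first_offset + 1
                           : static_cast<int32_t>(offsets.size());
  }

  // Assumed row count: one row per frame per chunk.
  int32_t NumRows() const { return num_chunks * ChunkSize(); }

  // Assumed column count: one column per feature dimension.
  int32_t NumCols() const { return feat_dim; }
};
```

Under these assumptions, the decoding input (offsets 0 to 691, one chunk) gives 692 rows of 13 columns, the training input (offsets 0 to 14, 512 chunks) gives 7680 rows, and the gapped layout with three listed offsets gives 1536 rows.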

Definition at line 72 of file nnet-component.h.

Constructor & Destructor Documentation

◆ ChunkInfo() [1/3]

ChunkInfo ( )
inline

Definition at line 74 of file nnet-component.h.

◆ ChunkInfo() [2/3]

ChunkInfo ( int32  feat_dim,
int32  num_chunks,
int32  first_offset,
int32  last_offset 
)
inline

Definition at line 79 of file nnet-component.h.

References ChunkInfo::Check().

: feat_dim_(feat_dim), num_chunks_(num_chunks),
  first_offset_(first_offset), last_offset_(last_offset),
  offsets_() { Check(); }

◆ ChunkInfo() [3/3]

ChunkInfo ( int32  feat_dim,
int32  num_chunks,
const std::vector< int32 > offsets 
)
inline

Definition at line 85 of file nnet-component.h.

References ChunkInfo::Check(), ChunkInfo::first_offset_, ChunkInfo::GetIndex(), ChunkInfo::GetOffset(), ChunkInfo::last_offset_, and ChunkInfo::offsets_.

: feat_dim_(feat_dim), num_chunks_(num_chunks),
  first_offset_(offsets.front()), last_offset_(offsets.back()),
  offsets_(offsets) {
    if (last_offset_ - first_offset_ + 1 == offsets_.size())
      offsets_.clear();
    Check();
  }
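
This constructor drops the offsets list when it describes a gap-free range, so that a contiguous chunk is always represented by its two endpoints alone. A minimal standalone sketch of that test follows. `NormalizeOffsets` is a hypothetical free function, not part of Kaldi, and it assumes the offsets are non-empty, sorted and distinct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of Kaldi: returns an empty vector when the
// sorted, distinct offsets cover every frame from front() to back(), and
// returns the offsets unchanged when there are gaps.
std::vector<int32_t> NormalizeOffsets(const std::vector<int32_t> &offsets) {
  assert(!offsets.empty());  // front() and back() require a non-empty vector
  const std::size_t span =
      static_cast<std::size_t>(offsets.back() - offsets.front() + 1);
  if (span == offsets.size())
    return std::vector<int32_t>();  // contiguous: the endpoints suffice
  return offsets;                   // gaps present: keep the explicit list
}
```

For example, {2, 3, 4} normalizes to an empty list, while {2, 7, 12} is kept as is. The explicit cast avoids comparing a signed difference with an unsigned size.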

Member Function Documentation

◆ Check()

void Check ( ) const

Checks that the data in the ChunkInfo is valid, and dies if not.

Definition at line 2543 of file nnet-component.cc.

References KALDI_ASSERT.

Referenced by SpliceComponent::Backprop(), SpliceMaxComponent::Backprop(), ChunkInfo::ChunkInfo(), ChunkInfo::MakeOffsetsContiguous(), ChunkInfo::NumCols(), SpliceComponent::Propagate(), and SpliceMaxComponent::Propagate().

{
  // Checking sanity of the ChunkInfo object
  KALDI_ASSERT((feat_dim_ > 0) && (num_chunks_ > 0));

  if (! offsets_.empty()) {
    KALDI_ASSERT((first_offset_ == offsets_.front()) &&
                 (last_offset_ == offsets_.back()));
  } else {
    // asserting the chunk is not contiguous, as offsets is not empty
    KALDI_ASSERT ( last_offset_ - first_offset_ + 1 > offsets_.size() );
  }
  KALDI_ASSERT(NumRows() % num_chunks_ == 0);

}

◆ CheckSize()

◆ ChunkSize()

int32 ChunkSize ( ) const
inline

Definition at line 115 of file nnet-component.h.

References ChunkInfo::num_chunks_, and ChunkInfo::NumRows().

Referenced by SpliceComponent::Backprop(), SpliceMaxComponent::Backprop(), SpliceComponent::Propagate(), and SpliceMaxComponent::Propagate().

{ return NumRows() / num_chunks_; }

◆ GetIndex()

int32 GetIndex ( int32  offset) const

Definition at line 2519 of file nnet-component.cc.

References KALDI_ASSERT.

Referenced by SpliceComponent::Backprop(), SpliceMaxComponent::Backprop(), ChunkInfo::ChunkInfo(), SpliceComponent::Propagate(), and SpliceMaxComponent::Propagate().

{
  if (offsets_.empty()) {  // if data is contiguous
    KALDI_ASSERT((offset <= last_offset_) && (offset >= first_offset_));
    return offset - first_offset_;
  } else {
    std::vector<int32>::const_iterator iter =
        std::lower_bound(offsets_.begin(), offsets_.end(), offset);
    // make sure offset is present in the vector
    KALDI_ASSERT(iter != offsets_.end() && *iter == offset);
    return static_cast<int32>(iter - offsets_.begin());
  }
}

◆ GetOffset()

int32 GetOffset ( int32  index) const

Definition at line 2532 of file nnet-component.cc.

References KALDI_ASSERT.

Referenced by SpliceComponent::Backprop(), SpliceMaxComponent::Backprop(), ChunkInfo::ChunkInfo(), SpliceComponent::Propagate(), and SpliceMaxComponent::Propagate().

{
  if (offsets_.empty()) {  // if data is contiguous
    int32 offset = index + first_offset_;  // just offset by the first_offset_
    KALDI_ASSERT((offset <= last_offset_) && (offset >= first_offset_));
    return offset;
  } else {
    KALDI_ASSERT((index >= 0) && (index < offsets_.size()));
    return offsets_[index];
  }
}
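
GetIndex() and GetOffset() are inverses of each other: one maps a frame offset to a position within a chunk, the other maps that position back to the offset. The sketch below reimplements both as free functions over a plain struct so the mapping can be tried without building Kaldi. The names `OffsetMap`, `IndexOf` and `OffsetAt` are hypothetical and do not exist in Kaldi, and the sketch assumes that a non-empty offsets list is sorted in increasing order, as std::lower_bound requires.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the offset-related fields of ChunkInfo.
struct OffsetMap {
  int32_t first_offset;
  int32_t last_offset;
  std::vector<int32_t> offsets;  // empty means contiguous; otherwise sorted
};

// Maps a frame offset to its position within a chunk.
int32_t IndexOf(const OffsetMap &m, int32_t offset) {
  if (m.offsets.empty()) {  // contiguous: position is distance from the start
    assert(offset >= m.first_offset && offset <= m.last_offset);
    return offset - m.first_offset;
  }
  // Non-contiguous: binary search in the sorted list of offsets.
  std::vector<int32_t>::const_iterator iter =
      std::lower_bound(m.offsets.begin(), m.offsets.end(), offset);
  assert(iter != m.offsets.end() && *iter == offset);  // offset must be listed
  return static_cast<int32_t>(iter - m.offsets.begin());
}

// Maps a position within a chunk back to its frame offset.
int32_t OffsetAt(const OffsetMap &m, int32_t index) {
  if (m.offsets.empty()) {
    const int32_t offset = index + m.first_offset;
    assert(offset >= m.first_offset && offset <= m.last_offset);
    return offset;
  }
  assert(index >= 0 && static_cast<std::size_t>(index) < m.offsets.size());
  return m.offsets[static_cast<std::size_t>(index)];
}
```

With offsets {2, 10, 12}, IndexOf(m, 10) is 1 and OffsetAt(m, 1) is 10. For a contiguous map covering offsets 7 to 684, IndexOf(m, 7) is 0 and OffsetAt(m, 677) is 684.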

◆ MakeOffsetsContiguous()

void MakeOffsetsContiguous ( )
inline

Definition at line 111 of file nnet-component.h.

References ChunkInfo::Check(), and ChunkInfo::offsets_.

{ offsets_.clear(); Check(); }

◆ NumChunks()

◆ NumCols()

◆ NumRows()

int32 NumRows ( ) const
inline

Member Data Documentation

◆ feat_dim_

int32 feat_dim_
private

Definition at line 135 of file nnet-component.h.

Referenced by ChunkInfo::NumCols().

◆ first_offset_

int32 first_offset_
private

Definition at line 137 of file nnet-component.h.

Referenced by ChunkInfo::ChunkInfo(), and ChunkInfo::NumRows().

◆ last_offset_

int32 last_offset_
private

Definition at line 140 of file nnet-component.h.

Referenced by ChunkInfo::ChunkInfo(), and ChunkInfo::NumRows().

◆ num_chunks_

int32 num_chunks_
private

◆ offsets_

std::vector<int32> offsets_
private

The documentation for this class was generated from the following files: