DecodableNnetLoopedOnlineBase Class Reference

#include <decodable-online-looped.h>


Public Member Functions

 DecodableNnetLoopedOnlineBase (const DecodableNnetSimpleLoopedInfo &info, OnlineFeatureInterface *input_features, OnlineFeatureInterface *ivector_features)
 
virtual bool IsLastFrame (int32 subsampled_frame) const
 Returns true if this is the last frame.
 
virtual int32 NumFramesReady () const
 Returns the number of frames currently available for this decodable object.
 
int32 FrameSubsamplingFactor () const
 
void SetFrameOffset (int32 frame_offset)
 Sets the frame offset value.
 
int32 GetFrameOffset () const
 Returns the frame offset value.
 
Public Member Functions inherited from DecodableInterface
virtual BaseFloat LogLikelihood (int32 frame, int32 index)=0
 Returns the log likelihood, which will be negated in the decoder.
 
virtual int32 NumIndices () const =0
 Returns the number of states in the acoustic model (they will be indexed one-based).
 
virtual ~DecodableInterface ()
 

Protected Member Functions

void EnsureFrameIsComputed (int32 subsampled_frame)
 If the neural-network outputs for this frame are not cached, this function computes them (and possibly also some later frames).
 

Protected Attributes

Matrix< BaseFloat > current_log_post_
 
int32 num_chunks_computed_
 
int32 current_log_post_subsampled_offset_
 
const DecodableNnetSimpleLoopedInfo & info_
 
int32 frame_offset_
 

Private Member Functions

void AdvanceChunk ()
 
 KALDI_DISALLOW_COPY_AND_ASSIGN (DecodableNnetLoopedOnlineBase)
 

Private Attributes

OnlineFeatureInterface * input_features_
 
OnlineFeatureInterface * ivector_features_
 
NnetComputer computer_
 

Detailed Description

Definition at line 56 of file decodable-online-looped.h.

Constructor & Destructor Documentation

◆ DecodableNnetLoopedOnlineBase()

DecodableNnetLoopedOnlineBase ( const DecodableNnetSimpleLoopedInfo & info,
OnlineFeatureInterface * input_features,
OnlineFeatureInterface * ivector_features
)

Definition at line 26 of file decodable-online-looped.cc.

References OnlineFeatureInterface::Dim(), DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, Nnet::InputDim(), DecodableNnetLoopedOnlineBase::ivector_features_, KALDI_ASSERT, KALDI_ERR, and DecodableNnetSimpleLoopedInfo::nnet.

DecodableNnetLoopedOnlineBase::DecodableNnetLoopedOnlineBase(
    const DecodableNnetSimpleLoopedInfo &info,
    OnlineFeatureInterface *input_features,
    OnlineFeatureInterface *ivector_features):
    num_chunks_computed_(0),
    current_log_post_subsampled_offset_(-1),
    info_(info),
    frame_offset_(0),
    input_features_(input_features),
    ivector_features_(ivector_features),
    computer_(info_.opts.compute_config, info_.computation,
              info_.nnet, NULL) {   // NULL is 'nnet_to_update'
  // Check that feature dimensions match.
  KALDI_ASSERT(input_features_ != NULL);
  int32 nnet_input_dim = info_.nnet.InputDim("input"),
      nnet_ivector_dim = info_.nnet.InputDim("ivector"),
      feat_input_dim = input_features_->Dim(),
      feat_ivector_dim = (ivector_features_ != NULL ?
                          ivector_features_->Dim() : -1);
  if (nnet_input_dim != feat_input_dim) {
    KALDI_ERR << "Input feature dimension mismatch: got " << feat_input_dim
              << " but network expects " << nnet_input_dim;
  }
  if (nnet_ivector_dim != feat_ivector_dim) {
    KALDI_ERR << "Ivector feature dimension mismatch: got " << feat_ivector_dim
              << " but network expects " << nnet_ivector_dim;
  }
}
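
The dimension check in the constructor can be illustrated in isolation. The following is a minimal sketch and not Kaldi code: CheckDimsMatch is a hypothetical helper, and std::runtime_error stands in for KALDI_ERR. The listing above uses -1 for the feature side when ivector_features is NULL; that the network side also reports -1 when it has no "ivector" input is an assumption made here.

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Kaldi): throws if the dimension supplied by
// a feature source differs from the dimension the network expects.
// -1 means "absent" on either side, so the pair (-1, -1) passes the check.
inline void CheckDimsMatch(const std::string &what, int32_t feat_dim,
                           int32_t nnet_dim) {
  if (feat_dim != nnet_dim) {
    std::ostringstream msg;
    msg << what << " feature dimension mismatch: got " << feat_dim
        << " but network expects " << nnet_dim;
    throw std::runtime_error(msg.str());  // stand-in for KALDI_ERR
  }
}
```

Under this convention, passing NULL for ivector_features to a network that expects iVectors fails at construction time rather than later during decoding.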

Member Function Documentation

◆ AdvanceChunk()

void AdvanceChunk ( )
private

Definition at line 118 of file decodable-online-looped.cc.

References NnetComputer::AcceptInput(), NnetSimpleLoopedComputationOptions::acoustic_scale, CuMatrixBase< Real >::AddVecToRows(), DecodableNnetLoopedOnlineBase::computer_, MatrixBase< Real >::CopyRowsFromVec(), DecodableNnetLoopedOnlineBase::current_log_post_, DecodableNnetLoopedOnlineBase::current_log_post_subsampled_offset_, OnlineFeatureInterface::Dim(), NnetSimpleLoopedComputationOptions::frame_subsampling_factor, DecodableNnetSimpleLoopedInfo::frames_left_context, DecodableNnetSimpleLoopedInfo::frames_per_chunk, DecodableNnetSimpleLoopedInfo::frames_right_context, OnlineFeatureInterface::GetFrame(), NnetComputer::GetOutputDestructive(), DecodableNnetSimpleLoopedInfo::has_ivectors, rnnlm::i, DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, ComputationRequest::inputs, OnlineFeatureInterface::IsLastFrame(), DecodableNnetLoopedOnlineBase::ivector_features_, KALDI_ASSERT, KALDI_ERR, DecodableNnetSimpleLoopedInfo::log_priors, DecodableNnetLoopedOnlineBase::num_chunks_computed_, MatrixBase< Real >::NumCols(), OnlineFeatureInterface::NumFramesReady(), MatrixBase< Real >::NumRows(), DecodableNnetSimpleLoopedInfo::opts, DecodableNnetSimpleLoopedInfo::output_dim, DecodableNnetSimpleLoopedInfo::request1, DecodableNnetSimpleLoopedInfo::request2, Matrix< Real >::Resize(), NnetComputer::Run(), CuMatrixBase< Real >::Scale(), Matrix< Real >::Swap(), and CuMatrix< Real >::Swap().

Referenced by DecodableNnetLoopedOnlineBase::EnsureFrameIsComputed().

void DecodableNnetLoopedOnlineBase::AdvanceChunk() {
  // Prepare the input data for the next chunk of features.
  // note: 'end' means one past the last.
  int32 begin_input_frame, end_input_frame;
  if (num_chunks_computed_ == 0) {
    begin_input_frame = -info_.frames_left_context;
    // note: end is last plus one.
    end_input_frame = info_.frames_per_chunk + info_.frames_right_context;
  } else {
    // note: begin_input_frame will be the same as the previous end_input_frame.
    // you can verify this directly if num_chunks_computed_ == 0, and then by
    // induction.
    begin_input_frame = num_chunks_computed_ * info_.frames_per_chunk +
        info_.frames_right_context;
    end_input_frame = begin_input_frame + info_.frames_per_chunk;
  }

  int32 num_feature_frames_ready = input_features_->NumFramesReady();
  bool is_finished = input_features_->IsLastFrame(num_feature_frames_ready - 1);

  if (end_input_frame > num_feature_frames_ready && !is_finished) {
    // we shouldn't be attempting to read past the end of the available features
    // until we have reached the end of the input (i.e. the end-user called
    // InputFinished(), announcing that there is no more waveform); at that
    // point we pad as needed with copies of the last frame, to flush out the
    // last of the output.
    // If the following error happens, it likely indicates a bug in this
    // decodable code somewhere (although it could possibly indicate the
    // user asking for a frame that was not ready, which would be a misuse
    // of this class). It can be figured out from gdb; in either case it
    // would be a bug in the code.
    KALDI_ERR << "Attempt to access frame past the end of the available input";
  }

  CuMatrix<BaseFloat> feats_chunk;
  {  // this block sets 'feats_chunk'.
    Matrix<BaseFloat> this_feats(end_input_frame - begin_input_frame,
                                 input_features_->Dim());
    for (int32 i = begin_input_frame; i < end_input_frame; i++) {
      SubVector<BaseFloat> this_row(this_feats, i - begin_input_frame);
      int32 input_frame = i;
      if (input_frame < 0) input_frame = 0;
      if (input_frame >= num_feature_frames_ready)
        input_frame = num_feature_frames_ready - 1;
      input_features_->GetFrame(input_frame, &this_row);
    }
    feats_chunk.Swap(&this_feats);
  }
  computer_.AcceptInput("input", &feats_chunk);

  if (info_.has_ivectors) {
    KALDI_ASSERT(ivector_features_ != NULL);
    KALDI_ASSERT(info_.request1.inputs.size() == 2);
    // all but the 1st chunk should have 1 iVector, but there is no need to
    // assume this.
    int32 num_ivectors = (num_chunks_computed_ == 0 ?
                          info_.request1.inputs[1].indexes.size() :
                          info_.request2.inputs[1].indexes.size());
    KALDI_ASSERT(num_ivectors > 0);

    Vector<BaseFloat> ivector(ivector_features_->Dim());
    // we just get the iVector from the last input frame we needed,
    // reduced as necessary
    // we don't bother trying to be 'accurate' in getting the iVectors
    // for their 'correct' frames, because in general using the
    // iVector from as large 't' as possible will be better.

    int32 most_recent_input_frame = num_feature_frames_ready - 1,
        num_ivector_frames_ready = ivector_features_->NumFramesReady();

    if (num_ivector_frames_ready > 0) {
      int32 ivector_frame_to_use = std::min<int32>(
          most_recent_input_frame, num_ivector_frames_ready - 1);
      ivector_features_->GetFrame(ivector_frame_to_use,
                                  &ivector);
    }
    // else just leave the iVector zero (would only happen with very small
    // chunk-size, like a chunk size of 2 which would be very inefficient; and
    // only at file begin).

    // note: we expect num_ivectors to be 1 in practice.
    Matrix<BaseFloat> ivectors(num_ivectors,
                               ivector.Dim());
    ivectors.CopyRowsFromVec(ivector);
    CuMatrix<BaseFloat> cu_ivectors;
    cu_ivectors.Swap(&ivectors);
    computer_.AcceptInput("ivector", &cu_ivectors);
  }
  computer_.Run();

  {
    // Note: it's possible in theory that if you had weird recurrence that went
    // directly from the output, the call to GetOutputDestructive() would cause
    // a crash on the next chunk. If that happens, GetOutput() should be used
    // instead of GetOutputDestructive(). But we don't anticipate this will
    // happen in practice.
    CuMatrix<BaseFloat> output;
    computer_.GetOutputDestructive("output", &output);

    if (info_.log_priors.Dim() != 0) {
      // subtract log-prior (divide by prior)
      output.AddVecToRows(-1.0, info_.log_priors);
    }
    // apply the acoustic scale
    output.Scale(info_.opts.acoustic_scale);
    current_log_post_.Resize(0, 0);
    current_log_post_.Swap(&output);
  }
  KALDI_ASSERT(current_log_post_.NumRows() == info_.frames_per_chunk /
               info_.opts.frame_subsampling_factor &&
               current_log_post_.NumCols() == info_.output_dim);

  num_chunks_computed_++;

  current_log_post_subsampled_offset_ =
      (num_chunks_computed_ - 1) *
      (info_.frames_per_chunk / info_.opts.frame_subsampling_factor);
}
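
The input-frame bookkeeping at the top of AdvanceChunk() is easier to follow as a standalone function. The sketch below mirrors the arithmetic of the listing above under stated assumptions: FrameRange, ChunkInputRange and ClampToReady are hypothetical names introduced for illustration, and plain integers replace the fields of DecodableNnetSimpleLoopedInfo.

```cpp
#include <cassert>
#include <cstdint>

// Half-open range [begin, end) of input-frame indexes (hypothetical type).
struct FrameRange {
  int32_t begin;
  int32_t end;
};

// Input frames consumed by the chunk that follows 'num_chunks_computed'
// already-computed chunks. The first chunk also covers the left and right
// context; every later chunk starts where the previous one ended.
inline FrameRange ChunkInputRange(int32_t num_chunks_computed,
                                  int32_t frames_left_context,
                                  int32_t frames_right_context,
                                  int32_t frames_per_chunk) {
  assert(num_chunks_computed >= 0 && frames_per_chunk > 0);
  FrameRange r;
  if (num_chunks_computed == 0) {
    r.begin = -frames_left_context;
    r.end = frames_per_chunk + frames_right_context;
  } else {
    r.begin = num_chunks_computed * frames_per_chunk + frames_right_context;
    r.end = r.begin + frames_per_chunk;
  }
  return r;
}

// Maps a requested frame index onto an available one by repeating the first
// frame for negative indexes and the last ready frame past the end.
inline int32_t ClampToReady(int32_t frame, int32_t num_frames_ready) {
  assert(num_frames_ready > 0);
  if (frame < 0) return 0;
  if (frame >= num_frames_ready) return num_frames_ready - 1;
  return frame;
}
```

For example, with 20 frames of left context, 10 of right context and 21 frames per chunk, the first chunk reads frames [-20, 31) and the second reads [31, 52), so consecutive ranges tile the input without overlap.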

◆ EnsureFrameIsComputed()

void EnsureFrameIsComputed ( int32  subsampled_frame)
inline, protected

If the neural-network outputs for this frame are not cached, this function computes them (and possibly also some later frames).

Note: the frame-index is called 'subsampled_frame' because if frame-subsampling-factor is not 1, it's an index that is "after subsampling", i.e. it changes more slowly than the input-feature index.

Definition at line 103 of file decodable-online-looped.h.

References DecodableNnetLoopedOnlineBase::AdvanceChunk(), DecodableNnetLoopedOnlineBase::current_log_post_, DecodableNnetLoopedOnlineBase::current_log_post_subsampled_offset_, KALDI_ASSERT, and MatrixBase< Real >::NumRows().

Referenced by DecodableNnetLoopedOnline::LogLikelihood(), and DecodableAmNnetLoopedOnline::LogLikelihood().

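inline void EnsureFrameIsComputed(int32 subsampled_frame) {
  KALDI_ASSERT(subsampled_frame >= current_log_post_subsampled_offset_ &&
               "Frames must be accessed in order.");
  while (subsampled_frame >= current_log_post_subsampled_offset_ +
                             current_log_post_.NumRows())
    AdvanceChunk();
}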
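
The caching scheme described above (one chunk of outputs held at a time, advanced on demand) can be modelled without any neural network. This is a toy sketch under stated assumptions: ChunkCache is a hypothetical class, the initial offset of -1 with zero cached rows is assumed for the state before any chunk has been computed, and each chunk is assumed to hold a fixed number of subsampled rows.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the chunk cache: tracks only the bookkeeping, not the data.
class ChunkCache {
 public:
  explicit ChunkCache(int32_t rows_per_chunk)
      : rows_per_chunk_(rows_per_chunk) {
    assert(rows_per_chunk > 0);
  }

  // Advances chunks until 'subsampled_frame' is cached, then returns the row
  // of the cached chunk that holds it. Frames before the current chunk are
  // gone, so they must be requested in non-decreasing chunk order.
  int32_t RowFor(int32_t subsampled_frame) {
    assert(subsampled_frame >= offset_ && "Frames must be accessed in order.");
    while (subsampled_frame >= offset_ + num_rows_) Advance();
    return subsampled_frame - offset_;
  }

  int32_t NumChunksComputed() const { return num_chunks_computed_; }

 private:
  // Stand-in for AdvanceChunk(): replaces the cached chunk with the next one.
  void Advance() {
    num_rows_ = rows_per_chunk_;
    ++num_chunks_computed_;
    offset_ = (num_chunks_computed_ - 1) * rows_per_chunk_;
  }

  int32_t rows_per_chunk_;
  int32_t num_rows_ = 0;             // rows currently cached
  int32_t num_chunks_computed_ = 0;
  int32_t offset_ = -1;              // subsampled index of cached row 0
};
```

Requesting a frame several chunks ahead simply runs the loop more than once; the intermediate chunks are computed and discarded.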

◆ FrameSubsamplingFactor()

◆ GetFrameOffset()

int32 GetFrameOffset ( ) const
inline

Returns the frame offset value.

Definition at line 94 of file decodable-online-looped.h.

References DecodableNnetLoopedOnlineBase::frame_offset_.

◆ IsLastFrame()

bool IsLastFrame ( int32  subsampled_frame) const
virtual

Returns true if this is the last frame.

Frames are zero-based, so the first frame is zero. IsLastFrame(-1) returns false unless the file is empty (a case that not all of the code may handle, so be careful).

Caution: the behavior of this function in an online setting is being changed somewhat. In future it may return false in cases where we haven't yet decided to terminate decoding, and later return true once we decide to terminate. The plan is to rely more on NumFramesReady(); IsLastFrame() would then always return false in an online-decoding setting, and would return true only in a decoding-from-matrix setting where we want the last delta or LDA features to be flushed out for compatibility with the baseline setup.

Implements DecodableInterface.

Definition at line 89 of file decodable-online-looped.cc.

References DecodableNnetLoopedOnlineBase::frame_offset_, NnetSimpleLoopedComputationOptions::frame_subsampling_factor, DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, OnlineFeatureInterface::IsLastFrame(), OnlineFeatureInterface::NumFramesReady(), and DecodableNnetSimpleLoopedInfo::opts.

bool DecodableNnetLoopedOnlineBase::IsLastFrame(
    int32 subsampled_frame) const {
  // To understand this code, compare it with the code of NumFramesReady(),
  // it follows the same structure.
  int32 features_ready = input_features_->NumFramesReady();
  if (features_ready == 0) {
    if (subsampled_frame == -1 && input_features_->IsLastFrame(-1)) {
      // the attempt to handle this rather pathological case (input finished
      // but no frames ready) is a little quixotic as we have not properly
      // tested this and other parts of the code may die.
      return true;
    } else {
      return false;
    }
  }
  bool input_finished = input_features_->IsLastFrame(features_ready - 1);
  if (!input_finished)
    return false;
  int32 sf = info_.opts.frame_subsampling_factor,
      num_subsampled_frames_ready = (features_ready + sf - 1) / sf;
  return (subsampled_frame + frame_offset_ == num_subsampled_frames_ready - 1);
}
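
The decision made by IsLastFrame() depends on only five quantities, so it can be written as a pure function for experimentation. This is a sketch, not the class's API: IsLastSubsampledFrame is a hypothetical name, and the state of the feature pipeline is passed in as plain arguments instead of being queried from input_features_.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if 'subsampled_frame' (relative to 'frame_offset') is the final
// output frame. 'features_ready' is the number of input frames available,
// 'input_finished' says whether the input has ended, and 'sf' is the
// frame-subsampling factor.
inline bool IsLastSubsampledFrame(int32_t subsampled_frame,
                                  int32_t frame_offset,
                                  int32_t features_ready,
                                  bool input_finished,
                                  int32_t sf) {
  assert(sf > 0 && features_ready >= 0);
  if (features_ready == 0) {
    // Empty input: only frame -1 counts as "last", and only once finished.
    return subsampled_frame == -1 && input_finished;
  }
  if (!input_finished) return false;
  // Round up: a partial group of input frames still yields one output frame.
  int32_t num_subsampled_frames_ready = (features_ready + sf - 1) / sf;
  return subsampled_frame + frame_offset == num_subsampled_frames_ready - 1;
}
```

With 50 input frames and a subsampling factor of 3 there are 17 output frames, so absolute frame 16 is the last one; after SetFrameOffset(12) the same frame is addressed as frame 4.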

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( DecodableNnetLoopedOnlineBase  )
private

◆ NumFramesReady()

int32 NumFramesReady ( ) const
virtual

The call NumFramesReady() will return the number of frames currently available for this decodable object.

This is for use in setups where you don't want the decoder to block while waiting for input. This is newly added as of Jan 2014, and I hope, going forward, to rely on this mechanism more than IsLastFrame to know when to stop decoding.

Reimplemented from DecodableInterface.

Definition at line 56 of file decodable-online-looped.cc.

References DecodableNnetLoopedOnlineBase::frame_offset_, NnetSimpleLoopedComputationOptions::frame_subsampling_factor, DecodableNnetSimpleLoopedInfo::frames_per_chunk, DecodableNnetSimpleLoopedInfo::frames_right_context, DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, OnlineFeatureInterface::IsLastFrame(), OnlineFeatureInterface::NumFramesReady(), and DecodableNnetSimpleLoopedInfo::opts.

Referenced by DecodableNnetLoopedOnlineBase::SetFrameOffset().

int32 DecodableNnetLoopedOnlineBase::NumFramesReady() const {
  // note: the ivector_features_ may have 2 or 3 fewer frames ready than
  // input_features_, but we don't wait for them; we just use the most recent
  // iVector we can.
  int32 features_ready = input_features_->NumFramesReady();
  if (features_ready == 0)
    return 0;
  bool input_finished = input_features_->IsLastFrame(features_ready - 1);

  int32 sf = info_.opts.frame_subsampling_factor;

  if (input_finished) {
    // if the input has finished,... we'll pad with duplicates of the last frame
    // as needed to get the required right context.
    return (features_ready + sf - 1) / sf - frame_offset_;
  } else {
    // note: info_.right_context_ includes both the model context and any
    // extra_right_context_ (but this
    int32 non_subsampled_output_frames_ready =
        std::max<int32>(0, features_ready - info_.frames_right_context);
    int32 num_chunks_ready = non_subsampled_output_frames_ready /
                             info_.frames_per_chunk;
    // note: the division by the frame subsampling factor 'sf' below
    // doesn't need any attention to rounding because info_.frames_per_chunk
    // is always a multiple of 'sf' (see 'frames_per_chunk = GetChunksize..."
    // in decodable-simple-looped.cc).
    return num_chunks_ready * info_.frames_per_chunk / sf - frame_offset_;
  }
}
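
The frame arithmetic in NumFramesReady() can be checked by hand with a standalone version of the same calculation. The following is a minimal sketch under stated assumptions: LoopedFrameConfig and SubsampledFramesReady are hypothetical names, and the struct carries only the three configuration values the calculation needs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the configuration values used by the calculation.
struct LoopedFrameConfig {
  int32_t frames_per_chunk;          // must be a multiple of the factor below
  int32_t frames_right_context;
  int32_t frame_subsampling_factor;
};

// Number of subsampled output frames a decoder may consume, relative to
// 'frame_offset', given how many input frames are ready.
inline int32_t SubsampledFramesReady(const LoopedFrameConfig &c,
                                     int32_t features_ready,
                                     bool input_finished,
                                     int32_t frame_offset) {
  const int32_t sf = c.frame_subsampling_factor;
  assert(sf > 0 && c.frames_per_chunk > 0 && c.frames_per_chunk % sf == 0);
  if (features_ready == 0) return 0;
  if (input_finished) {
    // All input is usable; missing right context is padded, so round up.
    return (features_ready + sf - 1) / sf - frame_offset;
  }
  // While input is still arriving, only whole chunks whose right context is
  // already available count as ready.
  int32_t usable = std::max<int32_t>(0, features_ready - c.frames_right_context);
  int32_t num_chunks_ready = usable / c.frames_per_chunk;
  return num_chunks_ready * c.frames_per_chunk / sf - frame_offset;
}
```

The two branches can differ considerably: with 21 frames per chunk, 10 frames of right context and a subsampling factor of 3, 50 input frames give 7 output frames while the input is still open and 17 once it has finished.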

◆ SetFrameOffset()

void SetFrameOffset ( int32  frame_offset)

Sets the frame offset value.

Frame offset is initialized to 0 when the decodable object is constructed and stays as 0 unless this method is called. This method is useful when we want to reset the decoder state, i.e. call decoder.InitDecoding(), but we want to keep using the same decodable object, e.g. in case of an endpoint. The frame offset affects the behavior of IsLastFrame(), NumFramesReady() and LogLikelihood() methods.

Definition at line 112 of file decodable-online-looped.cc.

References DecodableNnetLoopedOnlineBase::frame_offset_, KALDI_ASSERT, and DecodableNnetLoopedOnlineBase::NumFramesReady().

Referenced by DecodableNnetLoopedOnlineBase::FrameSubsamplingFactor(), SingleUtteranceNnet3IncrementalDecoderTpl< FST >::InitDecoding(), and SingleUtteranceNnet3DecoderTpl< FST >::InitDecoding().

void DecodableNnetLoopedOnlineBase::SetFrameOffset(int32 frame_offset) {
  KALDI_ASSERT(0 <= frame_offset &&
               frame_offset <= frame_offset_ + NumFramesReady());
  frame_offset_ = frame_offset;
}
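
The effect of the frame offset on frame numbering can be shown with a small model. This is an illustrative sketch only: OffsetView is a hypothetical struct, total_ready stands for the number of subsampled frames available counted from the start of the input, and the mapping in AbsoluteFrame() is an assumption based on how IsLastFrame() adds frame_offset_ to its argument.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a decodable object whose frame numbering can be shifted.
struct OffsetView {
  int32_t frame_offset = 0;  // frames already consumed before the last reset
  int32_t total_ready = 0;   // subsampled frames ready, counted from frame 0

  // Frames visible to the decoder after the offset is applied.
  int32_t NumFramesReady() const { return total_ready - frame_offset; }

  // Decoder-relative frame index -> index counted from the start of input.
  int32_t AbsoluteFrame(int32_t frame) const { return frame + frame_offset; }

  // Same range check as the listing above: the new offset may not be negative
  // and may not point past the frames that are currently ready.
  void SetFrameOffset(int32_t new_offset) {
    assert(0 <= new_offset && new_offset <= frame_offset + NumFramesReady());
    frame_offset = new_offset;
  }
};
```

After an endpoint at frame 12, for instance, setting the offset to 12 makes the decoder's frame 0 refer to absolute frame 12, and the number of frames reported as ready drops by 12.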

Member Data Documentation

◆ computer_

NnetComputer computer_
private

◆ current_log_post_

◆ current_log_post_subsampled_offset_

◆ frame_offset_

◆ info_

◆ input_features_

◆ ivector_features_

◆ num_chunks_computed_

int32 num_chunks_computed_
protected

The documentation for this class was generated from the following files: