DecodableNnetLoopedOnlineBase Class Reference

#include <decodable-online-looped.h>


Public Member Functions

 DecodableNnetLoopedOnlineBase (const DecodableNnetSimpleLoopedInfo &info, OnlineFeatureInterface *input_features, OnlineFeatureInterface *ivector_features)
 
virtual bool IsLastFrame (int32 subsampled_frame) const
 Returns true if this is the last frame.
 
virtual int32 NumFramesReady () const
 The call NumFramesReady() will return the number of frames currently available for this decodable object.
 
int32 FrameSubsamplingFactor () const
 
- Public Member Functions inherited from DecodableInterface
virtual BaseFloat LogLikelihood (int32 frame, int32 index)=0
 Returns the log likelihood, which will be negated in the decoder.
 
virtual int32 NumIndices () const =0
 Returns the number of states in the acoustic model (they will be indexed one-based).
 
virtual ~DecodableInterface ()
 

Protected Member Functions

void EnsureFrameIsComputed (int32 subsampled_frame)
 If the neural-network outputs for this frame are not cached, this function computes them (and possibly also some later frames).
 

Protected Attributes

Matrix< BaseFloat > current_log_post_
 
int32 num_chunks_computed_
 
int32 current_log_post_subsampled_offset_
 
const DecodableNnetSimpleLoopedInfo & info_
 

Private Member Functions

void AdvanceChunk ()
 
 KALDI_DISALLOW_COPY_AND_ASSIGN (DecodableNnetLoopedOnlineBase)
 

Private Attributes

OnlineFeatureInterface * input_features_
 
OnlineFeatureInterface * ivector_features_
 
NnetComputer computer_
 

Detailed Description

Definition at line 56 of file decodable-online-looped.h.

Constructor & Destructor Documentation

DecodableNnetLoopedOnlineBase ( const DecodableNnetSimpleLoopedInfo & info,
OnlineFeatureInterface * input_features,
OnlineFeatureInterface * ivector_features
)

Definition at line 26 of file decodable-online-looped.cc.

References OnlineFeatureInterface::Dim(), DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, Nnet::InputDim(), DecodableNnetLoopedOnlineBase::ivector_features_, KALDI_ASSERT, KALDI_ERR, and DecodableNnetSimpleLoopedInfo::nnet.

DecodableNnetLoopedOnlineBase::DecodableNnetLoopedOnlineBase(
    const DecodableNnetSimpleLoopedInfo &info,
    OnlineFeatureInterface *input_features,
    OnlineFeatureInterface *ivector_features):
    // ...
    info_(info),
    input_features_(input_features),
    ivector_features_(ivector_features),
    computer_(/* ... */ info_.nnet, NULL) {  // NULL is 'nnet_to_update'
  // Check that feature dimensions match.
  // ...
  int32 nnet_input_dim = info_.nnet.InputDim("input"),
      nnet_ivector_dim = info_.nnet.InputDim("ivector"),
      feat_input_dim = input_features_->Dim(),
      feat_ivector_dim = (ivector_features_ != NULL ?
                          ivector_features_->Dim() : -1);
  if (nnet_input_dim != feat_input_dim) {
    KALDI_ERR << "Input feature dimension mismatch: got " << feat_input_dim
              << " but network expects " << nnet_input_dim;
  }
  if (nnet_ivector_dim != feat_ivector_dim) {
    KALDI_ERR << "Ivector feature dimension mismatch: got " << feat_ivector_dim
              << " but network expects " << nnet_ivector_dim;
  }
}

Member Function Documentation

void AdvanceChunk ( )
private

Definition at line 112 of file decodable-online-looped.cc.

References NnetComputer::AcceptInput(), NnetSimpleLoopedComputationOptions::acoustic_scale, CuMatrixBase< Real >::AddVecToRows(), DecodableNnetLoopedOnlineBase::computer_, MatrixBase< Real >::CopyRowsFromVec(), DecodableNnetLoopedOnlineBase::current_log_post_, DecodableNnetLoopedOnlineBase::current_log_post_subsampled_offset_, OnlineFeatureInterface::Dim(), CuVectorBase< Real >::Dim(), NnetSimpleLoopedComputationOptions::frame_subsampling_factor, DecodableNnetSimpleLoopedInfo::frames_left_context, DecodableNnetSimpleLoopedInfo::frames_per_chunk, DecodableNnetSimpleLoopedInfo::frames_right_context, OnlineFeatureInterface::GetFrame(), NnetComputer::GetOutputDestructive(), DecodableNnetSimpleLoopedInfo::has_ivectors, rnnlm::i, DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, ComputationRequest::inputs, OnlineFeatureInterface::IsLastFrame(), DecodableNnetLoopedOnlineBase::ivector_features_, KALDI_ASSERT, KALDI_ERR, DecodableNnetSimpleLoopedInfo::log_priors, DecodableNnetLoopedOnlineBase::num_chunks_computed_, MatrixBase< Real >::NumCols(), OnlineFeatureInterface::NumFramesReady(), MatrixBase< Real >::NumRows(), DecodableNnetSimpleLoopedInfo::opts, DecodableNnetSimpleLoopedInfo::output_dim, DecodableNnetSimpleLoopedInfo::request1, DecodableNnetSimpleLoopedInfo::request2, Matrix< Real >::Resize(), NnetComputer::Run(), CuMatrixBase< Real >::Scale(), CuMatrix< Real >::Swap(), and Matrix< Real >::Swap().

Referenced by DecodableNnetLoopedOnlineBase::EnsureFrameIsComputed().

void DecodableNnetLoopedOnlineBase::AdvanceChunk() {
  // Prepare the input data for the next chunk of features.
  // note: 'end' means one past the last.
  int32 begin_input_frame, end_input_frame;
  if (num_chunks_computed_ == 0) {
    begin_input_frame = -info_.frames_left_context;
    // note: end is last plus one.
    end_input_frame = info_.frames_per_chunk + info_.frames_right_context;
  } else {
    // note: begin_input_frame will be the same as the previous end_input_frame.
    // you can verify this directly if num_chunks_computed_ == 0, and then by
    // induction.
    begin_input_frame = num_chunks_computed_ * info_.frames_per_chunk +
        info_.frames_right_context;
    end_input_frame = begin_input_frame + info_.frames_per_chunk;
  }

  int32 num_feature_frames_ready = input_features_->NumFramesReady();
  bool is_finished = input_features_->IsLastFrame(num_feature_frames_ready - 1);

  if (end_input_frame > num_feature_frames_ready && !is_finished) {
    // we shouldn't be attempting to read past the end of the available features
    // until we have reached the end of the input (i.e. the end-user called
    // InputFinished(), announcing that there is no more waveform); at this point
    // we pad as needed with copies of the last frame, to flush out the last of
    // the output.
    // If the following error happens, it likely indicates a bug in this
    // decodable code somewhere (although it could possibly indicate the
    // user asking for a frame that was not ready, which would be a misuse
    // of this class). It can be figured out from gdb, as in either case it
    // would be a bug in the code.
    KALDI_ERR << "Attempt to access frame past the end of the available input";
  }

  CuMatrix<BaseFloat> feats_chunk;
  {  // this block sets 'feats_chunk'.
    Matrix<BaseFloat> this_feats(end_input_frame - begin_input_frame,
                                 input_features_->Dim());
    for (int32 i = begin_input_frame; i < end_input_frame; i++) {
      SubVector<BaseFloat> this_row(this_feats, i - begin_input_frame);
      int32 input_frame = i;
      if (input_frame < 0) input_frame = 0;
      if (input_frame >= num_feature_frames_ready)
        input_frame = num_feature_frames_ready - 1;
      input_features_->GetFrame(input_frame, &this_row);
    }
    feats_chunk.Swap(&this_feats);
  }
  computer_.AcceptInput("input", &feats_chunk);

  if (info_.has_ivectors) {
    KALDI_ASSERT(info_.request1.inputs.size() == 2);
    // all but the 1st chunk should have 1 iVector, but there is no need to
    // assume this.
    int32 num_ivectors = (num_chunks_computed_ == 0 ?
                          info_.request1.inputs[1].indexes.size() :
                          info_.request2.inputs[1].indexes.size());
    KALDI_ASSERT(num_ivectors > 0);

    Vector<BaseFloat> ivector(ivector_features_->Dim());
    // we just get the iVector from the last input frame we needed,
    // reduced as necessary
    // we don't bother trying to be 'accurate' in getting the iVectors
    // for their 'correct' frames, because in general using the
    // iVector from as large 't' as possible will be better.

    int32 most_recent_input_frame = num_feature_frames_ready - 1,
        num_ivector_frames_ready = ivector_features_->NumFramesReady();

    if (num_ivector_frames_ready > 0) {
      int32 ivector_frame_to_use = std::min<int32>(
          most_recent_input_frame, num_ivector_frames_ready - 1);
      ivector_features_->GetFrame(ivector_frame_to_use,
                                  &ivector);
    }
    // else just leave the iVector zero (would only happen with very small
    // chunk-size, like a chunk size of 2 which would be very inefficient; and
    // only at file begin).

    // note: we expect num_ivectors to be 1 in practice.
    Matrix<BaseFloat> ivectors(num_ivectors,
                               ivector.Dim());
    ivectors.CopyRowsFromVec(ivector);
    CuMatrix<BaseFloat> cu_ivectors;
    cu_ivectors.Swap(&ivectors);
    computer_.AcceptInput("ivector", &cu_ivectors);
  }
  computer_.Run();

  {
    // Note: it's possible in theory that if you had weird recurrence that went
    // directly from the output, the call to GetOutputDestructive() would cause
    // a crash on the next chunk. If that happens, GetOutput() should be used
    // instead of GetOutputDestructive(). But we don't anticipate this will
    // happen in practice.
    CuMatrix<BaseFloat> output;
    computer_.GetOutputDestructive("output", &output);

    if (info_.log_priors.Dim() != 0) {
      // subtract log-prior (divide by prior)
      output.AddVecToRows(-1.0, info_.log_priors);
    }
    // apply the acoustic scale
    output.Scale(info_.opts.acoustic_scale);
    current_log_post_.Swap(&output);
  }
  // ...
      (num_chunks_computed_ - 1) *
  // ...
}
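
The frame-range arithmetic at the top of AdvanceChunk() can be checked in isolation. The following is a minimal sketch, not Kaldi code: ChunkConfig, FrameRange, ChunkInputRange and ClampFrame are hypothetical names introduced here, and the sketch assumes that the second term of begin_input_frame for later chunks is frames_right_context, which is what the "same as the previous end_input_frame" comment implies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the three DecodableNnetSimpleLoopedInfo fields
// that the frame-range computation reads.
struct ChunkConfig {
  int32_t frames_left_context;
  int32_t frames_right_context;
  int32_t frames_per_chunk;
};

// Half-open range [begin, end) of input-frame indexes. The range may start
// below zero or end past the frames that are actually available.
struct FrameRange {
  int32_t begin;
  int32_t end;
};

// Input-frame range requested for the chunk with the given index
// (the index equals the number of chunks already computed).
inline FrameRange ChunkInputRange(const ChunkConfig &c,
                                  int32_t num_chunks_computed) {
  assert(num_chunks_computed >= 0);
  if (num_chunks_computed == 0) {
    // The first chunk also covers the left context and the right context.
    return {-c.frames_left_context,
            c.frames_per_chunk + c.frames_right_context};
  }
  // Assumption: later chunks start where the previous chunk ended.
  int32_t begin = num_chunks_computed * c.frames_per_chunk +
                  c.frames_right_context;
  return {begin, begin + c.frames_per_chunk};
}

// Clamps a requested index into [0, num_frames_ready - 1]; this is how
// out-of-range requests end up as copies of the first or last frame.
inline int32_t ClampFrame(int32_t frame, int32_t num_frames_ready) {
  assert(num_frames_ready > 0);
  return std::min<int32_t>(std::max<int32_t>(frame, 0), num_frames_ready - 1);
}
```

With a left context of 10, a right context of 5 and 20 frames per chunk, the first three chunks request [-10, 25), [25, 45) and [45, 65), so each chunk begins exactly where the previous one ended.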
void EnsureFrameIsComputed ( int32  subsampled_frame)
inline protected

If the neural-network outputs for this frame are not cached, this function computes them (and possibly also some later frames).

Note: the frame-index is called 'subsampled_frame' because if frame-subsampling-factor is not 1, it's an index that is "after subsampling", i.e. it changes more slowly than the input-feature index.

Definition at line 92 of file decodable-online-looped.h.

References DecodableNnetLoopedOnlineBase::AdvanceChunk(), DecodableNnetLoopedOnlineBase::current_log_post_, DecodableNnetLoopedOnlineBase::current_log_post_subsampled_offset_, KALDI_ASSERT, and MatrixBase< Real >::NumRows().

Referenced by DecodableNnetLoopedOnline::LogLikelihood(), and DecodableAmNnetLoopedOnline::LogLikelihood().

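  inline void EnsureFrameIsComputed(int32 subsampled_frame) {
    KALDI_ASSERT(subsampled_frame >= current_log_post_subsampled_offset_ &&
                 "Frames must be accessed in order.");
    while (subsampled_frame >= current_log_post_subsampled_offset_ +
                               current_log_post_.NumRows())
      AdvanceChunk();
  }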
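
To illustrate the caching behavior described above, here is a toy model of the chunk cache. It is a sketch under stated assumptions rather than the Kaldi implementation: ChunkCache and RowInCache are hypothetical, the neural-network computation is replaced by simple bookkeeping, the initial offset of 0 is an assumption, and the offset update (chunk index times rows per chunk) is assumed rather than taken from the listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical toy model: tracks only which subsampled frames the cache
// would hold, not the log-posteriors themselves.
class ChunkCache {
 public:
  explicit ChunkCache(int32_t rows_per_chunk)
      : rows_per_chunk_(rows_per_chunk) {
    assert(rows_per_chunk > 0);
  }

  // Advances chunk by chunk until 'subsampled_frame' lies inside the cache.
  void EnsureFrameIsComputed(int32_t subsampled_frame) {
    assert(subsampled_frame >= offset_ && "Frames must be accessed in order.");
    while (subsampled_frame >= offset_ + num_rows_)
      AdvanceChunk();
  }

  // Row of the cached chunk that holds 'subsampled_frame'.
  int32_t RowInCache(int32_t subsampled_frame) {
    EnsureFrameIsComputed(subsampled_frame);
    return subsampled_frame - offset_;
  }

  int32_t num_chunks_computed() const { return num_chunks_computed_; }

 private:
  // Stand-in for the real AdvanceChunk(): replaces the cache contents with
  // the next chunk and records where that chunk starts.
  void AdvanceChunk() {
    ++num_chunks_computed_;
    num_rows_ = rows_per_chunk_;
    offset_ = (num_chunks_computed_ - 1) * rows_per_chunk_;  // assumed update
  }

  int32_t rows_per_chunk_;
  int32_t num_chunks_computed_ = 0;
  int32_t offset_ = 0;    // assumed initial value
  int32_t num_rows_ = 0;  // the cache starts out empty
};
```

Because only the most recent chunk is kept, a request for a frame earlier than the current offset cannot be served, which is why frames must be accessed in order.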
bool IsLastFrame ( int32  frame) const
virtual

Returns true if this is the last frame.

Frames are zero-based, so the first frame is zero. IsLastFrame(-1) will return false, unless the file is empty (a case that not all of the code is certain to handle, so be careful).

Caution: the behavior of this function in an online setting is being changed somewhat. In future it may return false in cases where we haven't yet decided to terminate decoding, but later true if we decide to terminate decoding. The plan is to rely more on NumFramesReady(); IsLastFrame() would then always return false in an online-decoding setting, and would only return true in a decoding-from-matrix setting where we want to allow the last delta or LDA features to be flushed out for compatibility with the baseline setup.

Implements DecodableInterface.

Definition at line 88 of file decodable-online-looped.cc.

References NnetSimpleLoopedComputationOptions::frame_subsampling_factor, DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, OnlineFeatureInterface::IsLastFrame(), OnlineFeatureInterface::NumFramesReady(), and DecodableNnetSimpleLoopedInfo::opts.

bool DecodableNnetLoopedOnlineBase::IsLastFrame(
    int32 subsampled_frame) const {
  // To understand this code, compare it with the code of NumFramesReady(),
  // it follows the same structure.
  int32 features_ready = input_features_->NumFramesReady();
  if (features_ready == 0) {
    if (subsampled_frame == -1 && input_features_->IsLastFrame(-1)) {
      // the attempt to handle this rather pathological case (input finished
      // but no frames ready) is a little quixotic as we have not properly
      // tested this and other parts of the code may die.
      return true;
    } else {
      return false;
    }
  }
  bool input_finished = input_features_->IsLastFrame(features_ready - 1);
  if (!input_finished)
    return false;
  int32 sf = info_.opts.frame_subsampling_factor,
      num_subsampled_frames_ready = (features_ready + sf - 1) / sf;
  return (subsampled_frame == num_subsampled_frames_ready - 1);
}
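
The decision logic of IsLastFrame() can be exercised without any feature pipeline. The sketch below is illustrative only: IsLastSubsampledFrame is a hypothetical free function, the state of the input is passed in as plain arguments, and 'sf' is assumed to be the frame subsampling factor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone version of the last-frame test.
//   features_ready: number of input frames available so far
//   input_finished: whether the feature source has signalled end of input
//   sf:             frame subsampling factor (assumed positive)
inline bool IsLastSubsampledFrame(int32_t subsampled_frame,
                                  int32_t features_ready,
                                  bool input_finished,
                                  int32_t sf) {
  assert(sf > 0 && features_ready >= 0);
  if (features_ready == 0) {
    // Empty input: only frame -1 can count as "last".
    return subsampled_frame == -1 && input_finished;
  }
  if (!input_finished)
    return false;  // more input may still arrive
  // Ceiling division: a partial group of input frames at the end still
  // yields one subsampled output frame.
  int32_t num_subsampled_frames_ready = (features_ready + sf - 1) / sf;
  return subsampled_frame == num_subsampled_frames_ready - 1;
}
```

For example, with 10 input frames and a subsampling factor of 3 there are 4 subsampled frames, so only index 3 is reported as the last frame, and only once the input has finished.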
KALDI_DISALLOW_COPY_AND_ASSIGN ( DecodableNnetLoopedOnlineBase  )
private
int32 NumFramesReady ( ) const
virtual

The call NumFramesReady() will return the number of frames currently available for this decodable object.

This is for use in setups where you don't want the decoder to block while waiting for input. This is newly added as of Jan 2014, and I hope, going forward, to rely on this mechanism more than IsLastFrame to know when to stop decoding.

Reimplemented from DecodableInterface.

Definition at line 55 of file decodable-online-looped.cc.

References NnetSimpleLoopedComputationOptions::frame_subsampling_factor, DecodableNnetSimpleLoopedInfo::frames_per_chunk, DecodableNnetSimpleLoopedInfo::frames_right_context, DecodableNnetLoopedOnlineBase::info_, DecodableNnetLoopedOnlineBase::input_features_, OnlineFeatureInterface::IsLastFrame(), OnlineFeatureInterface::NumFramesReady(), and DecodableNnetSimpleLoopedInfo::opts.

int32 DecodableNnetLoopedOnlineBase::NumFramesReady() const {
  // note: the ivector_features_ may have 2 or 3 fewer frames ready than
  // input_features_, but we don't wait for them; we just use the most recent
  // iVector we can.
  int32 features_ready = input_features_->NumFramesReady();
  if (features_ready == 0)
    return 0;
  bool input_finished = input_features_->IsLastFrame(features_ready - 1);

  int32 sf = info_.opts.frame_subsampling_factor;

  if (input_finished) {
    // if the input has finished,... we'll pad with duplicates of the last frame
    // as needed to get the required right context.
    return (features_ready + sf - 1) / sf;
  } else {
    // note: info_.right_context_ includes both the model context and any
    // extra_right_context_ (but this
    int32 non_subsampled_output_frames_ready =
        std::max<int32>(0, features_ready - info_.frames_right_context);
    int32 num_chunks_ready = non_subsampled_output_frames_ready /
        info_.frames_per_chunk;
    // note: the division by the frame subsampling factor 'sf' below
    // doesn't need any attention to rounding because info_.frames_per_chunk
    // is always a multiple of 'sf' (see "frames_per_chunk = GetChunksize..."
    // in decodable-simple-looped.cc).
    return num_chunks_ready * info_.frames_per_chunk / sf;
  }
}
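
A worked example may help with the two branches of NumFramesReady(). The function below is a minimal sketch, not the Kaldi API: NumSubsampledFramesReady is a hypothetical name, the configuration values are passed as plain integers, and the divisor used to count ready chunks is assumed to be frames_per_chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone version of the frames-ready computation.
inline int32_t NumSubsampledFramesReady(int32_t features_ready,
                                        bool input_finished,
                                        int32_t frames_right_context,
                                        int32_t frames_per_chunk,
                                        int32_t sf) {
  assert(sf > 0 && frames_per_chunk > 0 && frames_per_chunk % sf == 0);
  if (features_ready == 0)
    return 0;
  if (input_finished) {
    // End of input: missing right context is padded with copies of the last
    // frame, so every input frame contributes (rounding up).
    return (features_ready + sf - 1) / sf;
  }
  // Input still arriving: hold back the right context, then count only
  // whole chunks, since output is produced one chunk at a time.
  int32_t usable = std::max<int32_t>(0, features_ready - frames_right_context);
  int32_t num_chunks_ready = usable / frames_per_chunk;  // assumed divisor
  return num_chunks_ready * frames_per_chunk / sf;
}
```

With a right context of 5, 21 frames per chunk and a subsampling factor of 3, having 30 input frames while the input is still open gives one whole chunk, that is 7 subsampled frames; once the input is finished the same 30 frames give 10.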

Member Data Documentation

NnetComputer computer_
private
int32 num_chunks_computed_
protected

The documentation for this class was generated from the following files: