DecodableNnet2Online Class Reference

This Decodable object for class nnet2::AmNnet takes feature input from class OnlineFeatureInterface, unlike, say, class DecodableAmNnet which takes feature input from a matrix.

#include <online-nnet2-decodable.h>


Public Member Functions

 DecodableNnet2Online (const AmNnet &nnet, const TransitionModel &trans_model, const DecodableNnet2OnlineOptions &opts, OnlineFeatureInterface *input_feats)
 
virtual BaseFloat LogLikelihood (int32 frame, int32 index)
 Returns the scaled log likelihood.
 
virtual bool IsLastFrame (int32 frame) const
 Returns true if this is the last frame.
 
virtual int32 NumFramesReady () const
 Returns the number of frames currently available for this decodable object.
 
virtual int32 NumIndices () const
 Indices are one-based! This is for compatibility with OpenFst.
 
Public Member Functions inherited from DecodableInterface
virtual ~DecodableInterface ()
 

Private Member Functions

void ComputeForFrame (int32 frame)
 Computes the neural-network outputs for this frame if they are not already cached, possibly along with those for some succeeding frames.
 
 KALDI_DISALLOW_COPY_AND_ASSIGN (DecodableNnet2Online)
 

Private Attributes

OnlineFeatureInterface* features_
 
const AmNnet& nnet_
 
const TransitionModel& trans_model_
 
DecodableNnet2OnlineOptions opts_
 
CuVector<BaseFloat> log_priors_
 
int32 feat_dim_
 
int32 left_context_
 
int32 right_context_
 
int32 num_pdfs_
 
int32 begin_frame_
 
Matrix<BaseFloat> scaled_loglikes_
 

Detailed Description

This Decodable object for class nnet2::AmNnet takes feature input from class OnlineFeatureInterface, unlike, say, class DecodableAmNnet which takes feature input from a matrix.

Definition at line 68 of file online-nnet2-decodable.h.

Constructor & Destructor Documentation

◆ DecodableNnet2Online()

DecodableNnet2Online ( const AmNnet & nnet,
const TransitionModel & trans_model,
const DecodableNnet2OnlineOptions & opts,
OnlineFeatureInterface * input_feats
)

Definition at line 25 of file online-nnet2-decodable.cc.

References KALDI_ASSERT, DecodableNnet2Online::log_priors_, DecodableNnet2OnlineOptions::max_nnet_batch_size, DecodableNnet2Online::nnet_, TransitionModel::NumPdfs(), DecodableNnet2Online::opts_, AmNnet::Priors(), and DecodableNnet2Online::trans_model_.

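DecodableNnet2Online::DecodableNnet2Online(
    const AmNnet &nnet,
    const TransitionModel &trans_model,
    const DecodableNnet2OnlineOptions &opts,
    OnlineFeatureInterface *input_feats):
    features_(input_feats),
    nnet_(nnet),
    trans_model_(trans_model),
    opts_(opts),
    feat_dim_(input_feats->Dim()),
    left_context_(nnet.GetNnet().LeftContext()),
    right_context_(nnet.GetNnet().RightContext()),
    num_pdfs_(nnet.GetNnet().OutputDim()),
    begin_frame_(-1) {
  KALDI_ASSERT(opts_.max_nnet_batch_size > 0);
  log_priors_ = nnet_.Priors();
  KALDI_ASSERT(log_priors_.Dim() == trans_model_.NumPdfs() &&
               "Priors in neural network not set up (or mismatch "
               "with transition model).");
  log_priors_.ApplyLog();
}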

Member Function Documentation

◆ ComputeForFrame()

void ComputeForFrame ( int32  frame)
private

Computes the neural-network outputs for this frame if they are not already cached, possibly along with those for some succeeding frames.

Definition at line 81 of file online-nnet2-decodable.cc.

References DecodableNnet2OnlineOptions::acoustic_scale, CuMatrixBase< Real >::AddVecToRows(), CuMatrixBase< Real >::ApplyFloor(), CuMatrixBase< Real >::ApplyLog(), DecodableNnet2Online::begin_frame_, DecodableNnet2Online::feat_dim_, DecodableNnet2Online::features_, OnlineFeatureInterface::GetFrame(), AmNnet::GetNnet(), OnlineFeatureInterface::IsLastFrame(), KALDI_ASSERT, DecodableNnet2Online::left_context_, DecodableNnet2Online::log_priors_, DecodableNnet2OnlineOptions::max_nnet_batch_size, DecodableNnet2Online::nnet_, kaldi::nnet2::NnetComputation(), DecodableNnet2Online::num_pdfs_, OnlineFeatureInterface::NumFramesReady(), DecodableNnet2Online::NumFramesReady(), MatrixBase< Real >::NumRows(), DecodableNnet2Online::opts_, DecodableNnet2OnlineOptions::pad_input, Matrix< Real >::Resize(), DecodableNnet2Online::right_context_, CuMatrixBase< Real >::Scale(), DecodableNnet2Online::scaled_loglikes_, and CuMatrix< Real >::Swap().

Referenced by DecodableNnet2Online::LogLikelihood().

void DecodableNnet2Online::ComputeForFrame(int32 frame) {
  int32 features_ready = features_->NumFramesReady();
  bool input_finished = features_->IsLastFrame(features_ready - 1);
  KALDI_ASSERT(frame >= 0);
  if (frame >= begin_frame_ &&
      frame < begin_frame_ + scaled_loglikes_.NumRows())
    return;
  KALDI_ASSERT(frame < NumFramesReady());

  int32 input_frame_begin;
  if (opts_.pad_input)
    input_frame_begin = frame - left_context_;
  else
    input_frame_begin = frame;
  int32 max_possible_input_frame_end = features_ready;
  if (input_finished && opts_.pad_input)
    max_possible_input_frame_end += right_context_;
  int32 input_frame_end = std::min<int32>(max_possible_input_frame_end,
                                          input_frame_begin +
                                          left_context_ + right_context_ +
                                          opts_.max_nnet_batch_size);
  KALDI_ASSERT(input_frame_end > input_frame_begin);
  Matrix<BaseFloat> features(input_frame_end - input_frame_begin,
                             feat_dim_);
  for (int32 t = input_frame_begin; t < input_frame_end; t++) {
    SubVector<BaseFloat> row(features, t - input_frame_begin);
    int32 t_modified = t;
    // The next two if-statements take care of "pad_input"
    if (t_modified < 0)
      t_modified = 0;
    if (t_modified >= features_ready)
      t_modified = features_ready - 1;
    features_->GetFrame(t_modified, &row);
  }
  CuMatrix<BaseFloat> cu_features;
  cu_features.Swap(&features);  // Copy to GPU, if we're using one.

  int32 num_frames_out = input_frame_end - input_frame_begin -
      left_context_ - right_context_;

  CuMatrix<BaseFloat> cu_posteriors(num_frames_out, num_pdfs_);

  // The "false" below tells it not to pad the input: we've already done
  // any padding that we needed to do.
  NnetComputation(nnet_.GetNnet(), cu_features,
                  false, &cu_posteriors);

  cu_posteriors.ApplyFloor(1.0e-20);  // Avoid log of zero which leads to NaN.
  cu_posteriors.ApplyLog();
  // subtract log-prior (divide by prior)
  cu_posteriors.AddVecToRows(-1.0, log_priors_);
  // apply probability scale.
  cu_posteriors.Scale(opts_.acoustic_scale);

  // Transfer the scores to the CPU for faster access by the
  // decoding process.
  scaled_loglikes_.Resize(0, 0);
  cu_posteriors.Swap(&scaled_loglikes_);

  begin_frame_ = frame;
}
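
The choice of input window in ComputeForFrame() can be easier to follow in isolation. The sketch below is a stand-alone approximation of that index arithmetic using plain ints and std::vector; the function name InputFrameIndices and its parameter list are illustrative assumptions and are not part of Kaldi. It returns, for each row of the network input, the feature frame that would be copied into it, including the duplicated edge frames used when pad_input is true.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not a Kaldi function): lists the feature-frame index
// used for each input row when computing outputs starting at `frame`.
inline std::vector<int> InputFrameIndices(int frame, int features_ready,
                                          bool input_finished, bool pad_input,
                                          int left_context, int right_context,
                                          int max_batch_size) {
  assert(frame >= 0 && features_ready > 0 && max_batch_size > 0);
  // With padding, the window starts left_context frames before `frame`.
  int begin = pad_input ? frame - left_context : frame;
  // Past-the-end frames may only be requested once the input has finished.
  int max_end = features_ready;
  if (input_finished && pad_input) max_end += right_context;
  int end = std::min(max_end,
                     begin + left_context + right_context + max_batch_size);
  assert(end > begin);
  std::vector<int> indices;
  for (int t = begin; t < end; ++t) {
    // Clamping repeats the first or last real frame as padding.
    indices.push_back(std::min(std::max(t, 0), features_ready - 1));
  }
  return indices;
}
```

The number of output rows is the size of the returned vector minus left_context and right_context, which corresponds to num_frames_out in the listing.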

◆ IsLastFrame()

bool IsLastFrame ( int32  frame) const
virtual

Returns true if this is the last frame.

Frames are zero-based, so the first frame is zero. IsLastFrame(-1) returns false unless the file is empty (a case that I'm not sure all the code will handle, so be careful). Caution: the behavior of this function in an online setting is being changed somewhat. In the future it may return false while we haven't yet decided to terminate decoding, and true later once we decide to terminate. The plan is to rely more on NumFramesReady(); IsLastFrame() would then always return false in an online-decoding setting, and would return true only in a decoding-from-matrix setting where we want to allow the last delta or LDA features to be flushed out for compatibility with the baseline setup.

Implements DecodableInterface.

Definition at line 58 of file online-nnet2-decodable.cc.

References DecodableNnet2Online::features_, OnlineFeatureInterface::IsLastFrame(), DecodableNnet2Online::left_context_, DecodableNnet2Online::opts_, DecodableNnet2OnlineOptions::pad_input, and DecodableNnet2Online::right_context_.

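bool DecodableNnet2Online::IsLastFrame(int32 frame) const {
  if (opts_.pad_input) {  // normal case
    return features_->IsLastFrame(frame);
  } else {
    return features_->IsLastFrame(frame + left_context_ + right_context_);
  }
}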
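
A small stand-alone sketch of the frame mapping may help here. It assumes that, when pad_input is false, the feature source is queried at frame + left_context_ + right_context_, which is consistent with the members this function references. The helper name IsLastOutputFrame and the last_input_frame parameter (the index of the final feature frame) are hypothetical and not part of Kaldi.

```cpp
#include <cassert>

// Hypothetical helper (not a Kaldi function): decides whether output frame
// `frame` is the last one, given the index of the last input feature frame.
inline bool IsLastOutputFrame(int frame, int last_input_frame, bool pad_input,
                              int left_context, int right_context) {
  assert(left_context >= 0 && right_context >= 0);
  // With padding, output frames line up one-to-one with input frames.
  if (pad_input) return frame == last_input_frame;
  // Without padding, the output sequence is shorter by the total context, so
  // the last output frame sits that many frames before the last input frame.
  return frame + left_context + right_context == last_input_frame;
}
```

For example, with 100 input frames and a context of 4 on each side, the unpadded output has 92 frames, and output frame 91 is the last one.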

◆ KALDI_DISALLOW_COPY_AND_ASSIGN()

KALDI_DISALLOW_COPY_AND_ASSIGN ( DecodableNnet2Online  )
private

◆ LogLikelihood()

BaseFloat LogLikelihood ( int32  frame,
int32  index 
)
virtual

Returns the scaled log likelihood.
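
As a rough sketch of what "scaled" means here, based on the ComputeForFrame() listing on this page: each network posterior is floored at 1.0e-20, converted to a log, reduced by the log-prior of the same pdf, and multiplied by acoustic_scale. The helper name ScaledLogLike and the use of plain float arguments are illustrative assumptions, not Kaldi API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not a Kaldi function): mirrors the per-element
// arithmetic applied to the posterior matrix in ComputeForFrame().
inline float ScaledLogLike(float posterior, float prior, float acoustic_scale) {
  assert(prior > 0.0f);
  // Floor first so that a zero posterior does not produce log(0).
  float floored = std::max(posterior, 1.0e-20f);
  // Subtracting the log-prior is the same as dividing by the prior.
  float loglike = std::log(floored) - std::log(prior);
  // Apply the acoustic scale last.
  return acoustic_scale * loglike;
}
```

A posterior equal to its prior gives zero, and a zero posterior gives a large negative but finite score.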

Implements DecodableInterface.

Definition at line 49 of file online-nnet2-decodable.cc.

References DecodableNnet2Online::begin_frame_, DecodableNnet2Online::ComputeForFrame(), KALDI_ASSERT, MatrixBase< Real >::NumRows(), DecodableNnet2Online::scaled_loglikes_, DecodableNnet2Online::trans_model_, and TransitionModel::TransitionIdToPdf().

Referenced by kaldi::nnet2::UnitTestNnetDecodable().

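BaseFloat DecodableNnet2Online::LogLikelihood(int32 frame, int32 index) {
  ComputeForFrame(frame);
  int32 pdf_id = trans_model_.TransitionIdToPdf(index);
  KALDI_ASSERT(frame >= begin_frame_ &&
               frame < begin_frame_ + scaled_loglikes_.NumRows());
  return scaled_loglikes_(frame - begin_frame_, pdf_id);
}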
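
The caching pattern behind LogLikelihood() and ComputeForFrame() can be sketched independently of the neural network. The class below is a hypothetical stand-in, not Kaldi code: BatchCache, its members, and the fake "computation" (frame * 10 + column) are all assumptions chosen to keep the example self-contained. It keeps one batch of rows starting at begin_frame_ and recomputes whenever a requested frame falls outside that batch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the batch cache (not a Kaldi class).
class BatchCache {
 public:
  BatchCache(int num_frames, int batch_size, int num_cols)
      : num_frames_(num_frames), batch_size_(batch_size), num_cols_(num_cols),
        begin_frame_(-1), num_computations_(0) {
    assert(num_frames > 0 && batch_size > 0 && num_cols > 0);
  }

  // Returns the cached value, computing a new batch on a cache miss.
  float Get(int frame, int col) {
    assert(frame >= 0 && frame < num_frames_);
    assert(col >= 0 && col < num_cols_);
    if (!IsCached(frame)) Compute(frame);
    return rows_[frame - begin_frame_][col];
  }

  int NumComputations() const { return num_computations_; }

 private:
  bool IsCached(int frame) const {
    return frame >= begin_frame_ &&
           frame < begin_frame_ + static_cast<int>(rows_.size());
  }

  // Fills the cache with up to batch_size_ rows starting at `frame`.
  void Compute(int frame) {
    int end = frame + batch_size_;
    if (end > num_frames_) end = num_frames_;
    rows_.clear();
    for (int t = frame; t < end; ++t) {
      std::vector<float> row(num_cols_);
      for (int c = 0; c < num_cols_; ++c) row[c] = t * 10.0f + c;
      rows_.push_back(row);
    }
    begin_frame_ = frame;
    ++num_computations_;
  }

  int num_frames_;
  int batch_size_;
  int num_cols_;
  int begin_frame_;
  int num_computations_;
  std::vector<std::vector<float> > rows_;
};
```

Because only one batch is kept, requesting an earlier frame than begin_frame_ triggers a fresh computation, so the pattern works best when frames are requested in increasing order.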

◆ NumFramesReady()

int32 NumFramesReady ( ) const
virtual

The call NumFramesReady() will return the number of frames currently available for this decodable object.

This is for use in setups where you don't want the decoder to block while waiting for input. This is newly added as of Jan 2014, and I hope, going forward, to rely on this mechanism more than IsLastFrame to know when to stop decoding.
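
A minimal sketch of how a caller might use this to avoid blocking: decode whatever is ready, then return and try again later. MockDecodable and AdvanceDecoding are hypothetical names introduced for illustration; they are not Kaldi's decoder classes, and the per-frame work is reduced to recording the frame index.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in exposing only the call discussed here.
struct MockDecodable {
  int frames_ready = 0;
  int NumFramesReady() const { return frames_ready; }
};

// Processes every frame that is ready but not yet decoded, then returns
// without waiting for more input. Returns the updated decoded-frame count.
inline int AdvanceDecoding(const MockDecodable &decodable, int num_decoded,
                           std::vector<int> *decoded_frames) {
  assert(decoded_frames != nullptr && num_decoded >= 0);
  int target = decodable.NumFramesReady();
  for (int frame = num_decoded; frame < target; ++frame) {
    decoded_frames->push_back(frame);  // Placeholder for real per-frame work.
  }
  return target > num_decoded ? target : num_decoded;
}
```

Calling AdvanceDecoding() again with no new frames is a no-op, which is what lets the caller poll without stalling on input.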

Reimplemented from DecodableInterface.

Definition at line 66 of file online-nnet2-decodable.cc.

References DecodableNnet2Online::features_, OnlineFeatureInterface::IsLastFrame(), DecodableNnet2Online::left_context_, OnlineFeatureInterface::NumFramesReady(), DecodableNnet2Online::opts_, DecodableNnet2OnlineOptions::pad_input, and DecodableNnet2Online::right_context_.

Referenced by DecodableNnet2Online::ComputeForFrame(), and kaldi::nnet2::UnitTestNnetDecodable().

int32 DecodableNnet2Online::NumFramesReady() const {
  int32 features_ready = features_->NumFramesReady();
  if (features_ready == 0)
    return 0;
  bool input_finished = features_->IsLastFrame(features_ready - 1);
  if (opts_.pad_input) {
    // normal case... we'll pad with duplicates of first + last frame to get the
    // required left and right context.
    if (input_finished) return features_ready;
    else return std::max<int32>(0, features_ready - right_context_);
  } else {
    return std::max<int32>(0, features_ready - right_context_ - left_context_);
  }
}
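
The arithmetic in this listing can be tried out in isolation. The function below is a stand-alone restatement with plain ints; the name OutputFramesReady and its parameters are illustrative assumptions rather than Kaldi API.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-alone version of the arithmetic in NumFramesReady().
// features_ready: frames available from the feature source.
// input_finished: true once the feature source has delivered its last frame.
inline int OutputFramesReady(int features_ready, bool input_finished,
                             bool pad_input, int left_context,
                             int right_context) {
  assert(features_ready >= 0 && left_context >= 0 && right_context >= 0);
  if (features_ready == 0) return 0;
  if (pad_input) {
    // Edge frames are duplicated as padding, so only right context that has
    // not arrived yet can hold output frames back.
    if (input_finished) return features_ready;
    return std::max(0, features_ready - right_context);
  }
  // Without padding, every output frame needs real context on both sides.
  return std::max(0, features_ready - right_context - left_context);
}
```

With 100 feature frames and a context of 4 on each side, this gives 96 output frames while input is still arriving and padding is on, 100 once input has finished, and 92 when padding is off.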

◆ NumIndices()

virtual int32 NumIndices ( ) const
inline virtual

Indices are one-based! This is for compatibility with OpenFst.

Implements DecodableInterface.

Definition at line 84 of file online-nnet2-decodable.h.

virtual int32 NumIndices() const { return trans_model_.NumTransitionIds(); }
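
One-based indexing matters to callers because OpenFst reserves label 0 for epsilon, so valid indices run from 1 to NumIndices() inclusive. The sketch below shows one way to handle such indices with a lookup table whose slot 0 is unused; the function IndexToPdf and the table layout are hypothetical and are not taken from TransitionModel.

```cpp
#include <cassert>
#include <vector>

// Hypothetical lookup (not Kaldi code): slot 0 of id_to_pdf is unused padding
// so that a one-based index can be used directly as the subscript.
inline int IndexToPdf(const std::vector<int> &id_to_pdf, int index) {
  int num_indices = static_cast<int>(id_to_pdf.size()) - 1;
  // Valid indices are 1..num_indices; 0 is reserved and never looked up.
  assert(index >= 1 && index <= num_indices);
  return id_to_pdf[index];
}
```

A loop over all indices would therefore run from 1 through NumIndices(), not from 0 to NumIndices() - 1.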

Member Data Documentation

◆ begin_frame_

int32 begin_frame_
private

◆ feat_dim_

int32 feat_dim_
private

Definition at line 97 of file online-nnet2-decodable.h.

Referenced by DecodableNnet2Online::ComputeForFrame().

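◆ features_

OnlineFeatureInterface* features_
private

◆ left_context_

int32 left_context_
private

◆ log_priors_

CuVector<BaseFloat> log_priors_
private

◆ nnet_

const AmNnet& nnet_
private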

int32 num_pdfs_
private

Definition at line 100 of file online-nnet2-decodable.h.

Referenced by DecodableNnet2Online::ComputeForFrame().

◆ opts_

DecodableNnet2OnlineOptions opts_
private

◆ right_context_

int32 right_context_
private

◆ scaled_loglikes_

Matrix<BaseFloat> scaled_loglikes_
private

◆ trans_model_

const TransitionModel& trans_model_
private

The documentation for this class was generated from the following files: