PitchFrameInfo Class Reference

Classes

struct  StateInfo
 

Public Member Functions

void Cleanup (PitchFrameInfo *prev_frame)
 This function resizes the arrays for this object and updates the reference counts for the previous object (by decrementing those reference counts when we destroy a StateInfo object).
 
void SetBestState (int32 best_state, std::vector< std::pair< int32, BaseFloat > > &lag_nccf)
 This function may be called for the last (most recent) PitchFrameInfo object with the best state (obtained from the externally held forward-costs).
 
int32 ComputeLatency (int32 max_latency)
 This function may be called on the last (most recent) PitchFrameInfo object; it computes how many frames of latency there are because the traceback has not yet settled on a single value for frames in the past.
 
bool UpdatePreviousBestState (PitchFrameInfo *prev_frame)
 This function updates.
 
 PitchFrameInfo (int32 num_states)
 This constructor is used for frame -1; it sets the costs to all zeros, the pov_nccf's to zero, and the backpointers to -1.
 
 PitchFrameInfo (PitchFrameInfo *prev)
 This constructor is used for subsequent frames (not -1).
 
void SetNccfPov (const VectorBase< BaseFloat > &nccf_pov)
 Record the nccf_pov value.
 
void ComputeBacktraces (const PitchExtractionOptions &opts, const VectorBase< BaseFloat > &nccf_pitch, const VectorBase< BaseFloat > &lags, const VectorBase< BaseFloat > &prev_forward_cost, std::vector< std::pair< int32, int32 > > *index_info, VectorBase< BaseFloat > *this_forward_cost)
 This function is used for frames apart from frame -1; the bulk of the Viterbi computation takes place inside it.
 

Private Attributes

std::vector< StateInfo > state_info_
 
int32 state_offset_
 The state index of the first entry in state_info_; this is initially zero, but may be nonzero after cleanup.
 
int32 cur_best_state_
 The current best state in the backtrace from the end.
 
PitchFrameInfo * prev_info_
 The structure for the previous frame.
 

Detailed Description

Definition at line 198 of file pitch-functions.cc.

Constructor & Destructor Documentation

◆ PitchFrameInfo() [1/2]

PitchFrameInfo ( int32  num_states)
explicit

This constructor is used for frame -1; it sets the costs to all zeros, the pov_nccf's to zero, and the backpointers to -1.

Definition at line 287 of file pitch-functions.cc.

    : state_info_(num_states), state_offset_(0),
      cur_best_state_(-1), prev_info_(NULL) { }

◆ PitchFrameInfo() [2/2]

This constructor is used for subsequent frames (not -1).

Definition at line 295 of file pitch-functions.cc.

    :
    state_info_(prev_info->state_info_.size()), state_offset_(0),
    cur_best_state_(-1), prev_info_(prev_info) { }

Member Function Documentation

◆ Cleanup()

void Cleanup ( PitchFrameInfo * prev_frame)

This function resizes the arrays for this object and updates the reference counts for the previous object (by decrementing those reference counts when we destroy a StateInfo object).

A StateInfo object is considered to be destroyed when we delete it, not when its reference count goes to zero.

Definition at line 546 of file pitch-functions.cc.

References KALDI_ERR.

{
  KALDI_ERR << "Cleanup not implemented.";
}

◆ ComputeBacktraces()

void ComputeBacktraces ( const PitchExtractionOptions &  opts,
const VectorBase< BaseFloat > &  nccf_pitch,
const VectorBase< BaseFloat > &  lags,
const VectorBase< BaseFloat > &  prev_forward_cost,
std::vector< std::pair< int32, int32 > > *  index_info,
VectorBase< BaseFloat > *  this_forward_cost 
)

This function is used for frames apart from frame -1; the bulk of the Viterbi computation takes place inside it.

Parameters
opts               The options as provided by the user.
nccf_pitch         The nccf as computed for the pitch computation (with ballast).
nccf_pov           The nccf as computed for the POV computation (without ballast).
lags               The log-spaced lags at which nccf_pitch and nccf_pov are sampled.
prev_forward_cost  The forward-cost vector for the previous frame.
index_info         A pointer to a temporary vector used by this function.
this_forward_cost  The forward-cost vector for this frame (to be computed).
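
Read from the source listing for this function, the quantity written to this_forward_cost can be summarized as the recursion below. The symbols are notation introduced here for illustration, not taken from the source: C_t(i) is the forward cost of state i on this frame, C_{t-1}(j) is the previous frame's forward cost, \ell_t(i) is the local cost from ComputeLocalCost(), and \lambda is inter_frame_factor.

```latex
\[
C_t(i) \;=\; \ell_t(i) \;+\; \min_{j}\,\Bigl[\,\lambda\,(j-i)^2 \;+\; C_{t-1}(j)\,\Bigr],
\qquad
\lambda \;=\; \mathrm{penalty\_factor}\cdot\bigl(\log(1+\mathrm{delta\_pitch})\bigr)^2
\]
```

The backpointer stored for state i is the minimizing j.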

Definition at line 306 of file pitch-functions.cc.

References VectorBase< Real >::AddVec(), kaldi::ComputeLocalCost(), PitchFrameInfo::cur_best_state_, VectorBase< Real >::Data(), PitchExtractionOptions::delta_pitch, VectorBase< Real >::Dim(), rnnlm::i, rnnlm::j, kaldi::kUndefined, kaldi::Log(), PitchExtractionOptions::penalty_factor, and PitchFrameInfo::state_info_.

{
  int32 num_states = nccf_pitch.Dim();

  Vector<BaseFloat> local_cost(num_states, kUndefined);
  ComputeLocalCost(nccf_pitch, lags, opts, &local_cost);

  const BaseFloat delta_pitch_sq = pow(Log(1.0 + opts.delta_pitch), 2.0),
      inter_frame_factor = delta_pitch_sq * opts.penalty_factor;

  // index local_cost, prev_forward_cost and this_forward_cost using raw pointer
  // indexing not operator (), since this is the very inner loop and a lot of
  // time is taken here.
  const BaseFloat *prev_forward_cost = prev_forward_cost_vec.Data();
  BaseFloat *this_forward_cost = this_forward_cost_vec->Data();

  if (index_info->empty())
    index_info->resize(num_states);

  // make it a reference for more concise indexing.
  std::vector<std::pair<int32, int32> > &bounds = *index_info;

  /* bounds[i].first will be a lower bound on the backpointer for state i,
     bounds[i].second will be an upper bound on it.  We progressively tighten
     these bounds till we know the backpointers exactly.
  */

  if (pitch_use_naive_search) {
    // This branch is only taken in unit-testing code.
    for (int32 i = 0; i < num_states; i++) {
      BaseFloat best_cost = std::numeric_limits<BaseFloat>::infinity();
      int32 best_j = -1;
      for (int32 j = 0; j < num_states; j++) {
        BaseFloat this_cost = (j - i) * (j - i) * inter_frame_factor
            + prev_forward_cost[j];
        if (this_cost < best_cost) {
          best_cost = this_cost;
          best_j = j;
        }
      }
      this_forward_cost[i] = best_cost;
      state_info_[i].backpointer = best_j;
    }
  } else {
    int32 last_backpointer = 0;
    for (int32 i = 0; i < num_states; i++) {
      int32 start_j = last_backpointer;
      BaseFloat best_cost = (start_j - i) * (start_j - i) * inter_frame_factor
          + prev_forward_cost[start_j];
      int32 best_j = start_j;

      for (int32 j = start_j + 1; j < num_states; j++) {
        BaseFloat this_cost = (j - i) * (j - i) * inter_frame_factor
            + prev_forward_cost[j];
        if (this_cost < best_cost) {
          best_cost = this_cost;
          best_j = j;
        } else {  // as soon as the costs stop improving, we stop searching.
          break;  // this is a loose lower bound we're getting.
        }
      }
      state_info_[i].backpointer = best_j;
      this_forward_cost[i] = best_cost;
      bounds[i].first = best_j;  // this is now a lower bound on the
                                 // backpointer.
      bounds[i].second = num_states - 1;  // we have no meaningful upper bound
                                          // yet.
      last_backpointer = best_j;
    }

    // We iterate, progressively refining the upper and lower bounds until they
    // meet and we know that the resulting backtraces are optimal.  Each
    // iteration takes time linear in num_states.  We won't normally iterate as
    // far as num_states; normally we only do two iterations; when printing out
    // the number of iterations, it's rarely more than that (once I saw seven
    // iterations).  Anyway, this part of the computation does not dominate.
    for (int32 iter = 0; iter < num_states; iter++) {
      bool changed = false;
      if (iter % 2 == 0) {  // go backwards through the states
        last_backpointer = num_states - 1;
        for (int32 i = num_states - 1; i >= 0; i--) {
          int32 lower_bound = bounds[i].first,
              upper_bound = std::min(last_backpointer, bounds[i].second);
          if (upper_bound == lower_bound) {
            last_backpointer = lower_bound;
            continue;
          }
          BaseFloat best_cost = this_forward_cost[i];
          int32 best_j = state_info_[i].backpointer, initial_best_j = best_j;

          if (best_j == upper_bound) {
            // if best_j already equals upper bound, don't bother tightening the
            // upper bound, we'll tighten the lower bound when the time comes.
            last_backpointer = best_j;
            continue;
          }
          // Below, we have j > lower_bound + 1 because we know we've already
          // evaluated lower_bound and lower_bound + 1 [via knowledge of
          // this algorithm.]
          for (int32 j = upper_bound; j > lower_bound + 1; j--) {
            BaseFloat this_cost = (j - i) * (j - i) * inter_frame_factor
                + prev_forward_cost[j];
            if (this_cost < best_cost) {
              best_cost = this_cost;
              best_j = j;
            } else {  // as soon as the costs stop improving, we stop searching,
                      // unless the best j is still lower than j, in which case
                      // we obviously need to keep moving.
              if (best_j > j)
                break;  // this is a loose lower bound we're getting.
            }
          }
          // our "best_j" is now an upper bound on the backpointer.
          bounds[i].second = best_j;
          if (best_j != initial_best_j) {
            this_forward_cost[i] = best_cost;
            state_info_[i].backpointer = best_j;
            changed = true;
          }
          last_backpointer = best_j;
        }
      } else {  // go forwards through the states.
        last_backpointer = 0;
        for (int32 i = 0; i < num_states; i++) {
          int32 lower_bound = std::max(last_backpointer, bounds[i].first),
              upper_bound = bounds[i].second;
          if (upper_bound == lower_bound) {
            last_backpointer = lower_bound;
            continue;
          }
          BaseFloat best_cost = this_forward_cost[i];
          int32 best_j = state_info_[i].backpointer, initial_best_j = best_j;

          if (best_j == lower_bound) {
            // if best_j already equals lower bound, we don't bother tightening
            // the lower bound, we'll tighten the upper bound when the time
            // comes.
            last_backpointer = best_j;
            continue;
          }
          // Below, we have j < upper_bound because we know we've already
          // evaluated that point.
          for (int32 j = lower_bound; j < upper_bound - 1; j++) {
            BaseFloat this_cost = (j - i) * (j - i) * inter_frame_factor
                + prev_forward_cost[j];
            if (this_cost < best_cost) {
              best_cost = this_cost;
              best_j = j;
            } else {  // as soon as the costs stop improving, we stop searching,
                      // unless the best j is still higher than j, in which case
                      // we obviously need to keep moving.
              if (best_j < j)
                break;  // this is a loose lower bound we're getting.
            }
          }
          // our "best_j" is now a lower bound on the backpointer.
          bounds[i].first = best_j;
          if (best_j != initial_best_j) {
            this_forward_cost[i] = best_cost;
            state_info_[i].backpointer = best_j;
            changed = true;
          }
          last_backpointer = best_j;
        }
      }
      if (!changed)
        break;
    }
  }
  // The next statement is needed due to RecomputeBacktraces: we have to
  // invalidate the previously computed best-state info.
  cur_best_state_ = -1;
  this_forward_cost_vec->AddVec(1.0, local_cost);
}
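
As a rough illustration of what ComputeBacktraces() computes, the sketch below reproduces only the exhaustive search over previous states, using plain std::vector instead of Kaldi's vector types. NaiveBacktraces is a hypothetical helper name introduced here, not a Kaldi function, and the local cost is deliberately left out.

```cpp
#include <cassert>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Kaldi). For each state i, finds the
// previous state j that minimizes
//   (j - i)^2 * inter_frame_factor + prev_forward_cost[j]
// and returns (backpointer, transition cost) for every state.
// The local cost is NOT added here.
std::vector<std::pair<int, float>> NaiveBacktraces(
    const std::vector<float> &prev_forward_cost, float inter_frame_factor) {
  const int num_states = static_cast<int>(prev_forward_cost.size());
  std::vector<std::pair<int, float>> result(num_states);
  for (int i = 0; i < num_states; i++) {
    float best_cost = std::numeric_limits<float>::infinity();
    int best_j = -1;
    for (int j = 0; j < num_states; j++) {
      float this_cost =
          (j - i) * (j - i) * inter_frame_factor + prev_forward_cost[j];
      if (this_cost < best_cost) {
        best_cost = this_cost;
        best_j = j;
      }
    }
    result[i] = std::make_pair(best_j, best_cost);
  }
  return result;
}
```

This search is quadratic in the number of states. The non-naive branch of the real function starts each search at the previous state's backpointer and then tightens lower and upper bounds in alternating passes, which suggests it relies on the backpointers being non-decreasing in the state index.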

◆ ComputeLatency()

int32 ComputeLatency ( int32  max_latency)

This function may be called on the last (most recent) PitchFrameInfo object; it computes how many frames of latency there are because the traceback has not yet settled on a single value for frames in the past.

It actually returns the minimum of max_latency and the actual latency, which is an optimization because we won't care about latency past a user-specified maximum latency.

Definition at line 514 of file pitch-functions.cc.

References KALDI_ASSERT, PitchFrameInfo::prev_info_, PitchFrameInfo::state_info_, and PitchFrameInfo::state_offset_.

{
  if (max_latency <= 0) return 0;

  int32 latency = 0;

  // This function would naturally be recursive, but we have coded this to avoid
  // recursion, which would otherwise eat up the stack.  Think of it as a static
  // member function, except we do use "this" right at the beginning.
  // This function is called only on the most recent PitchFrameInfo object.
  int32 num_states = state_info_.size();
  int32 min_living_state = 0, max_living_state = num_states - 1;
  PitchFrameInfo *this_info = this;  // it will change in the loop.


  for (; this_info != NULL && latency < max_latency;) {
    int32 offset = this_info->state_offset_;
    KALDI_ASSERT(min_living_state >= offset &&
                 max_living_state - offset < this_info->state_info_.size());
    min_living_state =
        this_info->state_info_[min_living_state - offset].backpointer;
    max_living_state =
        this_info->state_info_[max_living_state - offset].backpointer;
    if (min_living_state == max_living_state) {
      return latency;
    }
    this_info = this_info->prev_info_;
    if (this_info != NULL)  // avoid incrementing latency for frame -1,
      latency++;            // as it's not a real frame.
  }
  return latency;
}
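
The latency computation can be tried in isolation with a simplified model of the frame chain. In the sketch below, Frame and LatencySketch are hypothetical names introduced for illustration; the model drops state_offset_ and keeps only the backpointers and the link to the previous frame.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for PitchFrameInfo (hypothetical; no state_offset_).
struct Frame {
  std::vector<int> backpointer;  // backpointer[s]: best previous state for s
  const Frame *prev;             // nullptr for the initial frame ("frame -1")
};

// Follows the backpointers of the lowest and highest state backwards in time
// until both paths reach the same state, counting the frames crossed.
// The result is capped at max_latency.
int LatencySketch(const Frame *last, int max_latency) {
  if (max_latency <= 0) return 0;
  assert(last != nullptr && !last->backpointer.empty());
  int latency = 0;
  int min_living = 0;
  int max_living = static_cast<int>(last->backpointer.size()) - 1;
  for (const Frame *f = last; f != nullptr && latency < max_latency;) {
    min_living = f->backpointer[min_living];
    max_living = f->backpointer[max_living];
    if (min_living == max_living) return latency;  // traceback has converged
    f = f->prev;
    if (f != nullptr) latency++;
  }
  return latency;
}
```

Tracking only the lowest and highest surviving states is sufficient if backpointers are non-decreasing in the state index, because every other path is then sandwiched between those two.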

◆ SetBestState()

void SetBestState ( int32  best_state,
std::vector< std::pair< int32, BaseFloat > > &  lag_nccf 
)

This function may be called for the last (most recent) PitchFrameInfo object with the best state (obtained from the externally held forward-costs).

It traces back as far as needed to set cur_best_state_ on each frame, stopping at the first frame whose cur_best_state_ already equals the state being traced. As it goes, it writes the lag index and pov_nccf into lag_nccf, starting at the last element (the entry for the most recent frame) and moving toward earlier frames.

Definition at line 486 of file pitch-functions.cc.

References PitchFrameInfo::cur_best_state_, KALDI_ASSERT, PitchFrameInfo::prev_info_, PitchFrameInfo::state_info_, and PitchFrameInfo::state_offset_.

{

  // This function would naturally be recursive, but we have coded this to avoid
  // recursion, which would otherwise eat up the stack.  Think of it as a static
  // member function, except we do use "this" right at the beginning.

  std::vector<std::pair<int32, BaseFloat> >::reverse_iterator iter = lag_nccf.rbegin();

  PitchFrameInfo *this_info = this;  // it will change in the loop.
  while (this_info != NULL) {
    PitchFrameInfo *prev_info = this_info->prev_info_;
    if (best_state == this_info->cur_best_state_)
      return;  // no change
    if (prev_info != NULL)  // don't write anything for frame -1.
      iter->first = best_state;
    size_t state_info_index = best_state - this_info->state_offset_;
    KALDI_ASSERT(state_info_index < this_info->state_info_.size());
    this_info->cur_best_state_ = best_state;
    best_state = this_info->state_info_[state_info_index].backpointer;
    if (prev_info != NULL)  // don't write anything for frame -1.
      iter->second = this_info->state_info_[state_info_index].pov_nccf;
    this_info = prev_info;
    if (this_info != NULL) ++iter;
  }
}
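
The traceback performed by SetBestState() can be sketched on a simplified frame chain. TraceFrame and TracebackSketch are hypothetical names introduced here; the sketch omits state_offset_ and takes the output vector by pointer, so it is a model of the control flow rather than the Kaldi implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for PitchFrameInfo (hypothetical; no state_offset_).
struct TraceFrame {
  std::vector<int> backpointer;  // best previous state for each state
  std::vector<float> pov_nccf;   // per-state NCCF value used for POV
  int cur_best_state;            // -1 until a traceback has visited the frame
  TraceFrame *prev;              // nullptr for the initial frame ("frame -1")
};

// Walks back from the most recent frame, filling lag_nccf from its last
// element toward the front. Stops early once a frame already holds the state
// being traced, since all earlier frames are then unchanged.
void TracebackSketch(TraceFrame *last, int best_state,
                     std::vector<std::pair<int, float>> *lag_nccf) {
  auto iter = lag_nccf->rbegin();
  for (TraceFrame *f = last; f != nullptr;) {
    TraceFrame *prev = f->prev;
    if (best_state == f->cur_best_state) return;  // no change from here back
    assert(best_state >= 0 &&
           static_cast<std::size_t>(best_state) < f->backpointer.size());
    if (prev != nullptr) {  // the initial frame has no output slot
      assert(iter != lag_nccf->rend());
      iter->first = best_state;
      iter->second = f->pov_nccf[best_state];
    }
    f->cur_best_state = best_state;
    best_state = f->backpointer[best_state];
    f = prev;
    if (f != nullptr) ++iter;
  }
}
```

The early return is what keeps repeated calls cheap: when a new frame arrives and the best path still passes through the same state on an older frame, the walk stops there instead of rewriting the whole history.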

◆ SetNccfPov()

void SetNccfPov ( const VectorBase< BaseFloat > &  nccf_pov)

Record the nccf_pov value.

Parameters
nccf_pov  The nccf as computed for the POV computation (without ballast).

Definition at line 299 of file pitch-functions.cc.

References VectorBase< Real >::Dim(), rnnlm::i, KALDI_ASSERT, and PitchFrameInfo::state_info_.

{
  int32 num_states = nccf_pov.Dim();
  KALDI_ASSERT(num_states == state_info_.size());
  for (int32 i = 0; i < num_states; i++)
    state_info_[i].pov_nccf = nccf_pov(i);
}

◆ UpdatePreviousBestState()

bool UpdatePreviousBestState ( PitchFrameInfo * prev_frame)

This function updates.

Member Data Documentation

◆ cur_best_state_

int32 cur_best_state_
private

The current best state in the backtrace from the end.

Definition at line 278 of file pitch-functions.cc.

Referenced by PitchFrameInfo::ComputeBacktraces(), and PitchFrameInfo::SetBestState().

◆ prev_info_

PitchFrameInfo* prev_info_
private

The structure for the previous frame.

Definition at line 281 of file pitch-functions.cc.

Referenced by PitchFrameInfo::ComputeLatency(), and PitchFrameInfo::SetBestState().

◆ state_info_

◆ state_offset_

int32 state_offset_
private

The state index of the first entry in state_info_; this is initially zero, but may be nonzero after cleanup.

Definition at line 275 of file pitch-functions.cc.

Referenced by PitchFrameInfo::ComputeLatency(), and PitchFrameInfo::SetBestState().


The documentation for this class was generated from the following file: