"Holder types"

Holder types are used as template arguments to the Table types (see "Table types and related functions"); they help the Table types read and write objects of type SomeHolder::T. See "Holders as helpers to Table classes" for more information.

Classes

class  KaldiObjectHolder< KaldiType >
 KaldiObjectHolder works for Kaldi objects that have the "standard" Read and Write functions, and a copy constructor.
 
class  BasicHolder< BasicType >
 BasicHolder is valid for float, double, bool, and integer types.
 
class  BasicVectorHolder< BasicType >
 A Holder for a vector of basic types.
 
class  BasicVectorVectorHolder< BasicType >
 BasicVectorVectorHolder is a Holder for a vector of vectors of a basic type.
 
class  BasicPairVectorHolder< BasicType >
 BasicPairVectorHolder is a Holder for a vector of pairs of a basic type.
 
class  TokenHolder
 
class  TokenVectorHolder
 
class  HtkMatrixHolder
 
class  SphinxMatrixHolder< kFeatDim >
 A class for reading/writing Sphinx format matrices.
 

Functions

template<class T >
bool ExtractObjectRange (const T &input, const std::string &range, T *output)
 This templated function exists so that we can write .scp files with 'object ranges' specified: the canonical example is a [first:last] range of rows of a matrix, or [first-row:last-row,first-column:last-column] of a matrix.
 
template<class Real >
bool ExtractObjectRange (const Matrix< Real > &input, const std::string &range, Matrix< Real > *output)
 The template is specialized with a version that actually does something, for types Matrix<float> and Matrix<double>.
 
template<class Real >
bool ExtractObjectRange (const Vector< Real > &input, const std::string &range, Vector< Real > *output)
 The template is specialized for types Vector<float> and Vector<double>.
 
bool ExtractObjectRange (const GeneralMatrix &input, const std::string &range, GeneralMatrix *output)
 GeneralMatrix is always of type BaseFloat.
 
template<class Real >
bool ExtractObjectRange (const CompressedMatrix &input, const std::string &range, Matrix< Real > *output)
 CompressedMatrix is always of type BaseFloat, but it is more efficient to provide a template, since this overload uses CompressedMatrix's own conversion to Matrix<Real>.
 
bool ExtractRangeSpecifier (const std::string &rxfilename_with_range, std::string *data_rxfilename, std::string *range)
 

Detailed Description

Holder types are used as template arguments to the Table types (see "Table types and related functions"); they help the Table types read and write objects of type SomeHolder::T. See "Holders as helpers to Table classes" for more information.
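
As a rough illustration of the idea, the sketch below shows a toy holder and a function templated on it. ToyIntHolder and RoundTrip are hypothetical names invented for this example. They are only loosely modeled on the interface described above (a nested type T plus Read and Write) and are not part of Kaldi; the real Holder requirements include further members not shown here.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical toy holder (not a Kaldi class). It exposes the object type as
// T and knows how to write and read one object of that type.
class ToyIntHolder {
 public:
  typedef int T;  // the object type this holder reads and writes

  // Writes one object in text form; returns false on stream failure.
  static bool Write(std::ostream &os, bool /*binary*/, const T &t) {
    os << t << '\n';
    return os.good();
  }

  // Reads one object into the holder; returns false on parse failure.
  bool Read(std::istream &is) { return static_cast<bool>(is >> value_); }

  // Returns the most recently read object.
  T &Value() { return value_; }

 private:
  T value_ = 0;
};

// Hypothetical stand-in for a Table class: it is templated on the holder and
// touches the object only through Holder::T, Holder::Write and Holder::Read.
template <class Holder>
typename Holder::T RoundTrip(const typename Holder::T &obj) {
  std::ostringstream out;
  const bool wrote = Holder::Write(out, false, obj);
  assert(wrote);
  (void)wrote;  // silence unused-variable warnings when NDEBUG is defined

  std::istringstream in(out.str());
  Holder holder;
  const bool read = holder.Read(in);
  assert(read);
  (void)read;
  return holder.Value();
}
```

A call such as RoundTrip<ToyIntHolder>(42) writes the value through the holder and reads it back, which is the division of labor the description above refers to: the templated code stays generic, and the holder supplies the type-specific I/O.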

Function Documentation

◆ ExtractObjectRange() [1/5]

bool ExtractObjectRange ( const GeneralMatrix &  input,
const std::string &  range,
GeneralMatrix *  output 
)

GeneralMatrix is always of type BaseFloat.

Definition at line 88 of file kaldi-holder.cc.

References GeneralMatrix::Clear(), GeneralMatrix::GetCompressedMatrix(), GeneralMatrix::GetFullMatrix(), GeneralMatrix::GetMatrix(), KALDI_ASSERT, kaldi::kCompressedMatrix, kaldi::kFullMatrix, kaldi::kSparseMatrix, GeneralMatrix::SwapFullMatrix(), and GeneralMatrix::Type().

Referenced by kaldi::ExtractObjectRange(), KaldiObjectHolder< KaldiType >::ExtractRange(), and kaldi::ReadKaldiObject().

bool ExtractObjectRange(const GeneralMatrix &input, const std::string &range,
                        GeneralMatrix *output) {
  // We just inspect input's type and forward to the correct implementation
  // if available. For kSparseMatrix, we do just fairly inefficient conversion
  // to a full matrix.
  Matrix<BaseFloat> output_mat;
  if (input.Type() == kFullMatrix) {
    const Matrix<BaseFloat> &in = input.GetFullMatrix();
    ExtractObjectRange(in, range, &output_mat);
  } else if (input.Type() == kCompressedMatrix) {
    const CompressedMatrix &in = input.GetCompressedMatrix();
    ExtractObjectRange(in, range, &output_mat);
  } else {
    KALDI_ASSERT(input.Type() == kSparseMatrix);
    // NOTE: this is fairly inefficient, so if this happens to be bottleneck
    // it should be re-implemented more efficiently.
    Matrix<BaseFloat> input_mat;
    input.GetMatrix(&input_mat);
    ExtractObjectRange(input_mat, range, &output_mat);
  }
  output->Clear();
  output->SwapFullMatrix(&output_mat);
  return true;
}

◆ ExtractObjectRange() [2/5]

bool ExtractObjectRange ( const CompressedMatrix &  input,
const std::string &  range,
Matrix< Real > *  output 
)

CompressedMatrix is always of type BaseFloat, but it is more efficient to provide a template, since this overload uses CompressedMatrix's own conversion to Matrix<Real>.

Definition at line 114 of file kaldi-holder.cc.

References CompressedMatrix::CopyToMat(), kaldi::ExtractObjectRange(), KALDI_ERR, kaldi::kUndefined, CompressedMatrix::NumCols(), CompressedMatrix::NumRows(), kaldi::ParseMatrixRangeSpecifier(), and Matrix< Real >::Resize().

template <class Real>
bool ExtractObjectRange(const CompressedMatrix &input, const std::string &range,
                        Matrix<Real> *output) {
  std::vector<int32> row_range, col_range;

  if (!ParseMatrixRangeSpecifier(range, input.NumRows(), input.NumCols(),
                                 &row_range, &col_range)) {
    KALDI_ERR << "Could not parse range specifier \"" << range << "\".";
  }

  int32 row_size = std::min(row_range[1], input.NumRows() - 1)
                   - row_range[0] + 1,
        col_size = col_range[1] - col_range[0] + 1;

  output->Resize(row_size, col_size, kUndefined);
  input.CopyToMat(row_range[0], col_range[0], output);
  return true;
}

◆ ExtractObjectRange() [3/5]

bool ExtractObjectRange ( const Matrix< Real > &  input,
const std::string &  range,
Matrix< Real > *  output 
)

The template is specialized with a version that actually does something, for types Matrix<float> and Matrix<double>.

We can later add versions of this template for other types, such as Vector, which can meaningfully have ranges extracted.

Definition at line 139 of file kaldi-holder.cc.

References MatrixBase< Real >::CopyFromMat(), kaldi::ExtractObjectRange(), KALDI_ERR, kaldi::kUndefined, MatrixBase< Real >::NumCols(), MatrixBase< Real >::NumRows(), kaldi::ParseMatrixRangeSpecifier(), MatrixBase< Real >::Range(), and Matrix< Real >::Resize().

template <class Real>
bool ExtractObjectRange(const Matrix<Real> &input, const std::string &range,
                        Matrix<Real> *output) {
  std::vector<int32> row_range, col_range;

  if (!ParseMatrixRangeSpecifier(range, input.NumRows(), input.NumCols(),
                                 &row_range, &col_range)) {
    KALDI_ERR << "Could not parse range specifier \"" << range << "\".";
  }

  int32 row_size = std::min(row_range[1], input.NumRows() - 1)
                   - row_range[0] + 1,
        col_size = col_range[1] - col_range[0] + 1;
  output->Resize(row_size, col_size, kUndefined);
  output->CopyFromMat(input.Range(row_range[0], row_size,
                                  col_range[0], col_size));
  return true;
}

◆ ExtractObjectRange() [4/5]

bool ExtractObjectRange ( const Vector< Real > &  input,
const std::string &  range,
Vector< Real > *  output 
)

The template is specialized for types Vector<float> and Vector<double>.

Definition at line 164 of file kaldi-holder.cc.

References VectorBase< Real >::CopyFromVec(), VectorBase< Real >::Dim(), kaldi::ExtractObjectRange(), KALDI_ERR, KALDI_WARN, kaldi::kUndefined, VectorBase< Real >::Range(), Vector< Real >::Resize(), kaldi::SplitStringToIntegers(), and kaldi::SplitStringToVector().

template <class Real>
bool ExtractObjectRange(const Vector<Real> &input, const std::string &range,
                        Vector<Real> *output) {
  if (range.empty()) {
    KALDI_ERR << "Empty range specifier.";
    return false;
  }
  std::vector<std::string> splits;
  SplitStringToVector(range, ",", false, &splits);
  if (!((splits.size() == 1 && !splits[0].empty()))) {
    KALDI_ERR << "Invalid range specifier for vector: " << range;
    return false;
  }
  std::vector<int32> index_range;
  bool status = true;
  if (splits[0] != ":")
    status = SplitStringToIntegers(splits[0], ":", false, &index_range);

  if (index_range.size() == 0) {
    index_range.push_back(0);
    index_range.push_back(input.Dim() - 1);
  }

  // Length tolerance of 3 -- 2 to account for edge effects when
  // frame-length is 25ms and frame-shift is 10ms, and 1 for rounding effects
  // since segments are usually retained up to 2 decimal places.
  int32 length_tolerance = 3;
  if (!(status && index_range.size() == 2 &&
        index_range[0] >= 0 && index_range[0] <= index_range[1] &&
        index_range[1] < input.Dim() + length_tolerance)) {
    KALDI_ERR << "Invalid range specifier: " << range
              << " for vector of size " << input.Dim();
    return false;
  }

  if (index_range[1] >= input.Dim())
    KALDI_WARN << "Range " << index_range[0] << ":" << index_range[1]
               << " goes beyond the vector dimension " << input.Dim();
  int32 size = std::min(index_range[1], input.Dim() - 1) - index_range[0] + 1;
  output->Resize(size, kUndefined);
  output->CopyFromVec(input.Range(index_range[0], size));
  return true;
}
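
The length-tolerance rule in the listing above can be tried without Kaldi. The following is a minimal sketch that assumes std::vector<float> in place of Vector<Real>. ParseNonNegativeInt and ExtractVectorRangeSketch are hypothetical names introduced for this example. Unlike the listing, the sketch returns false rather than reporting through KALDI_ERR, and it adds an explicit guard for a start index that lies beyond the last valid element.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: parses a short string of decimal digits.
bool ParseNonNegativeInt(const std::string &s, int *out) {
  if (s.empty() || s.size() > 9) return false;  // size cap avoids overflow
  int value = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return false;
    value = value * 10 + (c - '0');
  }
  *out = value;
  return true;
}

// Hypothetical stand-in for the Vector<Real> overload: accepts ":" (whole
// vector) or "first:last" (inclusive), with the same tolerance of 3 on last.
bool ExtractVectorRangeSketch(const std::vector<float> &input,
                              const std::string &range,
                              std::vector<float> *output) {
  assert(output != nullptr);
  const int dim = static_cast<int>(input.size());
  int first = 0, last = dim - 1;
  if (range != ":") {
    const std::string::size_type colon = range.find(':');
    if (colon == std::string::npos) return false;
    if (!ParseNonNegativeInt(range.substr(0, colon), &first) ||
        !ParseNonNegativeInt(range.substr(colon + 1), &last))
      return false;
  }
  const int kLengthTolerance = 3;
  if (first > last || last >= dim + kLengthTolerance) return false;

  // Clamp the end of the range to the last valid index, as the listing does.
  const int size = std::min(last, dim - 1) - first + 1;
  if (size <= 0) return false;  // extra guard, not present in the listing
  output->assign(input.begin() + first, input.begin() + first + size);
  return true;
}
```

For a five-element input, the range "3:6" is accepted and clamped to elements 3 and 4, while "3:8" is rejected because 8 is not less than 5 + 3.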

◆ ExtractObjectRange() [5/5]

bool kaldi::ExtractObjectRange ( const T &  input,
const std::string &  range,
T *  output 
)

This templated function exists so that we can write .scp files with 'object ranges' specified: the canonical example is a [first:last] range of rows of a matrix, or [first-row:last-row,first-column:last-column] of a matrix.

We can also support [begin-time:end-time] of a wave file. The string 'range' is whatever is in the square brackets; it is parsed inside this function. This function returns true if the partial object was successfully extracted, and false if there was an error such as an invalid range. The generic version of this function just fails; we overload the template whenever we need it for a specific class.

Definition at line 233 of file kaldi-holder.h.

References kaldi::ExtractObjectRange(), kaldi::ExtractRangeSpecifier(), and KALDI_ERR.

template <class T>
bool ExtractObjectRange(const T &input, const std::string &range, T *output) {
  KALDI_ERR << "Ranges not supported for objects of this type.";
  return false;
}
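
The "generic version fails, overloads do the work" design can be sketched outside Kaldi as follows. ExtractRangeSketch is a hypothetical name, std::vector stands in for Matrix and Vector, and the overload understands only the range ":" to keep the example short. The sketch returns false where the real generic version reports an error through KALDI_ERR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Generic fallback: for types with no meaningful notion of a range, fail.
template <class T>
bool ExtractRangeSketch(const T & /*input*/, const std::string & /*range*/,
                        T *output) {
  assert(output != nullptr);
  return false;
}

// Overload for a sliceable type. It is more specialized than the generic
// template, so overload resolution prefers it for std::vector arguments.
// Simplification for this sketch: only ":" (the whole object) is accepted.
template <class Elem>
bool ExtractRangeSketch(const std::vector<Elem> &input,
                        const std::string &range,
                        std::vector<Elem> *output) {
  assert(output != nullptr);
  if (range != ":") return false;
  *output = input;
  return true;
}
```

Calling ExtractRangeSketch on a double selects the generic fallback and fails, whereas calling it on a std::vector<int> selects the overload. Code that is templated on the object type can therefore call the function unconditionally and let overload resolution decide.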

◆ ExtractRangeSpecifier()

bool ExtractRangeSpecifier ( const std::string &  rxfilename_with_range,
std::string *  data_rxfilename,
std::string *  range 
)

Definition at line 213 of file kaldi-holder.cc.

References KALDI_ERR, and kaldi::SplitStringToVector().

Referenced by kaldi::ExtractObjectRange(), RandomAccessTableReaderScriptImpl< Holder >::HasKeyInternal(), SequentialTableReaderScriptImpl< Holder >::NextScpLine(), and kaldi::ReadKaldiObject().

bool ExtractRangeSpecifier(const std::string &rxfilename_with_range,
                           std::string *data_rxfilename,
                           std::string *range) {
  if (rxfilename_with_range.empty() ||
      rxfilename_with_range[rxfilename_with_range.size()-1] != ']')
    KALDI_ERR << "ExtractRangeRspecifier called wrongly.";
  std::vector<std::string> splits;
  SplitStringToVector(rxfilename_with_range, "[", false, &splits);
  if (splits.size() == 2 && !splits[0].empty() && splits[1].size() > 1) {
    *data_rxfilename = splits[0];
    range->assign(splits[1], 0, splits[1].size()-1);
    return true;
  }
  return false;
}
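
To make the splitting rule concrete, here is a standalone sketch that uses only std::string. SplitRangeSpecifierSketch is a hypothetical name and the input strings in the usage note are made up for illustration. One deliberate difference from the listing: when the input does not end in ']', the sketch returns false instead of reporting through KALDI_ERR.

```cpp
#include <cassert>
#include <string>

// Hypothetical standalone version: splits "data[range]" into "data" and
// "range". Requires exactly one '[', a non-empty prefix and a non-empty range.
bool SplitRangeSpecifierSketch(const std::string &with_range,
                               std::string *data, std::string *range) {
  assert(data != nullptr && range != nullptr);
  if (with_range.empty() || with_range.back() != ']') return false;

  const std::string::size_type open = with_range.find('[');
  if (open == std::string::npos || open == 0) return false;  // no prefix
  if (with_range.find('[', open + 1) != std::string::npos) return false;
  if (open + 2 >= with_range.size()) return false;  // "[]" has no range

  *data = with_range.substr(0, open);
  // Drop the opening '[' and the trailing ']'.
  *range = with_range.substr(open + 1, with_range.size() - open - 2);
  return true;
}
```

Given the input "foo.ark:89142[0:51]", the sketch yields "foo.ark:89142" as the data part and "0:51" as the range. Inputs such as "foo[]" or "[0:51]" are rejected, matching the non-empty checks in the listing.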