CompressedMatrix Class Reference

#include <compressed-matrix.h>


Classes

struct  GlobalHeader
 
struct  PerColHeader
 

Public Member Functions

 CompressedMatrix ()
 
 ~CompressedMatrix ()
 
template<typename Real >
 CompressedMatrix (const MatrixBase< Real > &mat, CompressionMethod method=kAutomaticMethod)
 
 CompressedMatrix (const CompressedMatrix &mat, const MatrixIndexT row_offset, const MatrixIndexT num_rows, const MatrixIndexT col_offset, const MatrixIndexT num_cols, bool allow_padding=false)
 Initializer that can be used to select part of an existing CompressedMatrix without un-compressing and re-compressing (note: unlike similar initializers for class Matrix, it doesn't point to the same memory location).
 
void * Data () const
 
template<typename Real >
void CopyFromMat (const MatrixBase< Real > &mat, CompressionMethod method=kAutomaticMethod)
 This will resize *this and copy the contents of mat to *this.
 
 CompressedMatrix (const CompressedMatrix &mat)
 
CompressedMatrix & operator= (const CompressedMatrix &mat)
 
template<typename Real >
CompressedMatrix & operator= (const MatrixBase< Real > &mat)
 
template<typename Real >
void CopyToMat (MatrixBase< Real > *mat, MatrixTransposeType trans=kNoTrans) const
 Copies contents to matrix.
 
void Write (std::ostream &os, bool binary) const
 
void Read (std::istream &is, bool binary)
 
MatrixIndexT NumRows () const
 Returns number of rows (or zero for empty matrix).
 
MatrixIndexT NumCols () const
 Returns number of columns (or zero for empty matrix).
 
template<typename Real >
void CopyRowToVec (MatrixIndexT row, VectorBase< Real > *v) const
 Copies row #row of the matrix into vector v.
 
template<typename Real >
void CopyColToVec (MatrixIndexT col, VectorBase< Real > *v) const
 Copies column #col of the matrix into vector v.
 
template<typename Real >
void CopyToMat (int32 row_offset, int32 column_offset, MatrixBase< Real > *dest) const
 Copies submatrix of compressed matrix into matrix dest.
 
void Swap (CompressedMatrix *other)
 
void Clear ()
 
void Scale (float alpha)
 Scales all elements of matrix by alpha.
 

Private Types

enum  DataFormat { kOneByteWithColHeaders = 1, kTwoByte = 2, kOneByte = 3 }
 

Static Private Member Functions

static void * AllocateData (int32 num_bytes)
 
template<typename Real >
static void ComputeGlobalHeader (const MatrixBase< Real > &mat, CompressionMethod method, GlobalHeader *header)
 
static MatrixIndexT DataSize (const GlobalHeader &header)
 
template<typename Real >
static void CompressColumn (const GlobalHeader &global_header, const Real *data, MatrixIndexT stride, int32 num_rows, PerColHeader *header, uint8 *byte_data)
 
template<typename Real >
static void ComputeColHeader (const GlobalHeader &global_header, const Real *data, MatrixIndexT stride, int32 num_rows, PerColHeader *header)
 
static uint16 FloatToUint16 (const GlobalHeader &global_header, float value)
 
static uint8 FloatToUint8 (const GlobalHeader &global_header, float value)
 
static float Uint16ToFloat (const GlobalHeader &global_header, uint16 value)
 
static uint8 FloatToChar (float p0, float p25, float p75, float p100, float value)
 
static float CharToFloat (float p0, float p25, float p75, float p100, uint8 value)
 

Private Attributes

void * data_
 

Friends

class Matrix< float >
 
class Matrix< double >
 

Detailed Description

Definition at line 91 of file compressed-matrix.h.

Member Enumeration Documentation

◆ DataFormat

enum DataFormat
private
Enumerator
kOneByteWithColHeaders 
kTwoByte 
kOneByte 

Definition at line 201 of file compressed-matrix.h.

Constructor & Destructor Documentation

◆ CompressedMatrix() [1/4]

CompressedMatrix ( )
inline

Definition at line 93 of file compressed-matrix.h.

Referenced by CompressedMatrix::CompressedMatrix(), and CompressedMatrix::Data().

93 : data_(NULL) { }

◆ ~CompressedMatrix()

~CompressedMatrix ( )
inline

Definition at line 95 of file compressed-matrix.h.

References CompressedMatrix::Clear().

◆ CompressedMatrix() [2/4]

CompressedMatrix ( const MatrixBase< Real > &  mat,
CompressionMethod  method = kAutomaticMethod 
)
inline explicit

Definition at line 98 of file compressed-matrix.h.

References CompressedMatrix::CompressedMatrix(), and CompressedMatrix::CopyFromMat().

99  :
100  data_(NULL) { CopyFromMat(mat, method); }

◆ CompressedMatrix() [3/4]

CompressedMatrix ( const CompressedMatrix &  mat,
const MatrixIndexT  row_offset,
const MatrixIndexT  num_rows,
const MatrixIndexT  col_offset,
const MatrixIndexT  num_cols,
bool  allow_padding = false 
)

Initializer that can be used to select part of an existing CompressedMatrix without un-compressing and re-compressing (note: unlike similar initializers for class Matrix, it doesn't point to the same memory location).

This creates a CompressedMatrix with the size (num_rows, num_cols) starting at (row_offset, col_offset).

If you specify allow_padding = true, it is permitted to have row_offset < 0 and row_offset + num_rows > mat.NumRows(), and the result will contain repeats of the first and last rows of 'mat' as necessary.

Definition at line 199 of file compressed-matrix.cc.

References CompressedMatrix::AllocateData(), CompressedMatrix::CopyToMat(), CompressedMatrix::Data(), CompressedMatrix::data_, CompressedMatrix::DataSize(), CompressedMatrix::GlobalHeader::format, rnnlm::i, rnnlm::j, KALDI_ASSERT, KALDI_COMPILE_TIME_ASSERT, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, CompressedMatrix::kTwoByte, kaldi::kTwoByteAuto, kaldi::kUndefined, CompressedMatrix::GlobalHeader::num_cols, CompressedMatrix::GlobalHeader::num_rows, CompressedMatrix::NumCols(), CompressedMatrix::NumRows(), and CompressedMatrix::Swap().

205  : data_(NULL) {
206  int32 old_num_rows = cmat.NumRows(), old_num_cols = cmat.NumCols();
207 
208  if (old_num_rows == 0) {
209  KALDI_ASSERT(num_rows == 0 && num_cols == 0);
210  // The empty matrix is stored as a zero pointer.
211  return;
212  }
213 
214  KALDI_ASSERT(row_offset < old_num_rows);
215  KALDI_ASSERT(col_offset < old_num_cols);
216  KALDI_ASSERT(row_offset >= 0 || allow_padding);
217  KALDI_ASSERT(col_offset >= 0);
218  KALDI_ASSERT(row_offset + num_rows <= old_num_rows || allow_padding);
219  KALDI_ASSERT(col_offset + num_cols <= old_num_cols);
220 
221  if (num_rows == 0 || num_cols == 0) { return; }
222 
223  bool padding_is_used = (row_offset < 0 ||
224  row_offset + num_rows > old_num_rows);
225 
226  GlobalHeader new_global_header;
227  KALDI_COMPILE_TIME_ASSERT(sizeof(new_global_header) == 20);
228 
229  GlobalHeader *old_global_header = reinterpret_cast<GlobalHeader*>(cmat.Data());
230 
231  new_global_header = *old_global_header;
232  new_global_header.num_cols = num_cols;
233  new_global_header.num_rows = num_rows;
234 
235  // We don't switch format from 1 -> 2 (in case of size reduction) yet; if this
236  // is needed, we will do this below by creating a temporary Matrix.
237  new_global_header.format = old_global_header->format;
238 
239  data_ = AllocateData(DataSize(new_global_header)); // allocate memory
240  *(reinterpret_cast<GlobalHeader*>(data_)) = new_global_header;
241 
242 
243  DataFormat format = static_cast<DataFormat>(old_global_header->format);
244  if (format == kOneByteWithColHeaders) {
245  PerColHeader *old_per_col_header =
246  reinterpret_cast<PerColHeader*>(old_global_header + 1);
247  uint8 *old_byte_data =
248  reinterpret_cast<uint8*>(old_per_col_header +
249  old_global_header->num_cols);
250  PerColHeader *new_per_col_header =
251  reinterpret_cast<PerColHeader*>(
252  reinterpret_cast<GlobalHeader*>(data_) + 1);
253 
254  memcpy(new_per_col_header, old_per_col_header + col_offset,
255  sizeof(PerColHeader) * num_cols);
256 
257  uint8 *new_byte_data =
258  reinterpret_cast<uint8*>(new_per_col_header + num_cols);
259  if (!padding_is_used) {
260  uint8 *old_start_of_subcol =
261  old_byte_data + row_offset + (col_offset * old_num_rows),
262  *new_start_of_col = new_byte_data;
263  for (int32 i = 0; i < num_cols; i++) {
264  memcpy(new_start_of_col, old_start_of_subcol, num_rows);
265  new_start_of_col += num_rows;
266  old_start_of_subcol += old_num_rows;
267  }
268  } else {
269  uint8 *old_start_of_col =
270  old_byte_data + (col_offset * old_num_rows),
271  *new_start_of_col = new_byte_data;
272  for (int32 i = 0; i < num_cols; i++) {
273 
274  for (int32 j = 0; j < num_rows; j++) {
275  int32 old_j = j + row_offset;
276  if (old_j < 0) old_j = 0;
277  else if (old_j >= old_num_rows) old_j = old_num_rows - 1;
278  new_start_of_col[j] = old_start_of_col[old_j];
279  }
280  new_start_of_col += num_rows;
281  old_start_of_col += old_num_rows;
282  }
283  }
284  } else if (format == kTwoByte) {
285  const uint16 *old_data =
286  reinterpret_cast<const uint16*>(old_global_header + 1);
287  uint16 *new_row_data =
288  reinterpret_cast<uint16*>(reinterpret_cast<GlobalHeader*>(data_) + 1);
289 
290  for (int32 row = 0; row < num_rows; row++) {
291  int32 old_row = row + row_offset;
292  // The next two lines are only relevant if padding_is_used.
293  if (old_row < 0) old_row = 0;
294  else if (old_row >= old_num_rows) old_row = old_num_rows - 1;
295  const uint16 *old_row_data =
296  old_data + col_offset + (old_num_cols * old_row);
297  memcpy(new_row_data, old_row_data, sizeof(uint16) * num_cols);
298  new_row_data += num_cols;
299  }
300  } else {
301  KALDI_ASSERT(format == kOneByte);
302  const uint8 *old_data =
303  reinterpret_cast<const uint8*>(old_global_header + 1);
304  uint8 *new_row_data =
305  reinterpret_cast<uint8*>(reinterpret_cast<GlobalHeader*>(data_) + 1);
306 
307  for (int32 row = 0; row < num_rows; row++) {
308  int32 old_row = row + row_offset;
309  // The next two lines are only relevant if padding_is_used.
310  if (old_row < 0) old_row = 0;
311  else if (old_row >= old_num_rows) old_row = old_num_rows - 1;
312  const uint8 *old_row_data =
313  old_data + col_offset + (old_num_cols * old_row);
314  memcpy(new_row_data, old_row_data, sizeof(uint8) * num_cols);
315  new_row_data += num_cols;
316  }
317  }
318 
319  if (num_rows < 8 && format == kOneByteWithColHeaders) {
320  // format was 1 but we want it to be 2 -> create a temporary
321  // Matrix (uncompress), re-compress, and swap.
322  // This gives us almost exact reconstruction while saving
323  // memory (the elements take more space but there will be
324  // no per-column headers).
325  Matrix<float> temp(this->NumRows(), this->NumCols(),
326  kUndefined);
327  this->CopyToMat(&temp);
328  CompressedMatrix temp_cmat(temp, kTwoByteAuto);
329  this->Swap(&temp_cmat);
330  }
331 }
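
The row handling under allow_padding can be pictured with a small standalone sketch. It is not Kaldi code: `ClampedSourceRow` and `SourceRows` are hypothetical helpers that mirror only the clamping step in the listing above, where requested rows before the first source row map to row 0 and rows past the end map to the last row.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Kaldi): maps row `new_row` of the
// extracted sub-matrix to the source row it is copied from.
inline int ClampedSourceRow(int new_row, int row_offset, int old_num_rows) {
  assert(old_num_rows > 0);
  int old_row = new_row + row_offset;
  if (old_row < 0) old_row = 0;                                  // repeat first row
  else if (old_row >= old_num_rows) old_row = old_num_rows - 1;  // repeat last row
  return old_row;
}

// Lists the source row used for each of the `num_rows` output rows.
inline std::vector<int> SourceRows(int row_offset, int num_rows,
                                   int old_num_rows) {
  assert(num_rows >= 0);
  std::vector<int> rows(num_rows);
  for (int r = 0; r < num_rows; r++)
    rows[r] = ClampedSourceRow(r, row_offset, old_num_rows);
  return rows;
}
```

For a 3-row source, `SourceRows(-2, 6, 3)` yields the source rows 0, 0, 0, 1, 2, 2: two repeats of the first row in front and one repeat of the last row at the end.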

◆ CompressedMatrix() [4/4]

CompressedMatrix ( const CompressedMatrix &  mat )

Definition at line 859 of file compressed-matrix.cc.

859  : data_(NULL) {
860  *this = mat; // use assignment operator.
861 }

Member Function Documentation

◆ AllocateData()

void * AllocateData ( int32  num_bytes)
static private

Definition at line 524 of file compressed-matrix.cc.

References KALDI_ASSERT, and KALDI_COMPILE_TIME_ASSERT.

Referenced by CompressedMatrix::CompressedMatrix(), CompressedMatrix::CopyFromMat(), CompressedMatrix::operator=(), and CompressedMatrix::Read().

524  {
525  KALDI_ASSERT(num_bytes > 0);
526  KALDI_COMPILE_TIME_ASSERT(sizeof(float) == 4);
527  // round size up to nearest number of floats.
528  return reinterpret_cast<void*>(new float[(num_bytes/3) + 4]);
529 }

◆ CharToFloat()

float CharToFloat ( float  p0,
float  p25,
float  p75,
float  p100,
uint8  value 
)
inline static private

Definition at line 490 of file compressed-matrix.cc.

Referenced by CompressedMatrix::CopyColToVec(), CompressedMatrix::CopyRowToVec(), and CompressedMatrix::CopyToMat().

492  {
493  if (value <= 64) {
494  return p0 + (p25 - p0) * value * (1/64.0);
495  } else if (value <= 192) {
496  return p25 + (p75 - p25) * (value - 64) * (1/128.0);
497  } else {
498  return p75 + (p100 - p75) * (value - 192) * (1/63.0);
499  }
500 }
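
The decoding above is piecewise linear between the four stored percentiles. The following standalone copy (the name `DecodeByte` is hypothetical, and the ordering assertion is an added assumption) may help when experimenting with the mapping outside Kaldi.

```cpp
#include <cassert>
#include <cstdint>

// Standalone copy of the piecewise-linear decoding shown above.
// Byte values 0, 64, 192 and 255 correspond to the column's 0th, 25th,
// 75th and 100th percentile values respectively.
inline float DecodeByte(float p0, float p25, float p75, float p100,
                        std::uint8_t value) {
  assert(p0 <= p25 && p25 <= p75 && p75 <= p100);  // assumed ordering
  if (value <= 64) {
    return p0 + (p25 - p0) * value * (1 / 64.0);
  } else if (value <= 192) {
    return p25 + (p75 - p25) * (value - 64) * (1 / 128.0);
  } else {
    return p75 + (p100 - p75) * (value - 192) * (1 / 63.0);
  }
}
```

With p0 = 0, p25 = 1, p75 = 3 and p100 = 4, byte 128 decodes to 2.0. Half of the byte range (values 64 to 192) is spent on the interquartile range, so values near the middle of the column's distribution are resolved more finely than the tails.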

◆ Clear()

void Clear ( )

Definition at line 852 of file compressed-matrix.cc.

References CompressedMatrix::data_.

Referenced by CompressedMatrix::operator=(), CompressedMatrix::Swap(), and CompressedMatrix::~CompressedMatrix().

852  {
853  if (data_ != NULL) {
854  delete [] static_cast<float*>(data_);
855  data_ = NULL;
856  }
857 }

◆ CompressColumn()

void CompressColumn ( const GlobalHeader &  global_header,
const Real *  data,
MatrixIndexT  stride,
int32  num_rows,
CompressedMatrix::PerColHeader *  header,
uint8 *  byte_data 
)
static private

Definition at line 504 of file compressed-matrix.cc.

References CompressedMatrix::ComputeColHeader(), CompressedMatrix::FloatToChar(), rnnlm::i, CompressedMatrix::PerColHeader::percentile_0, CompressedMatrix::PerColHeader::percentile_100, CompressedMatrix::PerColHeader::percentile_25, CompressedMatrix::PerColHeader::percentile_75, and CompressedMatrix::Uint16ToFloat().

Referenced by CompressedMatrix::CopyFromMat().

508  {
509  ComputeColHeader(global_header, data, stride,
510  num_rows, header);
511 
512  float p0 = Uint16ToFloat(global_header, header->percentile_0),
513  p25 = Uint16ToFloat(global_header, header->percentile_25),
514  p75 = Uint16ToFloat(global_header, header->percentile_75),
515  p100 = Uint16ToFloat(global_header, header->percentile_100);
516 
517  for (int32 i = 0; i < num_rows; i++) {
518  Real this_data = data[i * stride];
519  byte_data[i] = FloatToChar(p0, p25, p75, p100, this_data);
520  }
521 }

◆ ComputeColHeader()

void ComputeColHeader ( const GlobalHeader &  global_header,
const Real *  data,
MatrixIndexT  stride,
int32  num_rows,
CompressedMatrix::PerColHeader *  header 
)
static private

Definition at line 380 of file compressed-matrix.cc.

References CompressedMatrix::FloatToUint16(), rnnlm::i, KALDI_ASSERT, CompressedMatrix::PerColHeader::percentile_0, CompressedMatrix::PerColHeader::percentile_100, CompressedMatrix::PerColHeader::percentile_25, and CompressedMatrix::PerColHeader::percentile_75.

Referenced by CompressedMatrix::CompressColumn().

383  {
384  KALDI_ASSERT(num_rows > 0);
385  std::vector<Real> sdata(num_rows); // the sorted data.
386  for (size_t i = 0, size = sdata.size(); i < size; i++)
387  sdata[i] = data[i*stride];
388 
389  if (num_rows >= 5) {
390  int quarter_nr = num_rows/4;
391  // std::sort(sdata.begin(), sdata.end());
392  // The elements at positions 0, quarter_nr,
393  // 3*quarter_nr, and num_rows-1 need to be in sorted order.
394  std::nth_element(sdata.begin(), sdata.begin() + quarter_nr, sdata.end());
395  // Now, sdata.begin() + quarter_nr contains the element that would appear
396  // in sorted order, in that position.
397  std::nth_element(sdata.begin(), sdata.begin(), sdata.begin() + quarter_nr);
398  // Now, sdata.begin() and sdata.begin() + quarter_nr contain the elements
399  // that would appear at those positions in sorted order.
400  std::nth_element(sdata.begin() + quarter_nr + 1,
401  sdata.begin() + (3*quarter_nr), sdata.end());
402  // Now, sdata.begin(), sdata.begin() + quarter_nr, and sdata.begin() +
403  // 3*quarter_nr, contain the elements that would appear at those positions
404  // in sorted order.
405  std::nth_element(sdata.begin() + (3*quarter_nr) + 1, sdata.end() - 1,
406  sdata.end());
407  // Now, sdata.begin(), sdata.begin() + quarter_nr, and sdata.begin() +
408  // 3*quarter_nr, and sdata.end() - 1, contain the elements that would appear
409  // at those positions in sorted order.
410 
411  header->percentile_0 =
412  std::min<uint16>(FloatToUint16(global_header, sdata[0]), 65532);
413  header->percentile_25 =
414  std::min<uint16>(
415  std::max<uint16>(
416  FloatToUint16(global_header, sdata[quarter_nr]),
417  header->percentile_0 + static_cast<uint16>(1)), 65533);
418  header->percentile_75 =
419  std::min<uint16>(
420  std::max<uint16>(
421  FloatToUint16(global_header, sdata[3*quarter_nr]),
422  header->percentile_25 + static_cast<uint16>(1)), 65534);
423  header->percentile_100 = std::max<uint16>(
424  FloatToUint16(global_header, sdata[num_rows-1]),
425  header->percentile_75 + static_cast<uint16>(1));
426 
427  } else { // handle this pathological case.
428  std::sort(sdata.begin(), sdata.end());
429  // Note: we know num_rows is at least 1.
430  header->percentile_0 =
431  std::min<uint16>(FloatToUint16(global_header, sdata[0]),
432  65532);
433  if (num_rows > 1)
434  header->percentile_25 =
435  std::min<uint16>(
436  std::max<uint16>(FloatToUint16(global_header, sdata[1]),
437  header->percentile_0 + 1), 65533);
438  else
439  header->percentile_25 = header->percentile_0 + 1;
440  if (num_rows > 2)
441  header->percentile_75 =
442  std::min<uint16>(
443  std::max<uint16>(FloatToUint16(global_header, sdata[2]),
444  header->percentile_25 + 1), 65534);
445  else
446  header->percentile_75 = header->percentile_25 + 1;
447  if (num_rows > 3)
448  header->percentile_100 =
449  std::max<uint16>(FloatToUint16(global_header, sdata[3]),
450  header->percentile_75 + 1);
451  else
452  header->percentile_100 = header->percentile_75 + 1;
453  }
454 }
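
The listing above avoids a full sort: it only needs the values that sorting would place at positions 0, num_rows/4, 3*(num_rows/4) and num_rows-1. A minimal standalone sketch of that selection idea follows. The names `Quartiles` and `SelectQuartiles` are hypothetical, it covers only the num_rows >= 5 branch, and it uses std::min_element and std::max_element for the two end positions where the listing uses further std::nth_element calls.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch (not a Kaldi type).
struct Quartiles { float p0, p25, p75, p100; };

// Returns the values a full sort would place at positions
// 0, n/4, 3*(n/4) and n-1. `data` is taken by value because it is reordered.
inline Quartiles SelectQuartiles(std::vector<float> data) {
  assert(data.size() >= 5);
  const std::size_t q = data.size() / 4;
  // Put the q-th smallest element in place; smaller-or-equal values end up
  // before it and larger-or-equal values after it.
  std::nth_element(data.begin(), data.begin() + q, data.end());
  // Within the upper part, put the element for sorted position 3*q in place.
  std::nth_element(data.begin() + q + 1, data.begin() + 3 * q, data.end());
  Quartiles out;
  out.p25 = data[q];
  out.p75 = data[3 * q];
  out.p0 = *std::min_element(data.begin(), data.begin() + q + 1);
  out.p100 = *std::max_element(data.begin() + 3 * q, data.end());
  return out;
}
```

For the eight values 9, 1, 8, 2, 7, 3, 6, 4 the sorted order is 1, 2, 3, 4, 6, 7, 8, 9 and q is 2, so the sketch selects 1, 3, 8 and 9.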

◆ ComputeGlobalHeader()

void ComputeGlobalHeader ( const MatrixBase< Real > &  mat,
CompressionMethod  method,
GlobalHeader *  header 
)
inline static private

Definition at line 57 of file compressed-matrix.cc.

References CompressedMatrix::GlobalHeader::format, KALDI_ASSERT, KALDI_COMPILE_TIME_ASSERT, KALDI_ERR, kaldi::kAutomaticMethod, CompressedMatrix::kOneByte, kaldi::kOneByteAuto, kaldi::kOneByteUnsignedInteger, CompressedMatrix::kOneByteWithColHeaders, kaldi::kOneByteZeroOne, kaldi::kSpeechFeature, CompressedMatrix::kTwoByte, kaldi::kTwoByteAuto, kaldi::kTwoByteSignedInteger, MatrixBase< Real >::Max(), MatrixBase< Real >::Min(), CompressedMatrix::GlobalHeader::min_value, CompressedMatrix::GlobalHeader::num_cols, CompressedMatrix::GlobalHeader::num_rows, MatrixBase< Real >::NumCols(), MatrixBase< Real >::NumRows(), and CompressedMatrix::GlobalHeader::range.

Referenced by CompressedMatrix::CopyFromMat().

59  {
60  if (method == kAutomaticMethod) {
61  if (mat.NumRows() > 8) method = kSpeechFeature;
62  else method = kTwoByteAuto;
63  }
64 
65  switch (method) {
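66  case kSpeechFeature:
67  header->format = static_cast<int32>(kOneByteWithColHeaders); // 1.
68  break;
69  case kTwoByteAuto: case kTwoByteSignedInteger:
70  header->format = static_cast<int32>(kTwoByte); // 2.
71  break;
72  case kOneByteAuto: case kOneByteUnsignedInteger: case kOneByteZeroOne:
73  header->format = static_cast<int32>(kOneByte); // 3.
74  break;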
75  default:
76  KALDI_ERR << "Invalid compression type: "
77  << static_cast<int32>(method);
78  }
79 
80  header->num_rows = mat.NumRows();
81  header->num_cols = mat.NumCols();
82 
83  // Now compute 'min_value' and 'range'.
84  switch (method) {
85  case kSpeechFeature: case kTwoByteAuto: case kOneByteAuto: {
86  float min_value = mat.Min(), max_value = mat.Max();
87  // ensure that max_value is strictly greater than min_value, even if matrix is
88  // constant; this avoids crashes in ComputeColHeader when compressing speech
89  // features.
90  if (max_value == min_value)
91  max_value = min_value + (1.0 + fabs(min_value));
92  KALDI_ASSERT(min_value - min_value == 0 &&
93  max_value - max_value == 0 &&
94  "Cannot compress a matrix with Nan's or Inf's");
95 
96  header->min_value = min_value;
97  header->range = max_value - min_value;
98 
99  // we previously checked that max_value != min_value, so their
100  // difference should be nonzero.
101  KALDI_ASSERT(header->range > 0.0);
102  break;
103  }
104  case kTwoByteSignedInteger: {
105  header->min_value = -32768.0;
106  header->range = 65535.0;
107  break;
108  }
109  case kOneByteUnsignedInteger: {
110  header->min_value = 0.0;
111  header->range = 255.0;
112  break;
113  }
114  case kOneByteZeroOne: {
115  header->min_value = 0.0;
116  header->range = 1.0;
117  break;
118  }
119  default:
120  KALDI_ERR << "Unknown compression method = "
121  << static_cast<int32>(method);
122  }
123  KALDI_COMPILE_TIME_ASSERT(sizeof(*header) == 20); // otherwise
124  // something weird is happening and our code probably won't work or
125  // won't be robust across platforms.
126 }
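
For the automatic methods, the listing above derives min_value and range from the data and widens the range when the matrix is constant. A standalone sketch of just that step is given below. `ValueRange` and `AutoRange` are hypothetical names, the input is a flat vector rather than a MatrixBase, and std::isfinite stands in for the listing's `x - x == 0` check.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type for this sketch (not a Kaldi type).
struct ValueRange { float min_value, range; };

inline ValueRange AutoRange(const std::vector<float> &values) {
  assert(!values.empty());
  auto mm = std::minmax_element(values.begin(), values.end());
  float min_value = *mm.first, max_value = *mm.second;
  // Keep the range strictly positive even for constant input.
  if (max_value == min_value)
    max_value = min_value + (1.0f + std::fabs(min_value));
  // NaN or Inf cannot be compressed.
  assert(std::isfinite(min_value) && std::isfinite(max_value));
  return ValueRange{min_value, max_value - min_value};
}
```

A constant input of -3 gives min_value = -3 and range = 4, because the upper bound is moved to -3 + (1 + 3) = 1.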

◆ CopyColToVec()

template void CopyColToVec ( MatrixIndexT  col,
VectorBase< Real > *  v 
) const

Copies column #col of the matrix into vector v.

Note: v must have same size as #rows.

Definition at line 724 of file compressed-matrix.cc.

References CompressedMatrix::CharToFloat(), CompressedMatrix::CopyRowToVec(), VectorBase< Real >::Data(), CompressedMatrix::data_, VectorBase< Real >::Dim(), rnnlm::i, KALDI_ASSERT, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, CompressedMatrix::kTwoByte, CompressedMatrix::NumCols(), CompressedMatrix::NumRows(), CompressedMatrix::PerColHeader::percentile_0, CompressedMatrix::PerColHeader::percentile_100, CompressedMatrix::PerColHeader::percentile_25, CompressedMatrix::PerColHeader::percentile_75, and CompressedMatrix::Uint16ToFloat().

Referenced by CompressedMatrix::NumCols(), and kaldi::UnitTestCompressedMatrix().

725  {
726  KALDI_ASSERT(col < this->NumCols());
727  KALDI_ASSERT(col >= 0);
728  KALDI_ASSERT(v->Dim() == this->NumRows());
729 
730  GlobalHeader *h = reinterpret_cast<GlobalHeader*>(data_);
731 
732  DataFormat format = static_cast<DataFormat>(h->format);
733  if (format == kOneByteWithColHeaders) {
734  PerColHeader *per_col_header = reinterpret_cast<PerColHeader*>(h+1);
735  uint8 *byte_data = reinterpret_cast<uint8*>(per_col_header +
736  h->num_cols);
737  byte_data += col*h->num_rows; // point to first value in the column we want
738  per_col_header += col;
739  float p0 = Uint16ToFloat(*h, per_col_header->percentile_0),
740  p25 = Uint16ToFloat(*h, per_col_header->percentile_25),
741  p75 = Uint16ToFloat(*h, per_col_header->percentile_75),
742  p100 = Uint16ToFloat(*h, per_col_header->percentile_100);
743  for (int32 i = 0; i < h->num_rows; i++, byte_data++) {
744  float f = CharToFloat(p0, p25, p75, p100, *byte_data);
745  (*v)(i) = f;
746  }
747  } else if (format == kTwoByte) {
748  int32 num_rows = h->num_rows, num_cols = h->num_cols;
749  float min_value = h->min_value,
750  increment = h->range * (1.0 / 65535.0);
751  const uint16 *col_data = reinterpret_cast<uint16*>(h + 1) + col;
752  Real *v_data = v->Data();
753  for (int32 r = 0; r < num_rows; r++)
754  v_data[r] = min_value + increment * col_data[r * num_cols];
755  } else {
756  KALDI_ASSERT(format == kOneByte);
757  int32 num_rows = h->num_rows, num_cols = h->num_cols;
758  float min_value = h->min_value,
759  increment = h->range * (1.0 / 255.0);
760  const uint8 *col_data = reinterpret_cast<uint8*>(h + 1) + col;
761  Real *v_data = v->Data();
762  for (int32 r = 0; r < num_rows; r++)
763  v_data[r] = min_value + increment * col_data[r * num_cols];
764  }
765 }
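
In the kTwoByte branch above, the values are stored row-major, so a column is read with a stride of num_cols and each code is mapped back linearly to min_value + code * range / 65535. The sketch below reproduces that on a plain std::vector. `DecodeColumn16` is a hypothetical name and the buffer is assumed to hold only the 16-bit codes, without the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes column `col` of a row-major buffer of 16-bit codes.
inline std::vector<float> DecodeColumn16(
    const std::vector<std::uint16_t> &data, int num_rows, int num_cols,
    int col, float min_value, float range) {
  assert(num_rows >= 0 && col >= 0 && col < num_cols);
  assert(data.size() == static_cast<std::size_t>(num_rows) * num_cols);
  const float increment = range * (1.0 / 65535.0);
  std::vector<float> out(num_rows);
  for (int r = 0; r < num_rows; r++)
    out[r] = min_value + increment * data[r * num_cols + col];  // stride num_cols
  return out;
}
```

Code 0 decodes exactly to min_value, and code 65535 decodes to approximately min_value + range, up to float rounding.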

◆ CopyFromMat()

template void CopyFromMat ( const MatrixBase< Real > &  mat,
CompressionMethod  method = kAutomaticMethod 
)

This will resize *this and copy the contents of mat to *this.

Definition at line 129 of file compressed-matrix.cc.

References CompressedMatrix::AllocateData(), CompressedMatrix::CompressColumn(), CompressedMatrix::ComputeGlobalHeader(), MatrixBase< Real >::Data(), CompressedMatrix::data_, CompressedMatrix::DataSize(), CompressedMatrix::FloatToUint16(), CompressedMatrix::FloatToUint8(), CompressedMatrix::GlobalHeader::format, KALDI_ASSERT, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, CompressedMatrix::kTwoByte, CompressedMatrix::GlobalHeader::num_cols, CompressedMatrix::GlobalHeader::num_rows, MatrixBase< Real >::NumCols(), MatrixBase< Real >::NumRows(), MatrixBase< Real >::RowData(), and MatrixBase< Real >::Stride().

Referenced by CompressedMatrix::CompressedMatrix(), CompressedAffineXformStats::CopyFromAffineXformStats(), CompressedMatrix::Data(), CompressedMatrix::operator=(), and CompressedMatrix::Read().

130  {
131  if (data_ != NULL) {
132  delete [] static_cast<float*>(data_); // call delete [] because was allocated with new float[]
133  data_ = NULL;
134  }
135  if (mat.NumRows() == 0) { return; } // Zero-size matrix stored as zero pointer.
136 
137 
138  GlobalHeader global_header;
139  ComputeGlobalHeader(mat, method, &global_header);
140 
141  int32 data_size = DataSize(global_header);
142 
143  data_ = AllocateData(data_size);
144 
145  *(reinterpret_cast<GlobalHeader*>(data_)) = global_header;
146 
147  DataFormat format = static_cast<DataFormat>(global_header.format);
148  if (format == kOneByteWithColHeaders) {
149  PerColHeader *header_data =
150  reinterpret_cast<PerColHeader*>(static_cast<char*>(data_) +
151  sizeof(GlobalHeader));
152  uint8 *byte_data =
153  reinterpret_cast<uint8*>(header_data + global_header.num_cols);
154 
155  const Real *matrix_data = mat.Data();
156 
157  for (int32 col = 0; col < global_header.num_cols; col++) {
158  CompressColumn(global_header,
159  matrix_data + col, mat.Stride(),
160  global_header.num_rows,
161  header_data, byte_data);
162  header_data++;
163  byte_data += global_header.num_rows;
164  }
165  } else if (format == kTwoByte) {
166  uint16 *data = reinterpret_cast<uint16*>(static_cast<char*>(data_) +
167  sizeof(GlobalHeader));
168  int32 num_rows = mat.NumRows(), num_cols = mat.NumCols();
169  for (int32 r = 0; r < num_rows; r++) {
170  const Real *row_data = mat.RowData(r);
171  for (int32 c = 0; c < num_cols; c++)
172  data[c] = FloatToUint16(global_header, row_data[c]);
173  data += num_cols;
174  }
175  } else {
176  KALDI_ASSERT(format == kOneByte);
177  uint8 *data = reinterpret_cast<uint8*>(static_cast<char*>(data_) +
178  sizeof(GlobalHeader));
179  int32 num_rows = mat.NumRows(), num_cols = mat.NumCols();
180  for (int32 r = 0; r < num_rows; r++) {
181  const Real *row_data = mat.RowData(r);
182  for (int32 c = 0; c < num_cols; c++)
183  data[c] = FloatToUint8(global_header, row_data[c]);
184  data += num_cols;
185  }
186  }
187 }
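
The pointer arithmetic above implies a buffer layout per format: a global header, then (for kOneByteWithColHeaders only) one header per column, then the element codes. The sketch below adds up the sizes under two stated assumptions: the global header is 20 bytes (as asserted in ComputeGlobalHeader), and each per-column header is 8 bytes (four 16-bit percentiles, which this page does not state explicitly). `SketchFormat` and `SketchDataSize` are hypothetical names; this is not the DataSize() implementation.

```cpp
#include <cassert>
#include <cstddef>

// Format codes with the same numeric values as the DataFormat enum.
enum SketchFormat {
  kSketchOneByteWithColHeaders = 1,
  kSketchTwoByte = 2,
  kSketchOneByte = 3
};

// Estimated buffer size in bytes for a num_rows x num_cols matrix.
inline std::size_t SketchDataSize(SketchFormat format, int num_rows,
                                  int num_cols) {
  assert(num_rows > 0 && num_cols > 0);
  const std::size_t kGlobalHeaderBytes = 20;  // asserted in the listings
  const std::size_t kPerColHeaderBytes = 8;   // assumption: 4 x uint16
  const std::size_t n = static_cast<std::size_t>(num_rows) * num_cols;
  switch (format) {
    case kSketchOneByteWithColHeaders:
      return kGlobalHeaderBytes + kPerColHeaderBytes * num_cols + n;
    case kSketchTwoByte:
      return kGlobalHeaderBytes + 2 * n;
    default:  // kSketchOneByte
      return kGlobalHeaderBytes + n;
  }
}
```

Under these assumptions a 10 x 3 matrix takes 74 bytes with column headers, 80 bytes in the two-byte format and 50 bytes in the one-byte format, which illustrates why the per-column headers only pay off once there are enough rows.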

◆ CopyRowToVec()

template void CopyRowToVec ( MatrixIndexT  row,
VectorBase< Real > *  v 
) const

Copies row #row of the matrix into vector v.

Note: v must have same size as #cols.

Definition at line 681 of file compressed-matrix.cc.

References CompressedMatrix::CharToFloat(), VectorBase< Real >::Data(), CompressedMatrix::data_, VectorBase< Real >::Dim(), rnnlm::i, KALDI_ASSERT, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, CompressedMatrix::kTwoByte, CompressedMatrix::NumCols(), CompressedMatrix::NumRows(), and CompressedMatrix::Uint16ToFloat().

Referenced by CompressedMatrix::CopyColToVec(), kaldi::FilterCompressedMatrixRows(), CompressedMatrix::NumCols(), and kaldi::UnitTestCompressedMatrix().

682  {
683  KALDI_ASSERT(row < this->NumRows());
684  KALDI_ASSERT(row >= 0);
685  KALDI_ASSERT(v->Dim() == this->NumCols());
686 
687  GlobalHeader *h = reinterpret_cast<GlobalHeader*>(data_);
688  DataFormat format = static_cast<DataFormat>(h->format);
689  if (format == kOneByteWithColHeaders) {
690  PerColHeader *per_col_header = reinterpret_cast<PerColHeader*>(h+1);
691  uint8 *byte_data = reinterpret_cast<uint8*>(per_col_header +
692  h->num_cols);
693  byte_data += row; // point to first value we are interested in
694  for (int32 i = 0; i < h->num_cols;
695  i++, per_col_header++, byte_data += h->num_rows) {
696  float p0 = Uint16ToFloat(*h, per_col_header->percentile_0),
697  p25 = Uint16ToFloat(*h, per_col_header->percentile_25),
698  p75 = Uint16ToFloat(*h, per_col_header->percentile_75),
699  p100 = Uint16ToFloat(*h, per_col_header->percentile_100);
700  float f = CharToFloat(p0, p25, p75, p100, *byte_data);
701  (*v)(i) = f;
702  }
703  } else if (format == kTwoByte) {
704  int32 num_cols = h->num_cols;
705  float min_value = h->min_value,
706  increment = h->range * (1.0 / 65535.0);
707  const uint16 *row_data = reinterpret_cast<uint16*>(h + 1) + (num_cols * row);
708  Real *v_data = v->Data();
709  for (int32 c = 0; c < num_cols; c++)
710  v_data[c] = min_value + row_data[c] * increment;
711  } else {
712  KALDI_ASSERT(format == kOneByte);
713  int32 num_cols = h->num_cols;
714  float min_value = h->min_value,
715  increment = h->range * (1.0 / 255.0);
716  const uint8 *row_data = reinterpret_cast<uint8*>(h + 1) + (num_cols * row);
717  Real *v_data = v->Data();
718  for (int32 c = 0; c < num_cols; c++)
719  v_data[c] = min_value + row_data[c] * increment;
720  }
721 }
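
In the kOneByteWithColHeaders branch above, the byte codes are stored column by column, so reading one row means starting at offset `row` and stepping by num_rows. The sketch below shows that indexing on a plain byte vector. `RowOfColumnMajor` is a hypothetical name and the per-column decoding step is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Gathers the codes of row `row` from a column-major byte buffer.
inline std::vector<std::uint8_t> RowOfColumnMajor(
    const std::vector<std::uint8_t> &byte_data, int num_rows, int num_cols,
    int row) {
  assert(num_cols >= 0 && row >= 0 && row < num_rows);
  assert(byte_data.size() == static_cast<std::size_t>(num_rows) * num_cols);
  std::vector<std::uint8_t> out(num_cols);
  for (int c = 0; c < num_cols; c++)
    out[c] = byte_data[c * num_rows + row];  // column c starts at c * num_rows
  return out;
}
```

With two rows and three columns stored as {1, 2, 3, 4, 5, 6}, the columns are {1, 2}, {3, 4} and {5, 6}, so row 1 holds the codes 2, 4, 6.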

◆ CopyToMat() [1/2]

void CopyToMat ( MatrixBase< Real > *  mat,
MatrixTransposeType  trans = kNoTrans 
) const

Copies contents to matrix.

Note: mat must have the correct size. The kTrans case uses a temporary.

Definition at line 614 of file compressed-matrix.cc.

References CompressedMatrix::CharToFloat(), MatrixBase< Real >::CopyFromMat(), CompressedMatrix::data_, CompressedMatrix::GlobalHeader::format, rnnlm::i, rnnlm::j, KALDI_ASSERT, kaldi::kNoTrans, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, kaldi::kTrans, CompressedMatrix::kTwoByte, CompressedMatrix::GlobalHeader::min_value, CompressedMatrix::GlobalHeader::num_cols, CompressedMatrix::GlobalHeader::num_rows, MatrixBase< Real >::NumCols(), CompressedMatrix::NumCols(), MatrixBase< Real >::NumRows(), CompressedMatrix::NumRows(), CompressedMatrix::PerColHeader::percentile_0, CompressedMatrix::PerColHeader::percentile_100, CompressedMatrix::PerColHeader::percentile_25, CompressedMatrix::PerColHeader::percentile_75, CompressedMatrix::GlobalHeader::range, MatrixBase< Real >::RowData(), and CompressedMatrix::Uint16ToFloat().

Referenced by CompressedMatrix::CompressedMatrix(), MatrixBase< float >::CopyFromMat(), CompressedAffineXformStats::CopyToAffineXformStats(), CompressedMatrix::CopyToMat(), CompressedMatrix::Data(), kaldi::ExtractObjectRange(), kaldi::FilterCompressedMatrixRows(), Matrix< BaseFloat >::Matrix(), CompressedMatrix::NumCols(), Matrix< BaseFloat >::Read(), kaldi::UnitTestCompressedMatrix(), kaldi::UnitTestExtractCompressedMatrix(), and CompressedMatrix::Write().

615  {
616  if (trans == kTrans) {
617  Matrix<Real> temp(this->NumCols(), this->NumRows());
618  CopyToMat(&temp, kNoTrans);
619  mat->CopyFromMat(temp, kTrans);
620  return;
621  }
622 
623  if (data_ == NULL) {
624  KALDI_ASSERT(mat->NumRows() == 0);
625  KALDI_ASSERT(mat->NumCols() == 0);
626  return;
627  }
628  GlobalHeader *h = reinterpret_cast<GlobalHeader*>(data_);
629  int32 num_cols = h->num_cols, num_rows = h->num_rows;
630  KALDI_ASSERT(mat->NumRows() == num_rows);
631  KALDI_ASSERT(mat->NumCols() == num_cols);
632 
633  DataFormat format = static_cast<DataFormat>(h->format);
634  if (format == kOneByteWithColHeaders) {
635  PerColHeader *per_col_header = reinterpret_cast<PerColHeader*>(h+1);
636  uint8 *byte_data = reinterpret_cast<uint8*>(per_col_header +
637  h->num_cols);
638  for (int32 i = 0; i < num_cols; i++, per_col_header++) {
639  float p0 = Uint16ToFloat(*h, per_col_header->percentile_0),
640  p25 = Uint16ToFloat(*h, per_col_header->percentile_25),
641  p75 = Uint16ToFloat(*h, per_col_header->percentile_75),
642  p100 = Uint16ToFloat(*h, per_col_header->percentile_100);
643  for (int32 j = 0; j < num_rows; j++, byte_data++) {
644  float f = CharToFloat(p0, p25, p75, p100, *byte_data);
645  (*mat)(j, i) = f;
646  }
647  }
648  } else if (format == kTwoByte) {
649  const uint16 *data = reinterpret_cast<const uint16*>(h + 1);
650  float min_value = h->min_value,
651  increment = h->range * (1.0 / 65535.0);
652  for (int32 i = 0; i < num_rows; i++) {
653  Real *row_data = mat->RowData(i);
654  for (int32 j = 0; j < num_cols; j++)
655  row_data[j] = min_value + data[j] * increment;
656  data += num_cols;
657  }
658  } else {
659  KALDI_ASSERT(format == kOneByte);
660  float min_value = h->min_value, increment = h->range * (1.0 / 255.0);
661 
662  const uint8 *data = reinterpret_cast<const uint8*>(h + 1);
663  for (int32 i = 0; i < num_rows; i++) {
664  Real *row_data = mat->RowData(i);
665  for (int32 j = 0; j < num_cols; j++)
666  row_data[j] = min_value + data[j] * increment;
667  data += num_cols;
668  }
669  }
670 }
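The kOneByte branch above can be sketched outside Kaldi as follows. This is a minimal illustration under stated assumptions, not the library's code: `DecodeHeader`, `DecodeOneByte`, and `Transpose` are hypothetical names, and a flat row-major `std::vector` stands in for `MatrixBase`. The transposed case is modelled by decoding into a temporary that has the compressed matrix's own shape and then transposing it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the global header fields used when decoding.
struct DecodeHeader {
  float min_value;
  float range;
  int num_rows;
  int num_cols;
};

// Decodes a row-major one-byte payload:
// value = min_value + code * (range / 255).
std::vector<float> DecodeOneByte(const DecodeHeader &h,
                                 const std::vector<std::uint8_t> &codes) {
  assert(codes.size() ==
         static_cast<std::size_t>(h.num_rows) * static_cast<std::size_t>(h.num_cols));
  const float increment = h.range * (1.0f / 255.0f);
  std::vector<float> out(codes.size());
  for (std::size_t i = 0; i < codes.size(); i++)
    out[i] = h.min_value + codes[i] * increment;
  return out;
}

// Transposes a row-major num_rows x num_cols matrix into a
// row-major num_cols x num_rows matrix.
std::vector<float> Transpose(const std::vector<float> &m,
                             int num_rows, int num_cols) {
  assert(m.size() ==
         static_cast<std::size_t>(num_rows) * static_cast<std::size_t>(num_cols));
  std::vector<float> t(m.size());
  for (int r = 0; r < num_rows; r++)
    for (int c = 0; c < num_cols; c++)
      t[static_cast<std::size_t>(c) * num_rows + r] =
          m[static_cast<std::size_t>(r) * num_cols + c];
  return t;
}
```

The kTwoByte branch follows the same pattern with 16-bit codes and a divisor of 65535.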

◆ CopyToMat() [2/2]

void CopyToMat ( int32  row_offset,
int32  column_offset,
MatrixBase< Real > *  dest 
) const

Copies a submatrix of the compressed matrix into the matrix dest.

The submatrix starts at row row_offset and column column_offset; its size is given by the size of the provided matrix dest.

Definition at line 778 of file compressed-matrix.cc.

References CompressedMatrix::CharToFloat(), CompressedMatrix::CopyToMat(), CompressedMatrix::data_, rnnlm::i, rnnlm::j, KALDI_ASSERT, KALDI_PARANOID_ASSERT, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, CompressedMatrix::kTwoByte, CompressedMatrix::GlobalHeader::num_rows, MatrixBase< Real >::NumCols(), CompressedMatrix::NumCols(), MatrixBase< Real >::NumRows(), CompressedMatrix::NumRows(), MatrixBase< Real >::RowData(), and CompressedMatrix::Uint16ToFloat().

780  {
781  KALDI_PARANOID_ASSERT(row_offset < this->NumRows());
782  KALDI_PARANOID_ASSERT(col_offset < this->NumCols());
783  KALDI_PARANOID_ASSERT(row_offset >= 0);
784  KALDI_PARANOID_ASSERT(col_offset >= 0);
785  KALDI_ASSERT(row_offset+dest->NumRows() <= this->NumRows());
786  KALDI_ASSERT(col_offset+dest->NumCols() <= this->NumCols());
787  // everything is OK
788  GlobalHeader *h = reinterpret_cast<GlobalHeader*>(data_);
789  int32 num_rows = h->num_rows, num_cols = h->num_cols,
790  tgt_cols = dest->NumCols(), tgt_rows = dest->NumRows();
791 
792  DataFormat format = static_cast<DataFormat>(h->format);
793  if (format == kOneByteWithColHeaders) {
794  PerColHeader *per_col_header = reinterpret_cast<PerColHeader*>(h+1);
795  uint8 *byte_data = reinterpret_cast<uint8*>(per_col_header +
796  h->num_cols);
797 
798  uint8 *start_of_subcol = byte_data+row_offset; // skip appropriate
799  // number of rows
800  start_of_subcol += col_offset*num_rows; // skip appropriate number of columns
801 
802  per_col_header += col_offset; // skip the appropriate number of headers
803 
804  for (int32 i = 0;
805  i < tgt_cols;
806  i++, per_col_header++, start_of_subcol+=num_rows) {
807  byte_data = start_of_subcol;
808  float p0 = Uint16ToFloat(*h, per_col_header->percentile_0),
809  p25 = Uint16ToFloat(*h, per_col_header->percentile_25),
810  p75 = Uint16ToFloat(*h, per_col_header->percentile_75),
811  p100 = Uint16ToFloat(*h, per_col_header->percentile_100);
812  for (int32 j = 0; j < tgt_rows; j++, byte_data++) {
813  float f = CharToFloat(p0, p25, p75, p100, *byte_data);
814  (*dest)(j, i) = f;
815  }
816  }
817  } else if (format == kTwoByte) {
818  const uint16 *data = reinterpret_cast<const uint16*>(h+1) + col_offset +
819  (num_cols * row_offset);
820  float min_value = h->min_value,
821  increment = h->range * (1.0 / 65535.0);
822 
823  for (int32 row = 0; row < tgt_rows; row++) {
824  Real *dest_row = dest->RowData(row);
825  for (int32 col = 0; col < tgt_cols; col++)
826  dest_row[col] = min_value + increment * data[col];
827  data += num_cols;
828  }
829  } else {
830  KALDI_ASSERT(format == kOneByte);
831  const uint8 *data = reinterpret_cast<const uint8*>(h+1) + col_offset +
832  (num_cols * row_offset);
833  float min_value = h->min_value,
834  increment = h->range * (1.0 / 255.0);
835  for (int32 row = 0; row < tgt_rows; row++) {
836  Real *dest_row = dest->RowData(row);
837  for (int32 col = 0; col < tgt_cols; col++)
838  dest_row[col] = min_value + increment * data[col];
839  data += num_cols;
840  }
841  }
842 }
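The pointer arithmetic in the listing above depends on the payload layout: the kOneByteWithColHeaders branch walks a column-major payload, while the kTwoByte and kOneByte branches walk a row-major one. The sketch below restates that arithmetic as plain index computations. `RowMajorIndex` and `ColMajorIndex` are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Row-major payload (kTwoByte and kOneByte branches): index of element
// (r, c) of the submatrix within the full num_rows x num_cols payload.
std::size_t RowMajorIndex(int num_cols, int row_offset, int col_offset,
                          int r, int c) {
  assert(num_cols > 0 && row_offset >= 0 && col_offset >= 0 && r >= 0 && c >= 0);
  return static_cast<std::size_t>(row_offset + r) * num_cols +
         static_cast<std::size_t>(col_offset + c);
}

// Column-major payload (kOneByteWithColHeaders branch): each column's
// bytes are contiguous, so skipping a column skips num_rows bytes.
std::size_t ColMajorIndex(int num_rows, int row_offset, int col_offset,
                          int r, int c) {
  assert(num_rows > 0 && row_offset >= 0 && col_offset >= 0 && r >= 0 && c >= 0);
  return static_cast<std::size_t>(col_offset + c) * num_rows +
         static_cast<std::size_t>(row_offset + r);
}
```

In the column-major case, adding row_offset skips rows inside a column and adding col_offset * num_rows skips whole columns.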

◆ Data()

◆ DataSize()

MatrixIndexT DataSize ( const GlobalHeader & header)
static private

Definition at line 28 of file compressed-matrix.cc.

References CompressedMatrix::GlobalHeader::format, KALDI_ASSERT, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, CompressedMatrix::kTwoByte, CompressedMatrix::GlobalHeader::num_cols, and CompressedMatrix::GlobalHeader::num_rows.

Referenced by CompressedMatrix::CompressedMatrix(), CompressedMatrix::CopyFromMat(), CompressedMatrix::operator=(), CompressedMatrix::Read(), and CompressedMatrix::Write().

28  {
29  // Returns size in bytes of the data.
30  DataFormat format = static_cast<DataFormat>(header.format);
31  if (format == kOneByteWithColHeaders) {
32  return sizeof(GlobalHeader) +
33  header.num_cols * (sizeof(PerColHeader) + header.num_rows);
34  } else if (format == kTwoByte) {
35  return sizeof(GlobalHeader) +
36  2 * header.num_rows * header.num_cols;
37  } else {
38  KALDI_ASSERT(format == kOneByte);
39  return sizeof(GlobalHeader) +
40  header.num_rows * header.num_cols;
41  }
42 }
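As a rough illustration of the size computation above, the following sketch mirrors the three branches with stand-in structs. `SizeGlobalHeader`, `SizePerColHeader`, and `ToyDataSize` are hypothetical names. The field names follow the reference lists on this page, and the numeric format codes 1, 2, and 3 follow the comments in Read(); the field order after `format` and the field types are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the global header (field order is assumed).
struct SizeGlobalHeader {
  std::int32_t format;  // 1: one byte + column headers, 2: two byte, 3: one byte
  float min_value;
  float range;
  std::int32_t num_rows;
  std::int32_t num_cols;
};

// Hypothetical mirror of the per-column header: four 16-bit percentiles.
struct SizePerColHeader {
  std::uint16_t percentile_0;
  std::uint16_t percentile_25;
  std::uint16_t percentile_75;
  std::uint16_t percentile_100;
};

// Total bytes needed for the header plus payload in each format.
std::size_t ToyDataSize(const SizeGlobalHeader &h) {
  assert(h.num_rows >= 0 && h.num_cols >= 0);
  const std::size_t rows = static_cast<std::size_t>(h.num_rows);
  const std::size_t cols = static_cast<std::size_t>(h.num_cols);
  switch (h.format) {
    case 1:  // one header per column, then one byte per element
      return sizeof(SizeGlobalHeader) + cols * (sizeof(SizePerColHeader) + rows);
    case 2:  // two bytes per element
      return sizeof(SizeGlobalHeader) + 2 * rows * cols;
    case 3:  // one byte per element
      return sizeof(SizeGlobalHeader) + rows * cols;
    default:
      assert(false && "unknown format");
      return 0;
  }
}
```

For a 10 x 4 matrix the payloads are 4 * (sizeof(SizePerColHeader) + 10), 80, and 40 bytes respectively, each on top of one global header.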

◆ FloatToChar()

uint8 FloatToChar ( float  p0,
float  p25,
float  p75,
float  p100,
float  value 
)
inline static private

Definition at line 457 of file compressed-matrix.cc.

Referenced by CompressedMatrix::CompressColumn().

459  {
460  int ans;
461  if (value < p25) { // range [ p0, p25 ) covered by
462  // characters 0 .. 64. We round to the closest int.
463  float f = (value - p0) / (p25 - p0);
464  ans = static_cast<int>(f * 64 + 0.5);
465  // Note: the checks on the next two lines
466  // are necessary in pathological cases when all the elements in a row
467  // are the same and the percentile_* values are separated by one.
468  if (ans < 0) ans = 0;
469  if (ans > 64) ans = 64;
470  } else if (value < p75) { // range [ p25, p75 )covered
471  // by characters 64 .. 192. We round to the closest int.
472  float f = (value - p25) / (p75 - p25);
473  ans = 64 + static_cast<int>(f * 128 + 0.5);
474  if (ans < 64) ans = 64;
475  if (ans > 192) ans = 192;
476  } else { // range [ p75, p100 ] covered by
477  // characters 192 .. 255. Note: this last range
478  // has fewer characters than the left range, because
479  // we go up to 255, not 256.
480  float f = (value - p75) / (p100 - p75);
481  ans = 192 + static_cast<int>(f * 63 + 0.5);
482  if (ans < 192) ans = 192;
483  if (ans > 255) ans = 255;
484  }
485  return static_cast<uint8>(ans);
486 }
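The piecewise-linear mapping above can be tried in isolation with the sketch below. `ToyFloatToChar` restates the listing. `ToyCharToFloat` is an assumed inverse written for illustration (the body of CharToFloat() is not shown in this section), and the sketch assumes p0 < p25 < p75 < p100 so that no division by zero occurs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Maps value to a code: [p0, p25) -> 0..64, [p25, p75) -> 64..192,
// [p75, p100] -> 192..255, rounding to the nearest integer.
std::uint8_t ToyFloatToChar(float p0, float p25, float p75, float p100,
                            float value) {
  assert(p0 < p25 && p25 < p75 && p75 < p100);
  int ans;
  if (value < p25) {
    float f = (value - p0) / (p25 - p0);
    ans = std::min(64, std::max(0, static_cast<int>(f * 64 + 0.5)));
  } else if (value < p75) {
    float f = (value - p25) / (p75 - p25);
    ans = std::min(192, std::max(64, 64 + static_cast<int>(f * 128 + 0.5)));
  } else {
    float f = (value - p75) / (p100 - p75);
    ans = std::min(255, std::max(192, 192 + static_cast<int>(f * 63 + 0.5)));
  }
  return static_cast<std::uint8_t>(ans);
}

// Assumed inverse: linear interpolation within the segment that
// contains the code.
float ToyCharToFloat(float p0, float p25, float p75, float p100,
                     std::uint8_t code) {
  if (code <= 64)
    return p0 + (p25 - p0) * code * (1.0f / 64.0f);
  else if (code <= 192)
    return p25 + (p75 - p25) * (code - 64) * (1.0f / 128.0f);
  else
    return p75 + (p100 - p75) * (code - 192) * (1.0f / 63.0f);
}
```

The middle segment gets 128 codes for the interquartile range, so values near the median are represented with finer resolution than values in the tails when the tails are wide.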

◆ FloatToUint16()

uint16 FloatToUint16 ( const GlobalHeader & global_header,
float  value 
)
inline static private

Definition at line 347 of file compressed-matrix.cc.

References CompressedMatrix::GlobalHeader::min_value, and CompressedMatrix::GlobalHeader::range.

Referenced by CompressedMatrix::ComputeColHeader(), and CompressedMatrix::CopyFromMat().

349  {
350  float f = (value - global_header.min_value) /
351  global_header.range;
352  if (f > 1.0) f = 1.0; // Note: this should not happen.
353  if (f < 0.0) f = 0.0; // Note: this should not happen.
354  return static_cast<int>(f * 65535 + 0.499); // + 0.499 is to
355  // round to closest int; avoids bias.
356 }

◆ FloatToUint8()

uint8 FloatToUint8 ( const GlobalHeader & global_header,
float  value 
)
inline static private

Definition at line 359 of file compressed-matrix.cc.

References CompressedMatrix::GlobalHeader::min_value, and CompressedMatrix::GlobalHeader::range.

Referenced by CompressedMatrix::CopyFromMat().

361  {
362  float f = (value - global_header.min_value) /
363  global_header.range;
364  if (f > 1.0) f = 1.0; // Note: this should not happen.
365  if (f < 0.0) f = 0.0; // Note: this should not happen.
366  return static_cast<int>(f * 255 + 0.499); // + 0.499 is to
367  // round to closest int; avoids bias.
368 }
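FloatToUint16() and FloatToUint8() apply the same linear quantization with different code ranges. The sketch below folds both into one function parameterised by the largest code (65535 or 255) and adds a matching dequantizer so the round-trip error can be checked. `ToyQuantize` and `ToyDequantize` are hypothetical names, and the sketch assumes range > 0.

```cpp
#include <cassert>
#include <cmath>

// Maps value linearly from [min_value, min_value + range] to
// [0, max_code], clamping out-of-range inputs.
int ToyQuantize(float min_value, float range, int max_code, float value) {
  assert(range > 0.0f && max_code > 0);
  float f = (value - min_value) / range;
  if (f > 1.0f) f = 1.0f;
  if (f < 0.0f) f = 0.0f;
  return static_cast<int>(f * max_code + 0.499);  // round to nearest code
}

// Inverse mapping: code 0 gives min_value, max_code gives min_value + range.
float ToyDequantize(float min_value, float range, int max_code, int code) {
  assert(max_code > 0 && code >= 0 && code <= max_code);
  return min_value + range * code / max_code;
}
```

A round trip through this pair should land within one quantization step (range / max_code) of the input, which is about 256 times finer for 16-bit codes than for 8-bit codes.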

◆ NumCols()

◆ NumRows()

◆ operator=() [1/2]

CompressedMatrix & operator= ( const CompressedMatrix & mat)

Definition at line 863 of file compressed-matrix.cc.

References CompressedMatrix::AllocateData(), CompressedMatrix::Clear(), CompressedMatrix::data_, and CompressedMatrix::DataSize().

Referenced by CompressedMatrix::Data(), and CompressedMatrix::operator=().

863  {
864  Clear(); // now this->data_ == NULL.
865  if (mat.data_ != NULL) {
866  MatrixIndexT data_size = DataSize(*static_cast<GlobalHeader*>(mat.data_));
867  data_ = AllocateData(data_size);
868  memcpy(static_cast<void*>(data_),
869  static_cast<void*>(mat.data_),
870  data_size);
871  }
872  return *this;
873 }

◆ operator=() [2/2]

CompressedMatrix & operator= ( const MatrixBase< Real > &  mat)

Definition at line 335 of file compressed-matrix.cc.

References CompressedMatrix::CopyFromMat(), and CompressedMatrix::operator=().

335  {
336  this->CopyFromMat(mat);
337  return *this;
338 }

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)

Definition at line 566 of file compressed-matrix.cc.

References CompressedMatrix::AllocateData(), CompressedMatrix::CopyFromMat(), CompressedMatrix::data_, CompressedMatrix::DataSize(), CompressedMatrix::GlobalHeader::format, KALDI_ERR, CompressedMatrix::GlobalHeader::num_cols, kaldi::Peek(), Matrix< Real >::Read(), and kaldi::ReadToken().

Referenced by CompressedMatrix::Data(), CompressedAffineXformStats::Read(), NnetExample::Read(), DiscriminativeNnetExample::Read(), Matrix< BaseFloat >::Read(), and kaldi::UnitTestCompressedMatrix().

566  {
567  if (data_ != NULL) {
568  delete [] (static_cast<float*>(data_));
569  data_ = NULL;
570  }
571  if (binary) {
572  int peekval = Peek(is, binary);
573  if (peekval == 'C') {
574  std::string tok; // Should be CM (format 1), CM2 (format 2) or CM3 (format 3)
575  ReadToken(is, binary, &tok);
576  GlobalHeader h;
577  if (tok == "CM") { h.format = 1; } // kOneByteWithColHeaders
578  else if (tok == "CM2") { h.format = 2; } // kTwoByte
579  else if (tok == "CM3") { h.format = 3; } // kOneByte
580  else {
581  KALDI_ERR << "Unexpected token " << tok << ", expecting CM, CM2 or CM3";
582  }
583  // don't read the "format" -> hence + 4, - 4.
584  is.read(reinterpret_cast<char*>(&h) + 4, sizeof(h) - 4);
585  if (is.fail())
586  KALDI_ERR << "Failed to read header";
587  if (h.num_cols == 0) // empty matrix.
588  return;
589  int32 size = DataSize(h), remaining_size = size - sizeof(GlobalHeader);
590  data_ = AllocateData(size);
591  *(reinterpret_cast<GlobalHeader*>(data_)) = h;
592  is.read(reinterpret_cast<char*>(data_) + sizeof(GlobalHeader),
593  remaining_size);
594  } else {
595  // Assume that what we're reading is a regular Matrix. This might be the
596  // case if you changed your code, making a Matrix into a CompressedMatrix,
597  // and you want back-compatibility for reading.
598  Matrix<BaseFloat> M;
599  M.Read(is, binary); // This will crash if it was not a Matrix.
600  this->CopyFromMat(M);
601  }
602  } else { // Text-mode read. In this case you don't get to
603  // choose the compression type. Anyway this branch would only
604  // be taken when debugging.
605  Matrix<BaseFloat> temp;
606  temp.Read(is, binary);
607  this->CopyFromMat(temp);
608  }
609  if (is.fail())
610  KALDI_ERR << "Failed to read data.";
611 }

◆ Scale()

void Scale ( float  alpha)

Scales all elements of the matrix by alpha.

It scales the floating point values in GlobalHeader by alpha.

Definition at line 46 of file compressed-matrix.cc.

References CompressedMatrix::data_, CompressedMatrix::GlobalHeader::min_value, and CompressedMatrix::GlobalHeader::range.

Referenced by CompressedMatrix::Swap(), and kaldi::UnitTestCompressedMatrix().

46  {
47  if (data_ != NULL) {
48  GlobalHeader *h = reinterpret_cast<GlobalHeader*>(data_);
49  // scale the floating point values in the GlobalHeader
50  // and leave all integers the same.
51  h->min_value *= alpha;
52  h->range *= alpha;
53  }
54 }
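Scaling only the header works because every decoded value is linear in min_value and range: alpha * (min_value + code * range / 255) equals (alpha * min_value) + code * (alpha * range) / 255. The sketch below checks this for the one-byte case. `ScaleHeader`, `ScaleDecode`, and `ToyScale` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the two floating-point header fields.
struct ScaleHeader {
  float min_value;
  float range;
};

// Decodes one 8-bit code with the given header.
float ScaleDecode(const ScaleHeader &h, std::uint8_t code) {
  return h.min_value + code * (h.range * (1.0f / 255.0f));
}

// Scales the header in place; the stored integer codes are untouched.
void ToyScale(ScaleHeader *h, float alpha) {
  assert(h != nullptr);
  h->min_value *= alpha;
  h->range *= alpha;
}
```

No payload bytes are rewritten, so the cost of scaling does not depend on the matrix size.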

◆ Swap()

void Swap ( CompressedMatrix * other)
inline

Definition at line 171 of file compressed-matrix.h.

References CompressedMatrix::Clear(), CompressedMatrix::data_, CompressedMatrix::Scale(), and kaldi::swap().

Referenced by CompressedMatrix::CompressedMatrix(), NnetExample::NnetExample(), and GeneralMatrix::SwapCompressedMatrix().

171 { std::swap(data_, other->data_); }

◆ Uint16ToFloat()

float Uint16ToFloat ( const GlobalHeader & global_header,
uint16  value 
)
inline static private

Definition at line 371 of file compressed-matrix.cc.

References CompressedMatrix::GlobalHeader::min_value, and CompressedMatrix::GlobalHeader::range.

Referenced by CompressedMatrix::CompressColumn(), CompressedMatrix::CopyColToVec(), CompressedMatrix::CopyRowToVec(), and CompressedMatrix::CopyToMat().

373  {
374  // the constant 1.52590218966964e-05 is 1/65535.
375  return global_header.min_value
376  + global_header.range * 1.52590218966964e-05F * value;
377 }

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const

Definition at line 531 of file compressed-matrix.cc.

References CompressedMatrix::CopyToMat(), CompressedMatrix::data_, CompressedMatrix::DataSize(), CompressedMatrix::GlobalHeader::format, KALDI_ERR, CompressedMatrix::kOneByte, CompressedMatrix::kOneByteWithColHeaders, CompressedMatrix::kTwoByte, kaldi::kUndefined, CompressedMatrix::GlobalHeader::min_value, CompressedMatrix::GlobalHeader::num_cols, CompressedMatrix::GlobalHeader::num_rows, CompressedMatrix::NumCols(), CompressedMatrix::NumRows(), CompressedMatrix::GlobalHeader::range, MatrixBase< Real >::Write(), and kaldi::WriteToken().

Referenced by CompressedMatrix::Data(), kaldi::UnitTestCompressedMatrix(), CompressedAffineXformStats::Write(), NnetExample::Write(), and DiscriminativeNnetExample::Write().

531  {
532  if (binary) { // Binary-mode write:
533  if (data_ != NULL) {
534  GlobalHeader &h = *reinterpret_cast<GlobalHeader*>(data_);
535  DataFormat format = static_cast<DataFormat>(h.format);
536  if (format == kOneByteWithColHeaders) {
537  WriteToken(os, binary, "CM");
538  } else if (format == kTwoByte) {
539  WriteToken(os, binary, "CM2");
540  } else if (format == kOneByte) {
541  WriteToken(os, binary, "CM3");
542  }
543  MatrixIndexT size = DataSize(h); // total size of data in data_
544  // We don't write out the "int32 format", hence the + 4, - 4.
545  os.write(reinterpret_cast<const char*>(data_) + 4, size - 4);
546  } else { // special case: where data_ == NULL, we treat it as an empty
547  // matrix.
548  WriteToken(os, binary, "CM");
549  GlobalHeader h;
550  h.range = h.min_value = 0.0;
551  h.num_rows = h.num_cols = 0;
552  os.write(reinterpret_cast<const char*>(&h), sizeof(h));
553  }
554  } else {
555  // In text mode, just use the same format as a regular matrix.
556  // This is not compressed.
557  Matrix<BaseFloat> temp_mat(this->NumRows(), this->NumCols(),
558  kUndefined);
559  this->CopyToMat(&temp_mat);
560  temp_mat.Write(os, binary);
561  }
562  if (os.fail())
563  KALDI_ERR << "Error writing compressed matrix to stream.";
564 }
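In the non-empty binary branch, the token ("CM", "CM2" or "CM3") carries the format, so the leading format field of the header is skipped on write and filled in from the token on read. The sketch below round-trips a header this way. It is a simplified illustration, not Kaldi's I/O code: `WireHeader` and the `Toy*` functions are hypothetical, the field order after `format` is assumed, and the token is written as plain text followed by a single space.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical header; format comes first so it can be skipped by offset.
struct WireHeader {
  std::int32_t format;
  float min_value;
  float range;
  std::int32_t num_rows;
  std::int32_t num_cols;
};

// Returns the token for a format code, or "" if the code is unknown.
std::string ToyTokenForFormat(std::int32_t format) {
  switch (format) {
    case 1: return "CM";
    case 2: return "CM2";
    case 3: return "CM3";
    default: return "";
  }
}

// Returns the format code for a token, or 0 if the token is unknown.
std::int32_t ToyFormatForToken(const std::string &tok) {
  if (tok == "CM") return 1;
  if (tok == "CM2") return 2;
  if (tok == "CM3") return 3;
  return 0;
}

// Writes the token, a space, then the header without its format field.
void ToyWriteHeader(std::ostream &os, const WireHeader &h) {
  os << ToyTokenForFormat(h.format) << ' ';
  os.write(reinterpret_cast<const char*>(&h) + sizeof(h.format),
           sizeof(h) - sizeof(h.format));
}

// Reads the token, derives the format from it, then reads the rest.
bool ToyReadHeader(std::istream &is, WireHeader *h) {
  std::string tok;
  is >> tok;
  is.get();  // consume the single space after the token
  h->format = ToyFormatForToken(tok);
  if (h->format == 0) return false;
  is.read(reinterpret_cast<char*>(h) + sizeof(h->format),
          sizeof(*h) - sizeof(h->format));
  return !is.fail();
}
```

Deriving the format from the token keeps the on-disk header four bytes shorter than the in-memory one.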

Friends And Related Function Documentation

◆ Matrix< double >

friend class Matrix< double >
friend

Definition at line 180 of file compressed-matrix.h.

◆ Matrix< float >

friend class Matrix< float >
friend

Definition at line 179 of file compressed-matrix.h.

Member Data Documentation

◆ data_


The documentation for this class was generated from the following files: