"Classes for opening streams"

This group contains the Input and Output classes, which are provided to open streams for reading and writing in Kaldi code; for an explanation of how this fits into the bigger picture of Kaldi I/O, see How to open files in Kaldi.

Classes

class  Output
 
class  Input
 

Enumerations

enum  OutputType { kNoOutput, kFileOutput, kStandardOutput, kPipeOutput }
 
enum  InputType {
  kNoInput, kFileInput, kStandardInput, kOffsetFileInput,
  kPipeInput
}
 

Functions

OutputType ClassifyWxfilename (const std::string &wxfilename)
 ClassifyWxfilename interprets filenames for writing (rules under Function Documentation).
 
InputType ClassifyRxfilename (const std::string &rxfilename)
 ClassifyRxfilename interprets filenames for reading (rules under Function Documentation).
 
template<class C >
void ReadKaldiObject (const std::string &filename, C *c)
 
template<>
void ReadKaldiObject (const std::string &filename, Matrix< float > *m)
 
template<>
void ReadKaldiObject (const std::string &filename, Matrix< double > *m)
 
template<class C >
void WriteKaldiObject (const C &c, const std::string &filename, bool binary)
 
std::string PrintableRxfilename (std::string rxfilename)
 PrintableRxfilename turns the rxfilename into a more human-readable form for error reporting.
 
std::string PrintableWxfilename (std::string wxfilename)
 PrintableWxfilename turns the wxfilename into a more human-readable form for error reporting.
 

Detailed Description

This group contains the Input and Output classes, which are provided to open streams for reading and writing in Kaldi code; for an explanation of how this fits into the bigger picture of Kaldi I/O, see How to open files in Kaldi.

Enumeration Type Documentation

enum InputType
Enumerator
kNoInput 
kFileInput 
kStandardInput 
kOffsetFileInput 
kPipeInput 

Definition at line 105 of file kaldi-io.h.

enum OutputType
Enumerator
kNoOutput 
kFileOutput 
kStandardOutput 
kPipeOutput 

Definition at line 89 of file kaldi-io.h.

Function Documentation

InputType ClassifyRxfilename ( const std::string &  rxfilename)

ClassifyRxfilename interprets filenames for reading as follows:

  • kNoInput: invalid filenames (leading or trailing space, things that look like wspecifiers or rspecifiers, or pipes to write to, which have a leading |).
  • kFileInput: normal filenames
  • kStandardInput: the empty string or "-"
  • kPipeInput: pipes to read from, e.g. "gunzip -c some_file.gz |"
  • kOffsetFileInput: offsets into files, e.g. /some/filename:12970
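
The rules for reading can be restated as a small standalone sketch. It is not the Kaldi implementation: `RxKind` and `SketchClassifyRx` are hypothetical names introduced here so the example compiles without Kaldi headers, and the warning that Kaldi logs for a misplaced pipe is omitted. The branch order follows the source listing in this section, where a leading | is rejected and a trailing | marks an input pipe.

```cpp
#include <cctype>
#include <string>

// Local stand-in for kaldi::InputType (hypothetical; lets the sketch compile alone).
enum class RxKind { kNone, kFile, kStdin, kOffsetFile, kPipe };

// Simplified restatement of the classification rules for filenames to read.
RxKind SketchClassifyRx(const std::string &s) {
  auto is_space = [](char ch) {
    return std::isspace(static_cast<unsigned char>(ch)) != 0;
  };
  if (s.empty() || s == "-") return RxKind::kStdin;
  if (s.front() == '|') return RxKind::kNone;  // output pipe: cannot be read
  if (is_space(s.front()) || is_space(s.back())) return RxKind::kNone;
  if ((s[0] == 't' || s[0] == 'b') && s.size() > 1 && s[1] == ',')
    return RxKind::kNone;  // looks like an rspecifier or wspecifier
  if (s.back() == '|') return RxKind::kPipe;  // command whose output is read

  // Skip back over any trailing digits.
  std::string::size_type i = s.size();
  while (i > 0 && std::isdigit(static_cast<unsigned char>(s[i - 1]))) --i;
  if (i < s.size()) {
    // Trailing digits preceded by ':' mean an offset into a file.
    return (i > 0 && s[i - 1] == ':') ? RxKind::kOffsetFile : RxKind::kFile;
  }
  // A '|' anywhere else is treated as a misplaced pipe.
  if (s.find('|') != std::string::npos) return RxKind::kNone;
  return RxKind::kFile;
}
```

Under these rules "gunzip -c some_file.gz |" classifies as a pipe, "/some/filename:12970" as an offset into a file, and "| gzip -c > blah.gz" as invalid for reading.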

Definition at line 130 of file kaldi-io.cc.

References KALDI_WARN, kaldi::kFileInput, kaldi::kNoInput, kaldi::kOffsetFileInput, kaldi::kPipeInput, and kaldi::kStandardInput.

Referenced by Input::OpenInternal(), and kaldi::UnitTestClassifyRxfilename().

InputType ClassifyRxfilename(const std::string &filename) {
  const char *c = filename.c_str();
  if (*c == '\0' || (*c == '-' && c[1] == '\0'))
    return kStandardInput;  // "" or "-".
  else if (*c == '|')
    return kNoInput;  // An output pipe like "|blah": not valid for input.
  else if (isspace(*c) || isspace(c[filename.length()-1]))
    return kNoInput;  // Leading or trailing space.
  else if ((*c == 't' || *c == 'b') && c[1] == ',') {
    // We have detected that the user has supplied a wspecifier
    // or rspecifier (as in kaldi-table.h) where an rxfilename was
    // needed. Since this is almost certain not to be a real filename
    // (and would cause a lot of confusion if it were a real filename), we
    // refuse to deal with it upfront.
    return kNoInput;
  } else {
    const char *d = c;
    while (d[1] != '\0') d++;  // go to last char.
    if (*d == '|') return kPipeInput;  // an input pipe.
    if (isspace(*d)) return kNoInput;  // trailing space, which is never valid.
    else if (isdigit(*d)) {
      // OK, it could be an offset into a file.
      while (isdigit(*d) && d > c) d--;
      if (*d == ':') return kOffsetFileInput;  // Filename is like some_file:12345
      else return kFileInput;
    } else {
      // At this point it matched no other pattern so we assume a filename, but
      // we check for '|' as it's a common source of errors to have pipe
      // commands without the pipe in the right place. Say that it can't be
      // classified in this case.
      if (strchr(c, '|') != NULL) {
        KALDI_WARN << "Trying to classify rxfilename with pipe symbol in the"
            " wrong place (pipe without | at the end?): " << filename;
        return kNoInput;
      }
      return kFileInput;  // matched no other pattern: assume it's an actual
                          // filename.
    }
  }
}
OutputType ClassifyWxfilename ( const std::string &  wxfilename)

ClassifyWxfilename interprets filenames for writing as follows:

  • kNoOutput: invalid filenames (leading or trailing space, things that look like wspecifiers or rspecifiers, or pipes to read from, which have a trailing |).
  • kFileOutput: normal filenames
  • kStandardOutput: the empty string or "-", interpreted as standard output
  • kPipeOutput: pipes to write to, e.g. "| gzip -c > blah.gz"
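
The write-side rules differ from the read-side ones in two places: a leading | is accepted as an output pipe, and a name ending in `:digits` is rejected because nothing can be written at an offset into a file. The sketch below restates that under the same caveats as any illustration: `WxKind` and `SketchClassifyWx` are hypothetical names, not Kaldi's, and the logged warning is left out.

```cpp
#include <cctype>
#include <string>

// Local stand-in for kaldi::OutputType (hypothetical).
enum class WxKind { kNone, kFile, kStdout, kPipe };

// Simplified restatement of the classification rules for filenames to write.
WxKind SketchClassifyWx(const std::string &s) {
  auto is_space = [](char ch) {
    return std::isspace(static_cast<unsigned char>(ch)) != 0;
  };
  if (s.empty() || s == "-") return WxKind::kStdout;
  if (s.front() == '|') return WxKind::kPipe;  // command that receives output
  if (is_space(s.front()) || is_space(s.back())) return WxKind::kNone;
  if ((s[0] == 't' || s[0] == 'b') && s.size() > 1 && s[1] == ',')
    return WxKind::kNone;  // looks like an rspecifier or wspecifier
  if (s.back() == '|') return WxKind::kNone;  // input pipe: cannot be written

  // Skip back over any trailing digits.
  std::string::size_type i = s.size();
  while (i > 0 && std::isdigit(static_cast<unsigned char>(s[i - 1]))) --i;
  if (i < s.size()) {
    // "some_file:12345" would later be read back as an offset, so reject it.
    return (i > 0 && s[i - 1] == ':') ? WxKind::kNone : WxKind::kFile;
  }
  if (s.find('|') != std::string::npos) return WxKind::kNone;  // misplaced pipe
  return WxKind::kFile;
}
```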

Definition at line 82 of file kaldi-io.cc.

References KALDI_WARN, kaldi::kFileOutput, kaldi::kNoOutput, kaldi::kPipeOutput, and kaldi::kStandardOutput.

Referenced by Output::Open(), TableWriterBothImpl< Holder >::Open(), kaldi::UnitTestClassifyWxfilename(), and Output::~Output().

OutputType ClassifyWxfilename(const std::string &filename) {
  const char *c = filename.c_str();
  if (*c == '\0' || (*c == '-' && c[1] == '\0'))
    return kStandardOutput;  // "" or "-".
  else if (*c == '|')
    return kPipeOutput;  // An output pipe like "|blah".
  else if (isspace(*c) || isspace(c[filename.length()-1]))
    return kNoOutput;  // Leading or trailing space: can't interpret this.
  else if ((*c == 't' || *c == 'b') && c[1] == ',') {
    // We have detected that the user has supplied a wspecifier
    // or rspecifier (as in kaldi-table.h) where a wxfilename was
    // needed. Since this is almost certain not to be a real filename
    // (and would cause confusion if it were a real filename), we
    // refuse to deal with it.
    return kNoOutput;
  } else {
    const char *d = c;
    while (d[1] != '\0') d++;  // go to last char.
    if (*d == '|' || isspace(*d))
      return kNoOutput;  // An input pipe (not allowed in this context) or
                         // trailing space, which is just wrong.
    else if (isdigit(*d)) {
      // OK, it could be a file, but we have to see if it's an offset into a
      // file, which is not allowed.
      while (isdigit(*d) && d > c) d--;
      // A filename like some_file:12345 is not allowed, as we cannot write to
      // an offset into a file (and if we interpreted it as an actual filename,
      // the reading code would misinterpret it as an offset).
      if (*d == ':') return kNoOutput;
      else return kFileOutput;
    } else {
      // At this point it matched no other pattern so we assume a filename, but
      // we check for '|' as it's a common source of errors to have pipe
      // commands without the pipe in the right place. Say that it can't be
      // classified.
      if (strchr(c, '|') != NULL) {
        KALDI_WARN << "Trying to classify wxfilename with pipe symbol in the"
            " wrong place (pipe without | at the beginning?): " <<
            filename;
        return kNoOutput;
      }
      return kFileOutput;  // matched no other pattern: assume it's an actual
                           // filename.
    }
  }
}
std::string PrintableRxfilename ( std::string  rxfilename)

PrintableRxfilename turns the rxfilename into a more human-readable form for error reporting, i.e. it does quoting and escaping and replaces "" or "-" with "standard input".
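
As a rough illustration of the behavior described here, the sketch below maps "" and "-" to a readable stream name and quotes anything that could be misread in a log line. `SketchPrintable` is a hypothetical helper, and its quoting rule is an assumption made for the example; the Kaldi function delegates quoting to ParseOptions::Escape, whose exact rules are not reproduced.

```cpp
#include <string>

// Hypothetical helper covering both the read and the write variant.
// `std_name` is the text to show for "" or "-", e.g. "standard input".
std::string SketchPrintable(const std::string &name,
                            const std::string &std_name) {
  if (name.empty() || name == "-") return std_name;
  // Assumed, simplified quoting rule: names containing whitespace, quotes or
  // a pipe are wrapped in single quotes. Embedded single quotes are not
  // escaped here; ParseOptions::Escape may behave differently.
  if (name.find_first_of(" \t'\"|") == std::string::npos) return name;
  return "'" + name + "'";
}
```

Quoting matters mostly for pipe commands such as "gunzip -c a.gz |", where unquoted spaces make it hard to see where the name ends inside an error message.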

Definition at line 58 of file kaldi-io.cc.

References ParseOptions::Escape().

Referenced by SequentialTableReaderArchiveImpl< Holder >::Close(), SequentialTableReaderScriptImpl< Holder >::EnsureObjectLoaded(), RandomAccessTableReaderSortedArchiveImpl< Holder >::FindKeyInternal(), kaldi::GetUtterancePairs(), RandomAccessTableReaderMapped< Holder >::HasKey(), RandomAccessTableReaderScriptImpl< Holder >::HasKeyInternal(), Input::Input(), main(), SequentialTableReaderArchiveImpl< Holder >::Next(), SequentialTableReaderScriptImpl< Holder >::Open(), PipeInputImpl::Open(), SequentialTableReaderArchiveImpl< Holder >::Open(), TableWriterScriptImpl< Holder >::Open(), RandomAccessTableReaderScriptImpl< Holder >::Open(), RandomAccessTableReaderArchiveImplBase< Holder >::Open(), Input::OpenInternal(), fst::ReadFstKaldi(), RandomAccessTableReaderArchiveImplBase< Holder >::ReadNextObject(), kaldi::ReadPhoneMap(), kaldi::ReadScriptFile(), kaldi::ReadSharedPhonesList(), kaldi::ReadSymbolList(), SequentialTableReaderScriptImpl< Holder >::Value(), RandomAccessTableReaderMapped< Holder >::Value(), RandomAccessTableReaderDSortedArchiveImpl< Holder >::Value(), RandomAccessTableReaderSortedArchiveImpl< Holder >::Value(), RandomAccessTableReaderUnsortedArchiveImpl< Holder >::Value(), TableWriterScriptImpl< Holder >::Write(), SequentialTableReaderArchiveImpl< Holder >::~SequentialTableReaderArchiveImpl(), and SequentialTableReaderScriptImpl< Holder >::~SequentialTableReaderScriptImpl().

std::string PrintableRxfilename(std::string rxfilename) {
  if (rxfilename == "" || rxfilename == "-") {
    return "standard input";
  } else {
    // If this call to Escape later causes compilation issues,
    // just replace it with "return rxfilename"; it's only a
    // pretty-printing issue.
    return ParseOptions::Escape(rxfilename);
  }
}
std::string PrintableWxfilename ( std::string  wxfilename)

PrintableWxfilename turns the wxfilename into a more human-readable form for error reporting, i.e. it does quoting and escaping and replaces "" or "-" with "standard output".

Definition at line 70 of file kaldi-io.cc.

References ParseOptions::Escape().

Referenced by main(), Output::Open(), Output::Output(), kaldi::TypeThreeUsage(), kaldi::TypeTwoUsage(), TableWriterArchiveImpl< Holder >::Write(), TableWriterScriptImpl< Holder >::Write(), TableWriterBothImpl< Holder >::Write(), fst::WriteFstKaldi(), kaldi::WriteScriptFile(), Output::~Output(), and PipeOutputImpl::~PipeOutputImpl().

std::string PrintableWxfilename(std::string wxfilename) {
  if (wxfilename == "" || wxfilename == "-") {
    return "standard output";
  } else {
    // If this call to Escape later causes compilation issues,
    // just replace it with "return wxfilename"; it's only a
    // pretty-printing issue.
    return ParseOptions::Escape(wxfilename);
  }
}
void kaldi::ReadKaldiObject ( const std::string &  filename, C *  c )

Definition at line 239 of file kaldi-io.h.

References Input::Stream().

template<class C>
void ReadKaldiObject(const std::string &filename, C *c) {
  bool binary_in;
  Input ki(filename, &binary_in);
  c->Read(ki.Stream(), binary_in);
}
void ReadKaldiObject ( const std::string &  filename, Matrix< float > *  m )

Definition at line 818 of file kaldi-io.cc.

References kaldi::ExtractObjectRange(), kaldi::ExtractRangeSpecifier(), KALDI_ERR, Matrix< Real >::Read(), and Input::Stream().

Referenced by kaldi::BuildConstArpaLm(), kaldi::Compile(), ComputeLogPosteriors(), ComputeScores(), NaturalGradientAffineComponent::Init(), AffineComponent::Init(), AffineComponentPreconditioned::Init(), AffineComponentPreconditionedOnline::Init(), PerElementScaleComponent::Init(), PerElementOffsetComponent::Init(), ConvolutionComponent::Init(), Convolutional1dComponent::Init(), FixedScaleComponent::InitFromConfig(), FixedBiasComponent::InitFromConfig(), FixedScaleComponent::InitFromString(), FixedBiasComponent::InitFromString(), main(), and kaldi::RunPerSpeaker().

template<>
void ReadKaldiObject(const std::string &filename, Matrix<float> *m) {
  if (!filename.empty() && filename[filename.size() - 1] == ']') {
    // This filename seems to have a 'range'... like foo.ark:4312423[20:30].
    // (the bit in square brackets is the range).
    std::string rxfilename, range;
    if (!ExtractRangeSpecifier(filename, &rxfilename, &range)) {
      KALDI_ERR << "Could not make sense of possible range specifier in filename "
                << "while reading matrix: " << filename;
    }
    Matrix<float> temp;
    bool binary_in;
    Input ki(rxfilename, &binary_in);
    temp.Read(ki.Stream(), binary_in);
    if (!ExtractObjectRange(temp, range, m)) {
      KALDI_ERR << "Error extracting range of object: " << filename;
    }
  } else {
    // The normal case, there is no range.
    bool binary_in;
    Input ki(filename, &binary_in);
    m->Read(ki.Stream(), binary_in);
  }
}
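
The first step of this specialization separates a name such as foo.ark:4312423[20:30] into the part to open and the bracketed range. The sketch below shows one way to do that split with the standard library only. `SketchSplitRange` is a hypothetical helper, not Kaldi's ExtractRangeSpecifier, and it may accept or reject edge cases differently (it splits at the last `[`, for instance).

```cpp
#include <string>

// Splits "name[range]" into `name` and `range`. Returns false when the input
// has no trailing "[...]" part, when the name is empty, or when the range is
// empty. Outputs are left untouched on failure.
bool SketchSplitRange(const std::string &with_range,
                      std::string *rxfilename, std::string *range) {
  if (with_range.empty() || with_range.back() != ']') return false;
  const std::string::size_type open = with_range.rfind('[');
  if (open == std::string::npos || open == 0) return false;
  if (open + 2 >= with_range.size()) return false;  // "[]" has no content
  *rxfilename = with_range.substr(0, open);
  *range = with_range.substr(open + 1, with_range.size() - open - 2);
  return true;
}
```

For the example name in the code comment, this yields "foo.ark:4312423" as the part to open and "20:30" as the range, which the specialization then applies to the fully loaded matrix.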
void ReadKaldiObject ( const std::string &  filename, Matrix< double > *  m )

Definition at line 843 of file kaldi-io.cc.

References kaldi::ExtractObjectRange(), kaldi::ExtractRangeSpecifier(), KALDI_ERR, Matrix< Real >::Read(), and Input::Stream().

template<>
void ReadKaldiObject(const std::string &filename, Matrix<double> *m) {
  if (!filename.empty() && filename[filename.size() - 1] == ']') {
    // This filename seems to have a 'range'... like foo.ark:4312423[20:30].
    // (the bit in square brackets is the range).
    std::string rxfilename, range;
    if (!ExtractRangeSpecifier(filename, &rxfilename, &range)) {
      KALDI_ERR << "Could not make sense of possible range specifier in filename "
                << "while reading matrix: " << filename;
    }
    Matrix<double> temp;
    bool binary_in;
    Input ki(rxfilename, &binary_in);
    temp.Read(ki.Stream(), binary_in);
    if (!ExtractObjectRange(temp, range, m)) {
      KALDI_ERR << "Error extracting range of object: " << filename;
    }
  } else {
    // The normal case, there is no range.
    bool binary_in;
    Input ki(filename, &binary_in);
    m->Read(ki.Stream(), binary_in);
  }
}
void kaldi::WriteKaldiObject ( const C &  c, const std::string &  filename, bool  binary )
inline