VectorClusterable Class Reference

VectorClusterable wraps vectors in a form accessible to generic clustering algorithms.

#include <clusterable-classes.h>

Public Member Functions

 VectorClusterable ()
 
 VectorClusterable (const Vector< BaseFloat > &vector, BaseFloat weight)
 
virtual std::string Type () const
 Return a string that describes the inherited type.
 
virtual BaseFloat Objf () const
 Return the objective function associated with the stats [assuming ML estimation].
 
virtual void SetZero ()
 Set stats to empty.
 
virtual void Add (const Clusterable &other_in)
 Add other stats.
 
virtual void Sub (const Clusterable &other_in)
 Subtract other stats.
 
virtual BaseFloat Normalizer () const
 Return the normalizer (typically, count) associated with the stats.
 
virtual Clusterable * Copy () const
 Return a copy of this object.
 
virtual void Scale (BaseFloat f)
 Scale the stats by a positive number f [not mandatory to supply this].
 
virtual void Write (std::ostream &os, bool binary) const
 Write data to stream.
 
virtual Clusterable * ReadNew (std::istream &is, bool binary) const
 Read data from a stream and return the corresponding object (const function; it's a class member because we need access to the vtable so generic code can read derived types).
 
virtual ~VectorClusterable ()
 
Public Member Functions inherited from Clusterable
virtual ~Clusterable ()
 
virtual BaseFloat ObjfPlus (const Clusterable &other) const
 Return the objective function of the combined object this + other.
 
virtual BaseFloat ObjfMinus (const Clusterable &other) const
 Return the objective function of the subtracted object this - other.
 
virtual BaseFloat Distance (const Clusterable &other) const
 Return the objective function decrease from merging the two clusters, negated to be a positive number (or zero).
 

Private Member Functions

void Read (std::istream &is, bool binary)
 

Private Attributes

double weight_
 
Vector< double > stats_
 
double sumsq_
 

Detailed Description

VectorClusterable wraps vectors in a form accessible to generic clustering algorithms.

Each vector has an associated weight, which may simply be 1.0. The objective function (to be maximized) is the negated weighted sum of squared distances from the cluster center to each vector, with each squared distance multiplied by that vector's weight.

Definition at line 121 of file clusterable-classes.h.
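
The three private attributes are enough to evaluate this objective without keeping the individual vectors. As a sketch, write x_i for the vectors and w_i for their weights; the symbols W, s, q, and mu are notation assumed here for weight_, stats_, sumsq_, and the cluster center, and do not appear in the source. Expanding the weighted squared distance gives the expression that the Objf() listing on this page computes:

```latex
\begin{aligned}
W &= \sum_i w_i, \qquad
s = \sum_i w_i x_i, \qquad
q = \sum_i w_i \, x_i^\top x_i, \qquad
\mu = \frac{s}{W}, \\
\sum_i w_i \lVert x_i - \mu \rVert^2
  &= q - 2\,\mu^\top s + W\,\mu^\top \mu
   = q - \frac{s^\top s}{W}, \\
\mathrm{Objf} &= -\Bigl( q - \frac{s^\top s}{W} \Bigr) \;\le\; 0 .
\end{aligned}
```

In exact arithmetic the result is never positive; a small positive value can only come from rounding, which is presumably why the implementation clamps it to zero.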

Constructor & Destructor Documentation

◆ VectorClusterable() [1/2]

VectorClusterable ( )
inline

Definition at line 123 of file clusterable-classes.h.

◆ VectorClusterable() [2/2]

VectorClusterable ( const Vector< BaseFloat > &  vector,
BaseFloat  weight 
)

Definition at line 296 of file clusterable-classes.cc.

References KALDI_ASSERT, VectorBase< Real >::Scale(), VectorClusterable::stats_, VectorClusterable::sumsq_, and kaldi::VecVec().

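VectorClusterable::VectorClusterable(const Vector<BaseFloat> &vector,
                                     BaseFloat weight):
    weight_(weight), stats_(vector), sumsq_(0.0) {
  // stats_ holds the weighted vector, sumsq_ the weighted squared norm.
  stats_.Scale(weight);
  KALDI_ASSERT(weight >= 0.0);
  sumsq_ = VecVec(vector, vector) * weight;
}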

◆ ~VectorClusterable()

virtual ~VectorClusterable ( )
inline virtual

Definition at line 139 of file clusterable-classes.h.

virtual ~VectorClusterable() {}

Member Function Documentation

◆ Add()

void Add ( const Clusterable &  other_in )
virtual

Add other stats.

Implements Clusterable.

Definition at line 225 of file clusterable-classes.cc.

References KALDI_ASSERT, Clusterable::Type(), and VectorClusterable::weight_.

Referenced by kaldi::TestClusterUtilsVector().

void VectorClusterable::Add(const Clusterable &other_in) {
  // Only another VectorClusterable may be added; the cast relies on this check.
  KALDI_ASSERT(other_in.Type() == "vector");
  const VectorClusterable *other =
      static_cast<const VectorClusterable*>(&other_in);
  weight_ += other->weight_;
  stats_.AddVec(1.0, other->stats_);
  sumsq_ += other->sumsq_;
}
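
Because Add() only sums the three sufficient statistics, the cost of merging two clusters can be evaluated without revisiting the underlying vectors. The sketch below is a self-contained illustration, not Kaldi code: Stats1D, Objf, Add, and MergeCost are hypothetical stand-ins restricted to one dimension, and the empty-cluster test is simplified to weight <= 0. MergeCost follows the description of the inherited Distance() in the member list (the objective decrease from merging, as a nonnegative number).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1-D stand-in for (weight_, stats_, sumsq_).
struct Stats1D {
  double weight;  // sum of weights
  double stat;    // weighted sum of points
  double sumsq;   // weighted sum of squared points
};

// Negated weighted sum of squared distances to the mean, clamped at zero.
// Simplification: an empty cluster is detected with weight <= 0.
inline double Objf(const Stats1D &c) {
  if (c.weight <= 0.0) return 0.0;
  double ans = -(c.sumsq - c.stat * c.stat / c.weight);
  return ans > 0.0 ? 0.0 : ans;
}

// Merging two clusters is an element-wise sum of their statistics.
inline Stats1D Add(Stats1D a, const Stats1D &b) {
  a.weight += b.weight;
  a.stat += b.stat;
  a.sumsq += b.sumsq;
  return a;
}

// Objective lost by merging a and b, reported as a nonnegative number.
inline double MergeCost(const Stats1D &a, const Stats1D &b) {
  return Objf(a) + Objf(b) - Objf(Add(a, b));
}
```

For two unit-weight points at 0 and 2, each single-point cluster has objective 0 and the merged cluster has objective -2, so MergeCost returns 2.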

◆ Copy()

Clusterable * Copy ( ) const
virtual

Return a copy of this object.

Implements Clusterable.

Definition at line 255 of file clusterable-classes.cc.

References VectorClusterable::stats_, VectorClusterable::sumsq_, and VectorClusterable::weight_.

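Clusterable* VectorClusterable::Copy() const {
  // Heap-allocated copy; the caller is responsible for deleting it.
  VectorClusterable *ans = new VectorClusterable();
  ans->weight_ = weight_;
  ans->sumsq_ = sumsq_;
  ans->stats_ = stats_;
  return ans;
}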

◆ Normalizer()

virtual BaseFloat Normalizer ( ) const
inline virtual

Return the normalizer (typically, count) associated with the stats.

Implements Clusterable.

Definition at line 134 of file clusterable-classes.h.

References VectorClusterable::weight_.

virtual BaseFloat Normalizer() const { return weight_; }

◆ Objf()

BaseFloat Objf ( ) const
virtual

Return the objective function associated with the stats [assuming ML estimation].

Implements Clusterable.

Definition at line 305 of file clusterable-classes.cc.

References KALDI_WARN, VectorClusterable::stats_, VectorClusterable::sumsq_, kaldi::VecVec(), and VectorClusterable::weight_.

Referenced by kaldi::TestClusterUtilsVector().

BaseFloat VectorClusterable::Objf() const {
  double direct_sumsq;
  // |stats_|^2 / weight_ is the weight times the squared norm of the mean.
  if (weight_ > std::numeric_limits<BaseFloat>::min()) {
    direct_sumsq = VecVec(stats_, stats_) / weight_;
  } else {
    direct_sumsq = 0.0;
  }
  // ans is a negated weighted sum of squared distances; it should not be
  // positive.
  double ans = -(sumsq_ - direct_sumsq);
  if (ans > 0.0) {
    // Small positive values are attributed to rounding and not reported.
    if (ans > 1.0) {
      KALDI_WARN << "Positive objective function encountered (treating as zero): "
                 << ans;
    }
    ans = 0.0;
  }
  return ans;
}
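
To make the formula concrete, here is a minimal self-contained sketch that accumulates the same three statistics with std::vector and evaluates the same expression. It does not use Kaldi: VecStats, Accumulate, and DemoObjf are hypothetical names, float stands in for BaseFloat (an assumption about the build configuration), and the warning is omitted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the three private members of VectorClusterable.
struct VecStats {
  double weight = 0.0;        // sum of weights
  std::vector<double> stats;  // weighted sum of vectors
  double sumsq = 0.0;         // weighted sum of squared norms

  explicit VecStats(std::size_t dim) : stats(dim, 0.0) {}

  // Accumulate one vector x with nonnegative weight w.
  void Accumulate(const std::vector<double> &x, double w) {
    assert(w >= 0.0 && x.size() == stats.size());
    weight += w;
    for (std::size_t i = 0; i < x.size(); ++i) {
      stats[i] += w * x[i];
      sumsq += w * x[i] * x[i];
    }
  }

  // -(sumsq - |stats|^2 / weight), clamped so it is never positive.
  double Objf() const {
    double direct_sumsq = 0.0;
    if (weight > std::numeric_limits<float>::min()) {
      double dot = 0.0;
      for (double s : stats) dot += s * s;
      direct_sumsq = dot / weight;
    }
    double ans = -(sumsq - direct_sumsq);
    return ans > 0.0 ? 0.0 : ans;
  }
};

// Two unit-weight points (0,0) and (2,0): the center is (1,0) and each
// point lies at squared distance 1 from it.
inline double DemoObjf() {
  VecStats c(2);
  c.Accumulate({0.0, 0.0}, 1.0);
  c.Accumulate({2.0, 0.0}, 1.0);
  return c.Objf();
}
```

DemoObjf() returns -2, matching the direct computation: two squared distances of 1, summed and negated.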

◆ Read()

void Read ( std::istream &  is,
bool  binary 
)
private

Definition at line 286 of file clusterable-classes.cc.

References kaldi::ExpectToken(), and kaldi::ReadBasicType().

Referenced by VectorClusterable::ReadNew().

void VectorClusterable::Read(std::istream &is, bool binary) {
  // Fields are read in the same order that Write() emits them.
  ExpectToken(is, binary, "VCL");  // magic string.
  ExpectToken(is, binary, "<Weight>");
  ReadBasicType(is, binary, &weight_);
  ExpectToken(is, binary, "<Sumsq>");
  ReadBasicType(is, binary, &sumsq_);
  ExpectToken(is, binary, "<Stats>");
  stats_.Read(is, binary);
}

◆ ReadNew()

Clusterable * ReadNew ( std::istream &  is,
bool  binary 
) const
virtual

Read data from a stream and return the corresponding object (const function; it's a class member because we need access to the vtable so generic code can read derived types).

Implements Clusterable.

Definition at line 280 of file clusterable-classes.cc.

References VectorClusterable::Read().

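Clusterable* VectorClusterable::ReadNew(std::istream &is, bool binary) const {
  // Heap-allocated object; the caller is responsible for deleting it.
  VectorClusterable *vc = new VectorClusterable();
  vc->Read(is, binary);
  return vc;
}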

◆ Scale()

void Scale ( BaseFloat  f)
virtual

Scale the stats by a positive number f [not mandatory to supply this].

Reimplemented from Clusterable.

Definition at line 263 of file clusterable-classes.cc.

References KALDI_ASSERT.

void VectorClusterable::Scale(BaseFloat f) {
  KALDI_ASSERT(f >= 0.0);
  // All three statistics are linear in the weights, so each scales by f.
  weight_ *= f;
  stats_.Scale(f);
  sumsq_ *= f;
}

◆ SetZero()

virtual void SetZero ( )
inline virtual

Set stats to empty.

Implements Clusterable.

Definition at line 131 of file clusterable-classes.h.

References VectorBase< Real >::Set(), VectorClusterable::stats_, VectorClusterable::sumsq_, and VectorClusterable::weight_.

virtual void SetZero() { weight_ = 0.0; sumsq_ = 0.0; stats_.Set(0.0); }

◆ Sub()

void Sub ( const Clusterable &  other_in )
virtual

Subtract other stats.

Implements Clusterable.

Definition at line 234 of file clusterable-classes.cc.

References KALDI_ASSERT, KALDI_WARN, Clusterable::Type(), and VectorClusterable::weight_.

void VectorClusterable::Sub(const Clusterable &other_in) {
  KALDI_ASSERT(other_in.Type() == "vector");
  const VectorClusterable *other =
      static_cast<const VectorClusterable*>(&other_in);
  weight_ -= other->weight_;
  sumsq_ -= other->sumsq_;
  stats_.AddVec(-1.0, other->stats_);
  if (weight_ < 0.0) {
    if (weight_ < -0.1 && weight_ < -0.0001 * fabs(other->weight_)) {
      // a negative weight may indicate an algorithmic error if it is
      // encountered.
      KALDI_WARN << "Negative weight encountered " << weight_;
    }
    weight_ = 0.0;
  }
  // An empty cluster carries no residual stats.
  if (weight_ == 0.0) {
    sumsq_ = 0.0;
    stats_.Set(0.0);
  }
}
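
The clamping logic distinguishes rounding noise from a genuinely negative weight. The following self-contained sketch reproduces that control flow in one dimension so the two cases can be exercised in isolation; Stats1D, SubClamped, and DemoRoundingSub are hypothetical names, and the function returns a flag where the listing logs a warning.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 1-D stand-in for (weight_, stats_, sumsq_).
struct Stats1D {
  double weight;
  double stat;
  double sumsq;
};

// Subtract b from a with the same clamping as the listing. Returns true
// in the case where the listing would have logged a warning.
inline bool SubClamped(Stats1D *a, const Stats1D &b) {
  a->weight -= b.weight;
  a->sumsq -= b.sumsq;
  a->stat -= b.stat;
  bool warn = false;
  if (a->weight < 0.0) {
    // Only a clearly negative weight, in absolute and relative terms,
    // is treated as suspicious.
    warn = a->weight < -0.1 && a->weight < -0.0001 * std::fabs(b.weight);
    a->weight = 0.0;
  }
  if (a->weight == 0.0) {  // an empty cluster keeps no residual stats
    a->sumsq = 0.0;
    a->stat = 0.0;
  }
  return warn;
}

// Removing a cluster whose weight is off by a tiny rounding error
// silently yields an empty cluster.
inline Stats1D DemoRoundingSub() {
  Stats1D a = {1.0, 2.0, 4.0};
  Stats1D b = {1.0 + 1e-9, 2.0, 4.0};
  SubClamped(&a, b);
  return a;
}
```

Subtracting a cluster with twice the weight of the target, by contrast, drives the weight to -1 and sets the flag, while still leaving the result zeroed.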

◆ Type()

virtual std::string Type ( ) const
inline virtual

Return a string that describes the inherited type.

Implements Clusterable.

Definition at line 128 of file clusterable-classes.h.

virtual std::string Type() const { return "vector"; }

◆ Write()

void Write ( std::ostream &  os,
bool  binary 
) const
virtual

Write data to stream.

Implements Clusterable.

Definition at line 270 of file clusterable-classes.cc.

References kaldi::WriteBasicType(), and kaldi::WriteToken().

void VectorClusterable::Write(std::ostream &os, bool binary) const {
  // Each field is preceded by a token that Read() checks for.
  WriteToken(os, binary, "VCL");  // magic string.
  WriteToken(os, binary, "<Weight>");
  WriteBasicType(os, binary, weight_);
  WriteToken(os, binary, "<Sumsq>");
  WriteBasicType(os, binary, sumsq_);
  WriteToken(os, binary, "<Stats>");
  stats_.Write(os, binary);
}
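
Write() and Read() rely on a fixed token order, so a reader can reject a stream as soon as an unexpected token appears. The sketch below illustrates that tagged layout with standard streams only. It is not Kaldi's on-disk format: the encoding of the vector (a length followed by the values) is a simplification invented for this example, and Stats, Expect, WriteStats, ReadStats, and RoundTrips are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the serialized members.
struct Stats {
  double weight = 0.0;
  double sumsq = 0.0;
  std::vector<double> stats;
};

// Read one whitespace-delimited token and check that it matches.
inline bool Expect(std::istream &is, const std::string &token) {
  std::string got;
  return static_cast<bool>(is >> got) && got == token;
}

// Text-mode writer: the same token order as the listing, with a
// simplified vector encoding (length, then values).
inline void WriteStats(std::ostream &os, const Stats &s) {
  // Enough digits for doubles to survive a text round trip.
  // Note: this changes the precision setting of the caller's stream.
  os << std::setprecision(std::numeric_limits<double>::max_digits10);
  os << "VCL <Weight> " << s.weight << " <Sumsq> " << s.sumsq
     << " <Stats> " << s.stats.size();
  for (double v : s.stats) os << ' ' << v;
  os << '\n';
}

// Returns false on the first missing or unexpected token.
inline bool ReadStats(std::istream &is, Stats *s) {
  std::size_t dim = 0;
  if (!Expect(is, "VCL")) return false;
  if (!Expect(is, "<Weight>") || !(is >> s->weight)) return false;
  if (!Expect(is, "<Sumsq>") || !(is >> s->sumsq)) return false;
  if (!Expect(is, "<Stats>") || !(is >> dim)) return false;
  s->stats.assign(dim, 0.0);
  for (double &v : s->stats) {
    if (!(is >> v)) return false;
  }
  return true;
}

// Write to an in-memory stream, read back, and compare field by field.
inline bool RoundTrips(const Stats &in) {
  std::stringstream ss;
  WriteStats(ss, in);
  Stats out;
  return ReadStats(ss, &out) && out.weight == in.weight &&
         out.sumsq == in.sumsq && out.stats == in.stats;
}
```

Leading with a magic string, as the listing does with "VCL", lets a reader fail fast when handed a stream that holds a different object type.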

Member Data Documentation

◆ stats_

Vector<double> stats_
private

◆ sumsq_

double sumsq_
private

◆ weight_

double weight_
private

The documentation for this class was generated from the files clusterable-classes.h and clusterable-classes.cc.