cluster-utils.cc File Reference
#include <functional>
#include <queue>
#include <vector>
#include "base/kaldi-math.h"
#include "util/stl-utils.h"
#include "tree/cluster-utils.h"
Include dependency graph for cluster-utils.cc:

Go to the source code of this file.

Classes

class  BottomUpClusterer
 
struct  CompBotClustElem
 
class  CompartmentalizedBottomUpClusterer
 
class  RefineClusterer
 
struct  RefineClusterer::point_info
 
class  TreeClusterer
 
struct  TreeClusterer::Node
 

Namespaces

 kaldi
 This code computes Goodness of Pronunciation (GOP) and extracts phone-level pronunciation feature for mispronunciations detection tasks, the reference:
 

Typedefs

typedef uint16 uint_smaller
 
typedef int16 int_smaller
 

Functions

BaseFloat SumClusterableObjf (const std::vector< Clusterable * > &vec)
 Returns the total objective function after adding up all the statistics in the vector (pointers may be NULL). More...
 
BaseFloat SumClusterableNormalizer (const std::vector< Clusterable * > &vec)
 Returns the total normalizer (usually count) of the cluster (pointers may be NULL). More...
 
Clusterable * SumClusterable (const std::vector< Clusterable * > &vec)
 Sums stats (ptrs may be NULL). Returns NULL if no non-NULL stats present. More...
 
void EnsureClusterableVectorNotNull (std::vector< Clusterable * > *stats)
 Fills in any (NULL) holes in "stats" vector, with empty stats, because certain algorithms require non-NULL stats. More...
 
void AddToClusters (const std::vector< Clusterable * > &stats, const std::vector< int32 > &assignments, std::vector< Clusterable * > *clusters)
 Given stats and a vector "assignments" of the same size (that maps to cluster indices), sums the stats up into "clusters." It will add to any stats already present in "clusters" (although typically "clusters" will be empty when called), and it will extend with NULL pointers for any unseen indices. More...
 
void AddToClustersOptimized (const std::vector< Clusterable * > &stats, const std::vector< int32 > &assignments, const Clusterable &total, std::vector< Clusterable * > *clusters)
 AddToClustersOptimized does the same as AddToClusters (it sums up the stats within each cluster, except it uses the sum of all the stats ("total") to optimize the computation for speed, if possible. More...
 
BaseFloat ClusterBottomUp (const std::vector< Clusterable * > &points, BaseFloat thresh, int32 min_clust, std::vector< Clusterable * > *clusters_out, std::vector< int32 > *assignments_out)
 A bottom-up clustering algorithm. More...
 
bool operator> (const CompBotClustElem &a, const CompBotClustElem &b)
 
BaseFloat ClusterBottomUpCompartmentalized (const std::vector< std::vector< Clusterable * > > &points, BaseFloat thresh, int32 min_clust, std::vector< std::vector< Clusterable * > > *clusters_out, std::vector< std::vector< int32 > > *assignments_out)
 This is a bottom-up clustering where the points are pre-clustered in a set of compartments, such that only points in the same compartment are clustered together. More...
 
BaseFloat RefineClusters (const std::vector< Clusterable * > &points, std::vector< Clusterable * > *clusters, std::vector< int32 > *assignments, RefineClustersOptions cfg=RefineClustersOptions())
 RefineClusters is mainly used internally by other clustering algorithms. More...
 
BaseFloat ClusterKMeansOnce (const std::vector< Clusterable *> &points, int32 num_clust, std::vector< Clusterable *> *clusters_out, std::vector< int32 > *assignments_out, ClusterKMeansOptions &cfg)
 ClusterKMeansOnce is called internally by ClusterKMeans; it is equivalent to calling ClusterKMeans with cfg.num_tries == 1. More...
 
BaseFloat ClusterKMeans (const std::vector< Clusterable * > &points, int32 num_clust, std::vector< Clusterable * > *clusters_out, std::vector< int32 > *assignments_out, ClusterKMeansOptions cfg=ClusterKMeansOptions())
 ClusterKMeans is a K-means-like clustering algorithm. More...
 
BaseFloat TreeCluster (const std::vector< Clusterable * > &points, int32 max_clust, std::vector< Clusterable * > *clusters_out, std::vector< int32 > *assignments_out, std::vector< int32 > *clust_assignments_out, int32 *num_leaves_out, TreeClusterOptions cfg=TreeClusterOptions())
 TreeCluster is a top-down clustering algorithm, using a binary tree (not necessarily balanced). More...
 
BaseFloat ClusterTopDown (const std::vector< Clusterable * > &points, int32 max_clust, std::vector< Clusterable * > *clusters_out, std::vector< int32 > *assignments_out, TreeClusterOptions cfg=TreeClusterOptions())
 A clustering algorithm that internally uses TreeCluster, but does not give you the information about the structure of the tree. More...