Clusterfck
A js cluster analysis library. Includes Hierarchical (agglomerative) clustering and K-means clustering. Demo here.
Install
With Node.js:

    npm install tayden-clusterfck

With Bower:

    bower install tayden-clusterfck

Or grab the browser file.
K-means
    var clusterfck = require("tayden-clusterfck");

    var colors = [
      [20, 20, 80],
      [22, 22, 90],
      [250, 255, 253],
      [0, 30, 70],
      [200, 0, 23],
      [100, 54, 100],
      [255, 13, 8]
    ];

    // Calculate clusters.
    var clusters = clusterfck.kmeans(colors, 3);
The second argument to kmeans is the number of clusters you want (the default is Math.sqrt(n/2), where n is the number of vectors). It returns an array of clusters; for this example:
    [
      [[200, 0, 23], [255, 13, 8]],
      [[20, 20, 80], [22, 22, 90], [0, 30, 70], [100, 54, 100]],
      [[250, 255, 253]]
    ]
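As a rough illustration of the default mentioned above, the sketch below derives a cluster count from the number of vectors with the sqrt(n/2) rule of thumb. The function name defaultK is hypothetical, and rounding up to a whole number is an assumption of this sketch; the text above only states the Math.sqrt(n/2) formula.

```javascript
// Rule-of-thumb cluster count: sqrt(n / 2), where n is the number of vectors.
// Rounding up to an integer is an assumption of this sketch, not documented behavior.
function defaultK(n) {
  return Math.ceil(Math.sqrt(n / 2));
}

console.log(defaultK(7));   // sqrt(3.5) is about 1.87, so this logs 2
console.log(defaultK(200)); // sqrt(100) is 10, so this logs 10
```

For the seven colors in the example, the rule suggests two clusters, which is why the example passes 3 explicitly.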
Classification
For classification, instantiate a new Kmeans() object.
    var kmeans = new clusterfck.Kmeans();

    // Calculate clusters.
    var clusters = kmeans.cluster(colors, 3);

    // Calculate cluster index for a new data point.
    var clusterIndex = kmeans.classify([0, 0, 225]);
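Conceptually, classifying a new point against a set of centroids means finding the closest centroid. The following is a minimal, library-independent sketch of that idea, assuming Euclidean distance. The helper names euclidean and nearestCentroidIndex are hypothetical and are not part of the clusterfck API.

```javascript
// Euclidean distance between two vectors of equal length.
function euclidean(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) * (a[i] - b[i]);
  }
  return Math.sqrt(sum);
}

// Return the index of the centroid closest to `point`.
function nearestCentroidIndex(centroids, point) {
  var best = 0;
  for (var i = 1; i < centroids.length; i++) {
    if (euclidean(centroids[i], point) < euclidean(centroids[best], point)) {
      best = i;
    }
  }
  return best;
}

// Example centroids for three color clusters (illustrative values).
var exampleCentroids = [[35.5, 31.5, 85], [250, 255, 253], [227.5, 6.5, 15.5]];

console.log(nearestCentroidIndex(exampleCentroids, [30, 30, 90]));  // 0
console.log(nearestCentroidIndex(exampleCentroids, [240, 10, 10])); // 2
```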
Serialization
The toJSON() and fromJSON() methods are available for serialization.
    // Serialize centroids to JSON.
    var json = kmeans.toJSON();

    // Deserialize centroids from JSON.
    kmeans = kmeans.fromJSON(json);

    // Calculate cluster index from a previously serialized set of centroids.
    var clusterIndex = kmeans.classify([0, 0, 225]);
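The idea behind serialization is that the centroids alone are enough to classify new points later. The sketch below round-trips a plain array of centroids through a JSON string using only the standard JSON object. The function names are hypothetical, and the exact format that toJSON() produces is not specified here, so treat the string layout as an assumption.

```javascript
// Serialize an array of centroids to a JSON string (assumed layout: a plain nested array).
function centroidsToJSON(centroids) {
  return JSON.stringify(centroids);
}

// Restore the array of centroids from a JSON string.
function centroidsFromJSON(json) {
  return JSON.parse(json);
}

var saved = centroidsToJSON([[35.5, 31.5, 85], [250, 255, 253], [227.5, 6.5, 15.5]]);
var restored = centroidsFromJSON(saved);

console.log(restored.length); // 3 centroids survive the round trip
```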
Initializing with Existing Centroids
    // Take existing centroids, perhaps from a database.
    var centroids = [[35.5, 31.5, 85], [250, 255, 253], [227.5, 6.5, 15.5]];

    // Initialize constructor with centroids.
    var kmeans = new clusterfck.Kmeans(centroids);

    // Calculate cluster index.
    var clusterIndex = kmeans.classify([0, 0, 225]);
Accessing Centroids and K value
After clustering or loading via fromJSON(), the calculated centers are accessible via the centroids property. Similarly, the K-value can be derived via centroids.length.
    // Calculate clusters.
    var clusters = kmeans.cluster(colors, 3);

    // Access centroids, an array of length 3.
    var centroids = kmeans.centroids;

    // Access k-value.
    var k = centroids.length;
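In k-means, each centroid is the component-wise mean of the vectors assigned to its cluster. The sketch below computes that mean for one of the color clusters from the K-means example. The helper name meanVector is hypothetical and not part of the library.

```javascript
// Component-wise mean of a non-empty list of equal-length vectors.
function meanVector(vectors) {
  var dims = vectors[0].length;
  var mean = [];
  for (var d = 0; d < dims; d++) {
    var sum = 0;
    for (var i = 0; i < vectors.length; i++) {
      sum += vectors[i][d];
    }
    mean.push(sum / vectors.length);
  }
  return mean;
}

// The four dark blue/purple colors that end up in one cluster in the example.
var exampleCluster = [[20, 20, 80], [22, 22, 90], [0, 30, 70], [100, 54, 100]];

console.log(meanVector(exampleCluster)); // components 35.5, 31.5 and 85
```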
Hierarchical
    var clusterfck = require("tayden-clusterfck");

    var colors = [
      [20, 20, 80],
      [22, 22, 90],
      [250, 255, 253],
      [100, 54, 255]
    ];

    var clusters = clusterfck.hcluster(colors);
hcluster returns an object with keys tree and clusters. tree contains the hierarchy of the clusters, with left and right subtrees. The leaf clusters have a value property, which is the vector from the data set. The clusters property is a function that, when passed an integer n, returns a list of values corresponding to n clusters, determined by splitting the furthest nodes in the tree structure. The resulting list contains one list per cluster, each of which contains the values from the input.
    // clusters.tree
    {
      "left": {
        "left": {
          "left": { "value": [22, 22, 90] },
          "right": { "value": [20, 20, 80] }
        },
        "right": { "value": [100, 54, 255] }
      },
      "right": { "value": [250, 255, 253] }
    }

    // clusters.clusters(3)
    [
      [[250, 255, 253]],
      [[22, 22, 90], [20, 20, 80]],
      [[100, 54, 255]]
    ]
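Because every leaf carries a value and every inner node carries left and right subtrees, the tree can be walked recursively. The sketch below collects all leaf vectors from a tree shaped like the one in the example. The function name collectLeaves is hypothetical, and the sketch assumes that only leaves have a value property.

```javascript
// Collect the leaf vectors of a cluster tree, left subtree first.
// Assumes leaves have a `value` property and inner nodes have `left` and `right`.
function collectLeaves(node) {
  if (node.value !== undefined) {
    return [node.value];
  }
  return collectLeaves(node.left).concat(collectLeaves(node.right));
}

// Same shape as the tree for the four example colors.
var exampleTree = {
  left: {
    left: {
      left: { value: [22, 22, 90] },
      right: { value: [20, 20, 80] }
    },
    right: { value: [100, 54, 255] }
  },
  right: { value: [250, 255, 253] }
};

console.log(collectLeaves(exampleTree).length); // 4, one leaf per input vector
```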
Distance metric and linkage
Specify the distance metric as the second argument, one of "euclidean" (default), "manhattan", and "max". The linkage criterion is the third argument, one of "average" (default), "single", and "complete".
    var tree = clusterfck.hcluster(colors, "euclidean", "single");
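To make the options concrete, here is a minimal, library-independent sketch of the three distance metrics and the three linkage criteria, using their standard definitions. The names distances and linkageDistance are hypothetical. Treating "max" as the largest per-coordinate difference (Chebyshev distance) is an assumption, and the library's own implementation may differ in detail.

```javascript
// Standard distance metrics between two equal-length vectors.
var distances = {
  euclidean: function (a, b) {
    var sum = 0;
    for (var i = 0; i < a.length; i++) sum += Math.pow(a[i] - b[i], 2);
    return Math.sqrt(sum);
  },
  manhattan: function (a, b) {
    var sum = 0;
    for (var i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
    return sum;
  },
  // Assumed meaning of "max": largest absolute difference in any coordinate.
  max: function (a, b) {
    var largest = 0;
    for (var i = 0; i < a.length; i++) largest = Math.max(largest, Math.abs(a[i] - b[i]));
    return largest;
  }
};

// Distance between two clusters under a linkage criterion.
function linkageDistance(clusterA, clusterB, distance, linkage) {
  var all = [];
  clusterA.forEach(function (a) {
    clusterB.forEach(function (b) {
      all.push(distance(a, b));
    });
  });
  if (linkage === "single") return Math.min.apply(null, all);   // closest pair
  if (linkage === "complete") return Math.max.apply(null, all); // furthest pair
  // "average": mean of all pairwise distances
  return all.reduce(function (s, d) { return s + d; }, 0) / all.length;
}

var groupA = [[20, 20, 80], [22, 22, 90]];
var groupB = [[100, 54, 255]];

console.log(linkageDistance(groupA, groupB, distances.manhattan, "single"));   // 275
console.log(linkageDistance(groupA, groupB, distances.manhattan, "complete")); // 289
console.log(linkageDistance(groupA, groupB, distances.manhattan, "average"));  // 282
```

Single linkage tends to chain nearby clusters together, while complete linkage favors compact clusters; average linkage sits between the two.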