Unsupervised-KNN-JS
Node.js package for computing the k nearest neighbors to an input vector using distance calculations. Computations are implemented in Rust for high performance and parallelism.
Features
- Parallelized distance computations
- Fast native system processing
- 14 popular distance functions
- Out of the box support on Linux, OSX, and Windows
- Support for Node 8, 10, 12, and 13
Install
$ npm i unsupervised-knn-js
Import
const knn =
Example
> const knn =
> const neighbors = [
    { label: 'some name', vector: [1, 2, 4, 5] },
    { label: 'name 2', vector: [14, 4, 13, 2] },
    { label: 'another name', vector: [4, 4, 4, 5] }
  ]
> const target = [1, 2, 3, 4]
> const algo = 'euclidean'
> const k = 2
>
[ { label: 'some name', distance: 1.4142135623730951 },
  { label: 'another name', distance: 3.872983346207417 } ]
>
Usage
Parameters
The knn function takes 4 parameters:
- Algorithm (String)
  - The algorithm used to compute the distance between the target and each neighbor
  - The algorithms currently supported natively are:
      'euclidean'  // L2 Norm Difference
      'cosine'     // Cosine Distance
      'mae'        // Mean-Absolute-Error
      'mse'        // Mean-Squared-Error
      'manhattan'  // Sum of Absolute Differences
      'ssd'        // Sum of Squared Differences
      'canberra'   // Weighted Manhattan Distance
      'hamming'    // Sum of Binary Differences
      'L3'         // L3 Norm Difference
      'L4'         // L4 Norm Difference
      'L5'         // L5 Norm Difference
      'L10'        // L10 Norm Difference
      'chebyshev'  // L-Infinity Norm Difference
      'pearson'    // Pearson Correlation Distance
- K-Value
  - The number of closest neighbors to the target point to return
  - For example, if k = 2, the 2 closest neighbors to the target vector are returned.
- Neighbors
  - An array of objects, where each object represents a neighbor (a point)
  - Each object should have a label field and a vector field, like so:
      { label: 'name or id', vector: [1, 3, 45, -4] }
  - The following is a valid array of neighbors:
      const neighbors = [
        { label: 'some name', vector: [1, 2, 4, 5] },
        { label: 'name 2', vector: [14, 4, 13, 2] },
        { label: 'another name', vector: [4, 4, 4, 5] }
      ]
- Target
  - The vector whose closest (most similar) points are to be found
  - This should be an array of numbers
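To make the four parameters concrete, here is a minimal plain-JavaScript sketch of a brute-force k-nearest-neighbors search. It is not this package's Rust implementation and does not call the package. `knnSketch`, `euclidean`, and `distanceFns` are hypothetical names, only one algorithm is included, and the argument order is an assumption that simply follows the order of the parameter list above.

```javascript
// Hypothetical helper: L2 (Euclidean) distance between two equal-length vectors.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Illustrative lookup table; a fuller sketch would register more algorithms here.
const distanceFns = { euclidean };

// Brute-force search: score every neighbor, sort ascending, keep the first k.
// The argument order (algo, k, neighbors, target) is an assumption of this sketch.
function knnSketch(algo, k, neighbors, target) {
  const distance = distanceFns[algo];
  if (!distance) {
    throw new Error(`unsupported algorithm: ${algo}`);
  }
  return neighbors
    .map(({ label, vector }) => {
      if (vector.length !== target.length) {
        throw new Error(`vector length mismatch for '${label}'`);
      }
      return { label, distance: distance(vector, target) };
    })
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);
}

const sampleNeighbors = [
  { label: 'some name', vector: [1, 2, 4, 5] },
  { label: 'name 2', vector: [14, 4, 13, 2] },
  { label: 'another name', vector: [4, 4, 4, 5] },
];
const sampleTarget = [1, 2, 3, 4];

const sketchResult = knnSketch('euclidean', 2, sampleNeighbors, sampleTarget);
console.log(sketchResult);
```

With the sample data, the sketch returns 'some name' first and 'another name' second, matching the order shown in the Example section.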
Return
The function returns an array of objects representing the closest points to the target.
Each object has a label field for identification and a distance field, which represents its distance from the target.
[ { label: 'some name', distance: 1.4142135623730951 },
  { label: 'another name', distance: 3.872983346207417 } ]
The array is sorted in ascending order by the distance field of each object.
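Because the result is sorted by distance, common follow-up steps are short. The sketch below works on a literal copy of the sample result from this README rather than calling the package; `maxDistance` is a hypothetical cutoff chosen only for illustration.

```javascript
// Sample result copied from the README's Euclidean example.
const result = [
  { label: 'some name', distance: 1.4142135623730951 },
  { label: 'another name', distance: 3.872983346207417 },
];

// Ascending order means the first entry is the nearest neighbor.
const nearest = result[0];

// Labels in order of increasing distance.
const labels = result.map((r) => r.label);

// Keep only neighbors within a hypothetical cutoff distance.
const maxDistance = 2;
const withinCutoff = result.filter((r) => r.distance <= maxDistance);

// Sanity check that the ordering guarantee holds for this array.
const isAscending = result.every(
  (r, i) => i === 0 || result[i - 1].distance <= r.distance
);

console.log(nearest.label, labels, withinCutoff.length, isAscending);
```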
Distance Comparisons
Here is the same data run against each of the distance functions:
> const knn =
> const neighbors = [
    { label: 'some name', vector: [1, 2, 4, 5] },
    { label: 'another name', vector: [4, 4, 4, 5] },
    { label: 'name 3', vector: [14, 4, 13, 2] }
  ]
> const target = [1, 2, 3, 4]

> // Euclidean
>
[ { label: 'some name', distance: 1.4142135623730951 },
  { label: 'another name', distance: 3.872983346207417 },
  { label: 'name 3', distance: 16.64331697709324 } ]

> // Cosine
>
[ { label: 'some name', distance: 0.003993481192393733 },
  { label: 'another name', distance: 0.059777545024485734 },
  { label: 'name 3', distance: 0.35796589482505503 } ]

> // Mean-Absolute-Error
>
[ { label: 'some name', distance: 0.5 },
  { label: 'another name', distance: 1.75 },
  { label: 'name 3', distance: 6.75 } ]

> // Mean-Squared-Error
>
[ { label: 'some name', distance: 0.5 },
  { label: 'another name', distance: 3.75 },
  { label: 'name 3', distance: 69.25 } ]

> // Manhattan
>
[ { label: 'some name', distance: 2 },
  { label: 'another name', distance: 7 },
  { label: 'name 3', distance: 27 } ]

> // Sum of Squared Differences
>
[ { label: 'some name', distance: 2 },
  { label: 'another name', distance: 15 },
  { label: 'name 3', distance: 277 } ]

> // Canberra
>
[ { label: 'some name', distance: 0.25396825396825395 },
  { label: 'another name', distance: 1.1873015873015873 },
  { label: 'name 3', distance: 2.158333333333333 } ]

> // Hamming
>
[ { label: 'some name', distance: 2 },
  { label: 'another name', distance: 4 },
  { label: 'name 3', distance: 4 } ]

> // L3 Norm Difference
>
[ { label: 'some name', distance: 1.2599210498948732 },
  { label: 'another name', distance: 3.332221851645953 },
  { label: 'name 3', distance: 14.756054203376182 } ]

> // L4 Norm Difference
>
[ { label: 'some name', distance: 1.189207115002721 },
  { label: 'another name', distance: 3.1543421455299043 },
  { label: 'name 3', distance: 14.016098305349052 } ]

> // L5 Norm Difference
>
[ { label: 'some name', distance: 1.148698354997035 },
  { label: 'another name', distance: 3.0796116495812957 },
  { label: 'name 3', distance: 13.635466232760923 } ]

> // L10 Norm Difference
>
[ { label: 'some name', distance: 1.0717734625362931 },
  { label: 'another name', distance: 3.0051723058500506 },
  { label: 'name 3', distance: 13.091355843137347 } ]

> // Chebyshev
>
[ { label: 'some name', distance: 1 },
  { label: 'another name', distance: 3 },
  { label: 'name 3', distance: 13 } ]

> // Pearson Correlation Distance
>
[ { label: 'some name', distance: 0.010050506338833642 },
  { label: 'another name', distance: 0.2254033307585166 },
  { label: 'name 3', distance: 1.5685785754425927 } ]
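For readers who want to see how these numbers arise, the following plain-JavaScript sketch uses the standard textbook definitions of several of the listed distances. It is an illustration only, not the package's Rust code. Every function name is hypothetical, and the handling of edge cases (for example a 0/0 term in Canberra, or a zero vector in cosine) is an assumption that may differ from the package.

```javascript
// Sum f(x, y) over paired elements of two equal-length vectors.
const sumBy = (a, b, f) => a.reduce((acc, x, i) => acc + f(x, b[i]), 0);

// Minkowski (Lp) distance: p = 1 gives Manhattan, p = 2 gives Euclidean.
const minkowski = (p) => (a, b) =>
  Math.pow(sumBy(a, b, (x, y) => Math.pow(Math.abs(x - y), p)), 1 / p);

const manhattan = (a, b) => sumBy(a, b, (x, y) => Math.abs(x - y));
const ssd = (a, b) => sumBy(a, b, (x, y) => (x - y) ** 2);
const mae = (a, b) => manhattan(a, b) / a.length;
const mse = (a, b) => ssd(a, b) / a.length;

// Chebyshev is the p -> infinity limit: the largest coordinate difference.
const chebyshev = (a, b) => Math.max(...a.map((x, i) => Math.abs(x - b[i])));

// Hamming: number of positions where the two vectors differ.
const hamming = (a, b) => sumBy(a, b, (x, y) => (x === y ? 0 : 1));

// Canberra: sum of |x - y| / (|x| + |y|); a 0/0 term counts as 0 (assumption).
const canberra = (a, b) =>
  sumBy(a, b, (x, y) => {
    const denom = Math.abs(x) + Math.abs(y);
    return denom === 0 ? 0 : Math.abs(x - y) / denom;
  });

// Cosine distance: 1 - cosine similarity (NaN if either vector is all zeros).
const cosine = (a, b) => {
  const dot = sumBy(a, b, (x, y) => x * y);
  const normA = Math.sqrt(sumBy(a, a, (x, y) => x * y));
  const normB = Math.sqrt(sumBy(b, b, (x, y) => x * y));
  return 1 - dot / (normA * normB);
};

// Pearson correlation distance: cosine distance of the mean-centered vectors.
const pearson = (a, b) => {
  const mean = (v) => v.reduce((s, x) => s + x, 0) / v.length;
  const ma = mean(a);
  const mb = mean(b);
  return cosine(a.map((x) => x - ma), b.map((x) => x - mb));
};

const target = [1, 2, 3, 4];
const someName = [1, 2, 4, 5];
const name3 = [14, 4, 13, 2];

console.log('euclidean', minkowski(2)(someName, target));
console.log('L3       ', minkowski(3)(someName, target));
console.log('manhattan', manhattan(name3, target));
console.log('chebyshev', chebyshev(name3, target));
console.log('canberra ', canberra(someName, target));
console.log('pearson  ', pearson(name3, target));
```

Under these definitions, MAE and MSE are Manhattan and SSD divided by the vector length, which is why their values in the comparison are exactly one quarter of the Manhattan and SSD values for these 4-element vectors.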
Future Features
- Even more native distance functions
- Potential implementation of custom distance functions passed in by the user
Ideas and suggestions are welcome!
Changes
For changes, please see the Changelog.