HNSW-Lite is a lightweight implementation of the Hierarchical Navigable Small World (HNSW) algorithm. This algorithm is commonly used for approximate nearest neighbor (ANN) searches in high-dimensional spaces. It organizes nodes into multiple layers, where higher layers enable faster traversal and lower layers provide precise neighbor connections. HNSW is widely used for tasks like similarity search, recommendation systems, and clustering.
This implementation is written in TypeScript and is designed to be simple and customizable.
- Support for adding vectors with IDs.
- Multi-layer graph construction with customizable maximum layers and edges per node.
- Support for Euclidean distance and Cosine similarity.
- Randomized geometric distribution to assign node layers.
- Flexible and lightweight codebase.
Make sure you have Node.js and npm installed on your system.
-
Clone the repository:
git clone <repository-url> cd hnsw_lite
-
Install the required dependencies:
npm install
The project includes implementations for Euclidean distance and Cosine similarity. If no parameter is specified, Euclidean Distance will be used:
import { HNSW } from './hnsw';
// Create an HNSW graph with 3 layers and max 8 edges per node
const euclidean_hnsw = new HNSW(3, 8, 'euclideanDistance');
// Create an HNSW graph with 4 layers and max 5 edges per node
const cosine_hnsw = new HNSW(4,5, 'cosineSimilarity')
To add vectors to the HNSW graph, call the add
method with a unique ID and the vector you want to add. All vectors must be the same length:
hnsw.add('node-1', [0.1, 0.2, 0.3, 0.4]);
hnsw.add('node-2', [0.5, 0.6, 0.7, 0.8]);
hnsw.add('node-3', [0.2, 0.3, 0.4, 0.5]);
To add vectors to the HNSW graph in bulk, call the addBulk
method:
// Add multiple nodes in bulk using an array of tuples
hnsw.addBulk([
['node-1', [0.1, 0.2, 0.3, 0.4]],
['node-2', [0.5, 0.6, 0.7, 0.8]],
['node-3', [0.2, 0.3, 0.4, 0.5]],
]);
To remove vectors from the HNSW graph, call the remove method with the unique ID of the vector you want to remove. When a node is removed, the graph will attempt to maintain the maxEdges per node value specified during initialization. However, if there are not enough nodes in a given layer, the number of edges for some nodes may fall below the maxEdges constraint.
hnsw.remove('node-1')
To find the nearest neighbor(s) for a given query vector, use the query
method. By default, the search will return one result. However, you can optionally specify a value between 1 and maxEdges (inclusive) to retrieve more neighbors.
As this is a lightweight implementation, the search will attempt to find the nearest node to the query vector on the final layer of the graph. It will then return the nearest neighbors to that specific node, not directly to the query vector itself:
const result = hnsw.query([0.15, 0.25, 0.35, 0.45], 2); // Find 2 nearest neighbors
console.log('Nearest neighbors:', result); // output (string[]): ['node-1', 'node-3']
To save the HNSW graph to a JSON format and rebuild it later, use the toJSON and fromJSON methods:
const graphJSON = hnsw.toJSON(); // Convert the graph to JSON
fs.writeFileSync('hnsw_graph.json', JSON.stringify(graphJSON));
By default, when the graph is rebuilt it will use precalculated values for the distance metric (from the previous build):
const graphData = JSON.parse(fs.readFileSync('hnsw_graph.json', 'utf8'));
const restoredHnsw = HNSW.fromJSON(graphData); // Rebuild the graph from JSON
Here is an example of adding nodes and querying:
Adding node with ID: node-1, Layer: 0, Vector: [0.1, 0.2, 0.3, 0.4]
Adding node with ID: node-2, Layer: 0, Vector: [0.5, 0.6, 0.7, 0.8]
Adding node with ID: node-3, Layer: 1, Vector: [0.2, 0.3, 0.4, 0.5]
Nearest neighbors: [ 'node-1', 'node-3' ]
-
maxLayers
: Number of layers in the HNSW graph. Higher values improve search efficiency but increase memory usage. -
maxEdges
: Maximum edges per node. Higher values increase connectivity but also computation time.
Example:
const hnsw = new HNSW(4, 10); // 4 layers, max 10 edges per node
- Input vectors must have values between 0 and 1 (inclusive), as enforced by the
add
method. - This implementation is designed for educational and experimental use. It may not scale to extremely large datasets.
Feel free to fork the repository and submit pull requests. Suggestions and feature requests are welcome!
This project is licensed under the MIT License. See the LICENSE
file for more details.
- Based on the Hierarchical Navigable Small World (HNSW) paper by Yu. A. Malkov and D. A. Yashunin.
- Inspired by popular ANN libraries like FAISS and Hnswlib.