json-indexer is a TypeScript utility for efficient indexing of large JSON files. It allows you to parse files incrementally, minimizing memory usage while building a structured index for quick access to objects. This is particularly useful for scenarios where you need to work with massive JSON files containing arrays of objects.
- Efficient Parsing: Reads JSON files in chunks to handle large files without loading the entire content into memory.
- Customizable Indexing: Allows you to define additional keys to include in the index.
- Scalable: Suitable for large-scale data processing.
- Type-Safe: Leverages TypeScript for strong typing and compile-time safety.
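The chunked reading mentioned above can be sketched roughly as follows. This only illustrates the general technique with the standard Blob.slice API; it is not json-indexer's actual implementation, and the names chunkRanges and readChunks are hypothetical.

```typescript
// Hypothetical helper: split [0, totalSize) into half-open byte ranges
// of at most chunkSize bytes each.
function chunkRanges(totalSize: number, chunkSize: number): Array<[number, number]> {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  const ranges: Array<[number, number]> = [];
  for (let start = 0; start < totalSize; start += chunkSize) {
    ranges.push([start, Math.min(start + chunkSize, totalSize)]);
  }
  return ranges;
}

// Hypothetical reader: yields one chunk of raw bytes at a time, so only a
// single chunk has to be held in memory at any moment.
async function* readChunks(
  blob: Blob,
  chunkSize = 1024 * 1024
): AsyncGenerator<Uint8Array> {
  for (const [start, end] of chunkRanges(blob.size, chunkSize)) {
    yield new Uint8Array(await blob.slice(start, end).arrayBuffer());
  }
}
```

Working with raw bytes keeps offsets consistent with File.slice, which takes byte positions. A multi-byte UTF-8 character can straddle two chunks, so text decoding would need something like TextDecoder with the stream option rather than decoding each chunk in isolation.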
Install the package via npm:
npm install json-indexer
Suppose you have a large JSON file (data.json) with the following structure:
{
  "shoes": [
    { "id": "1", "name": "Nike Air", "size": 42, "color": "black" },
    { "id": "2", "name": "Adidas Boost", "size": 43, "color": "white" },
    ...
  ]
}
You can use json-indexer to parse and index the shoes array like this:
import { JsonIndexer } from 'json-indexer';

// Your data type
interface Shoe {
  id: string;
  name: string;
  size: number;
  color: string;
}

// The resulting indexed data type
interface ShoeMetadata {
  // id, filePosition, and length are required
  id: string;
  filePosition: number;
  length: number;
  // Extra keys that should be added to the index
  name: string;
  size: number;
}

// Assume `file` is a File object representing your JSON file
const file = new File([/* file content */], "data.json", {
  type: "application/json"
});

// Create an instance of JsonIndexer
const indexer = new JsonIndexer(file);

// Build the index with additional properties
const shoeIndex = await indexer.index<ShoeMetadata>("shoes", ["name", "size"]);

/**
 * Output:
 * Map {
 *   "1" => {
 *     id: "1",
 *     filePosition: 123,
 *     length: 456,
 *     name: "Nike Air",
 *     size: 42,
 *   },
 *   "2" => { ... }
 * }
 **/

// Subsequent lookups: read only the bytes of the requested record
const metadata = shoeIndex.get('1');
if (metadata) {
  const chunk = file.slice(
    metadata.filePosition,
    metadata.filePosition + metadata.length
  );
  const record = JSON.parse(await chunk.text()) as Shoe;
}
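If you perform many lookups, the slice-and-parse step can be wrapped in a small helper. The sketch below is not part of json-indexer; loadRecord and IndexEntry are hypothetical names, and it assumes filePosition and length are byte offsets into the file, as the lookup example suggests.

```typescript
// Minimal shape every index entry is documented to have.
interface IndexEntry {
  id: string;
  filePosition: number;
  length: number;
}

// Hypothetical helper: look up an id in the index, slice the matching
// bytes out of the file, and parse only that slice as JSON.
// Resolves to undefined when the id is not in the index.
async function loadRecord<R>(
  file: Blob,
  index: ReadonlyMap<string, IndexEntry>,
  id: string
): Promise<R | undefined> {
  const entry = index.get(id);
  if (!entry) return undefined;
  const chunk = file.slice(entry.filePosition, entry.filePosition + entry.length);
  return JSON.parse(await chunk.text()) as R;
}
```

With the example above, this could be called as loadRecord<Shoe>(file, shoeIndex, "1"). The cast to R is unchecked, so validate the parsed object if the file contents are not trusted.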
A class for indexing JSON files.
constructor(file: File, chunkSize = 1024 * 1024)
- file (File): The JSON file to index.
- chunkSize (number, optional): Size of each chunk read from the file (default: 1 MB).
index<T>
async index<T extends { id: string, filePosition: number, length: number }>(
key: string,
additionalIndexKeys: Array<RequiredAdditionalKeys<T>> = []
): Promise<Map<string, T>>
- Generic type T must extend the base type containing id, filePosition, and length.
- key (string): The key of the array to index (e.g., "shoes").
- additionalIndexKeys (Array<RequiredAdditionalKeys<T>>): Keys to include in the index, beyond the base requirements.
- Returns a Promise resolving to a Map where the keys are the id values of the indexed objects, and the values are the indexed objects with metadata.
- Memory Efficient: Processes the file in chunks, avoiding high memory usage.
- Incremental Parsing: Supports working with large files incrementally.
- Customizable Metadata: Add additional fields to the index for detailed object representation.
- Flexible Type System: Generic type parameters at the method level for improved type safety and reusability.
If you forget to include all required keys in additionalIndexKeys, the index() method will throw an error:
// This will throw an error because 'name' is required by the ShoeMetadata type
const index = await indexer.index<ShoeMetadata>("shoes", []);
// Error: Missing keys in additionalIndexKeys: name
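One way to reduce the chance of this mismatch in your own code is to keep the extra keys in a single constant and derive the metadata type from it. This is a caller-side pattern, not something json-indexer provides; BaseEntry, extraKeys, and DerivedShoeMetadata are hypothetical names, and whether index() accepts a type derived this way is an assumption that has not been verified against the library.

```typescript
// Base fields every index entry is documented to carry.
interface BaseEntry {
  id: string;
  filePosition: number;
  length: number;
}

interface Shoe {
  id: string;
  name: string;
  size: number;
  color: string;
}

// Single source of truth for the extra keys to index.
const extraKeys = ["name", "size"] as const;
type ExtraKey = (typeof extraKeys)[number]; // "name" | "size"

// Metadata type derived from the key list, so the type and the list
// cannot drift apart.
type DerivedShoeMetadata = BaseEntry & Pick<Shoe, ExtraKey>;

// Example value that type-checks against the derived type.
const example: DerivedShoeMetadata = {
  id: "1",
  filePosition: 0,
  length: 0,
  name: "Nike Air",
  size: 42,
};
```

Under that assumption, the same constant would be passed as the second argument, for example as [...extraKeys], so adding a key in one place updates both the type and the call.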
This project is licensed under the MIT License.