json-dedupe
Installation
To use this module as a CLI, run:
npm i -g json-dedupe
To use it as an npm module:
npm i json-dedupe
CLI Usage
The CLI has a built-in help command that lists the latest options. Run:
$ json-dedupe -h
Usage: json-dedupe [options]
Options:
  -V, --version     output the version number
  -q --quiet        Toggle quiet mode
  -p --perf         Toggle performance output
  -d --data [data]  Point to the data to dedupe, either a json file or folder of json files
  -h, --help        output usage information
The help output is always up to date and should be considered the source of truth for options.
Output
Whether you point this tool at a folder or a single file, the output contains:
- What was removed and the collision key
- What remains
The output is formatted to match the initial input.
Module Usage
We've also included an export for node:
const dedupe = const data =
Documentation for Module
- removeDuplicateByKey(valueMap, duplicateSet, uniqueKey, staleKey, value)
A helper to create a Map indexed on a supposed unique key, and add all objects that violate that unique key to a provided Set of values. This is an in-place transform.
- dedupe(values, keys, staleKey) ⇒ Object
A helper to find all duplicates in an array based on _id and email. Information about end-state and what was removed is also logged out in the process. Expected format: { _id: String, email: String, entryDate: String }
removeDuplicateByKey(valueMap, duplicateSet, uniqueKey, staleKey, value)
A helper to create a Map indexed on a supposed unique key, and add all objects that violate that unique key to a provided Set of values. This is an in-place transform.
Kind: global function
Param | Type | Description |
---|---|---|
valueMap | Map | the Map for all of your unique values |
duplicateSet | Set | the Set for any found duplicates |
uniqueKey | String | the key to find as a duplicate |
staleKey | String | the key to break a tie if a duplicate is found |
value | Object | the value to check |
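To illustrate the in-place behavior described above, here is a minimal sketch. It is not the package's actual source: the name removeDuplicateByKeySketch is hypothetical, and it assumes that on a collision the entry with the later staleKey date is kept while the other is added to the duplicate Set. The sample records are invented for illustration.

```javascript
// Sketch only, not the package's implementation.
// Assumption: on a key collision, the entry with the later `staleKey`
// date wins and the other entry is recorded as a duplicate.
function removeDuplicateByKeySketch(valueMap, duplicateSet, uniqueKey, staleKey, value) {
  const id = value[uniqueKey];
  const existing = valueMap.get(id);

  // First time this key is seen: store the value and stop.
  if (existing === undefined) {
    valueMap.set(id, value);
    return;
  }

  // Collision: break the tie on the stale key (parsed as a date).
  if (new Date(value[staleKey]) >= new Date(existing[staleKey])) {
    duplicateSet.add(existing);
    valueMap.set(id, value);
  } else {
    duplicateSet.add(value);
  }
}

// Hypothetical sample data in the expected format.
const valueMap = new Map();
const duplicateSet = new Set();
const records = [
  { _id: "a1", email: "x@example.com", entryDate: "2014-05-07T17:30:20+00:00" },
  { _id: "a1", email: "y@example.com", entryDate: "2014-05-07T17:32:20+00:00" },
];

for (const record of records) {
  removeDuplicateByKeySketch(valueMap, duplicateSet, "_id", "entryDate", record);
}

console.log([...valueMap.values()]); // the entries that remain
console.log([...duplicateSet]);      // the entries that were displaced
```

Both collections are mutated by the call, so the caller creates the Map and Set once and reuses them for every value.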
dedupe(values, keys, staleKey) ⇒ Object
A helper to find all duplicates in an array based on _id and email. Information about end-state and what was removed is also logged out in the process.
Expected format:
{ _id: String, email: String, entryDate: String }
Kind: global function
Returns: Object - The deduped array and a map of the duplicates found by the key collision
Param | Type | Description |
---|---|---|
values | Array | An array of objects that match the expected object format |
keys | Array | An array of keys you want to dedupe on, sorted by order of execution |
staleKey | String | The key for determining which entry is stale |
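The multi-key pass described above could look roughly like the sketch below. It is an illustration under assumptions rather than the module's real code: dedupeSketch and the returned field names deduped and duplicates are hypothetical, the entry with the later staleKey date is assumed to win a collision, and logging is omitted. The sample entries are invented.

```javascript
// Sketch only, not the package's implementation.
// Assumptions: keys are applied in array order, the entry with the later
// `staleKey` date survives a collision, and the result shape
// ({ deduped, duplicates }) is hypothetical.
function dedupeSketch(values, keys, staleKey) {
  let remaining = values;
  const duplicates = new Map(); // key name -> entries removed for that key

  for (const key of keys) {
    const unique = new Map();
    const removed = [];

    for (const value of remaining) {
      const existing = unique.get(value[key]);
      if (existing === undefined) {
        unique.set(value[key], value);
      } else if (new Date(value[staleKey]) >= new Date(existing[staleKey])) {
        removed.push(existing);
        unique.set(value[key], value);
      } else {
        removed.push(value);
      }
    }

    duplicates.set(key, removed);
    // Survivors of this pass are the input to the next key.
    remaining = [...unique.values()];
  }

  return { deduped: remaining, duplicates };
}

// Hypothetical sample data in the expected format.
const entries = [
  { _id: "1", email: "a@example.com", entryDate: "2014-05-07T17:30:20+00:00" },
  { _id: "1", email: "b@example.com", entryDate: "2014-05-07T17:31:20+00:00" },
  { _id: "2", email: "b@example.com", entryDate: "2014-05-07T17:32:20+00:00" },
  { _id: "3", email: "c@example.com", entryDate: "2014-05-07T17:33:20+00:00" },
];

const result = dedupeSketch(entries, ["_id", "email"], "entryDate");
console.log(result.deduped);
console.log(result.duplicates);
```

Because each key runs against the survivors of the previous one, the order of keys can change which entries end up being removed.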