redact-phi

1.0.1 • Public • Published

redact-PHI

A command-line utility to remove, redact and fabricate PHI, PII or protected data (CSV, JSON, XLSX).

Online Demo

Browser-based version

Local Installation

Install globally using npm i -g redact-phi

Usage

Removing PII from CSV

redact [options] <infile> [outfile]

Options:
  -V, --version            output the version number
  --delimiter <delimiter>  sets delimiter type (default: ",")
  --override <override>    Provide the location of your redaction specification
  -h, --help               display help for command

Redaction Example

redact-PHI includes an example in ./example/:

  • example.csv - CSV source file (data to be redacted)
  • example.json - JSON redaction strategy (defines how each column is redacted: with a faker function or template, a custom redactor, or a constant)
  • example.js - JavaScript custom redactors (this file is optional)

The .csv, .json and .js files must share the same base name (e.g. data.csv, data.json, data.js).
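The naming convention can be sketched in a few lines: given the CSV path, the companion files are found by swapping the extension. The helper name `companionPaths` is hypothetical, and the sketch assumes all three files sit in the same directory, as they do in the example. The `--override` option listed above exists for pointing at a specification stored elsewhere.

```javascript
const path = require('path');

// Hypothetical helper: derive the .json and .js paths that share a CSV's base name.
// Assumes the companion files live next to the CSV.
function companionPaths(csvPath) {
  const { dir, name } = path.parse(csvPath);
  return {
    spec: path.join(dir, `${name}.json`),    // redaction strategy
    redactors: path.join(dir, `${name}.js`), // optional custom redactors
  };
}

console.log(companionPaths('example/example.csv'));
```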


example/example.csv:

id,email,ssn,first name,last name,full address,order date,order amount
783,Nicola42@yahoo.com,706-25-4558,Caroline,Goodwin,82703 Yasmeen Corner Apt. 379,"10/9/2020, 4:48:52 AM",252.94
784,Ned_Bernier13@yahoo.com,282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"3/27/2021, 11:51:26 PM",689.40
784,Ned_Bernier13@yahoo.com,282-57-6226,Izabella,Bosco,2930 Alisa Heights Apt. 572,"6/17/2021, 9:44:54 PM",446.29
786,Nathen_Kovacek64@hotmail.com,677-86-2303,Celine,Dicki,68305 Labadie Shoal Suite 608,"4/23/2021, 6:49:34 PM",889.37
...

example/example.json:

{
  "columns": [
    {
      "redactWith": "random.number",
      "columnNum": "",
      "columnKey": "order id",
      "tracked": false
    },
    {
      "redactWith": "customGenerateId",
      "columnNum": 0,
      "columnKey": "",
      "tracked": true
    },
...

The columns array contains objects with properties that describe how each column should be redacted:

  • columnNum - (integer) Identifies the column to be redacted with an integer index (zero-based). (columnNum or columnKey must be set)

  • columnKey - (string) Identifies the column to be redacted using the file's header. (columnNum or columnKey must be set)

  • redactWith - (string) The strategy used to redact a value. There are four options:

    • faker function - The name of a faker function, e.g. name.firstName or internet.email
    • faker template - Faker methods in a mustache template, e.g. {{address.streetAddress(true)}} or {{address.city}}, {{address.state}} {{address.zip}}
    • custom JavaScript - For more complicated redacted values, the name of a JavaScript function defined in your .js file (see example.js)
    • constant value - A literal that replaces every value in the column, e.g. John Doe

  • tracked - (boolean) Tracking preserves the relationships within your data while de-identifying it. With tracking enabled, the redaction engine remembers each original value in the column and reuses the same redacted value whenever that original is encountered again. To see this in action, run redact ./example/example.csv and note in example_redacted.csv how the columns with tracking enabled (id, ssn, email) receive the same redacted value each time an original value repeats.
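The idea behind tracking can be illustrated with a per-column cache. This is a minimal sketch, not redact-PHI's actual implementation: `makeTrackedRedactor` and the counter-based generator are hypothetical names chosen for the illustration.

```javascript
// Illustrative only: a per-column cache that maps original values to redacted ones.
function makeTrackedRedactor(generate) {
  const seen = new Map(); // original value -> redacted value
  return (original) => {
    if (!seen.has(original)) {
      seen.set(original, generate());
    }
    return seen.get(original);
  };
}

// Stand-in generator; a real spec would name a faker function or a custom redactor.
let nextId = 0;
const redactId = makeTrackedRedactor(() => ++nextId);

// The ids mirror the example.csv rows, where 784 appears twice.
const redactedIds = ['783', '784', '784', '786'].map(redactId);
console.log(redactedIds); // [ 1, 2, 2, 3 ]
```

Because both rows with id 784 map to the same redacted id, orders belonging to one customer still group together after redaction.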


example/example.js:

module.exports = {
    customGenerateId: () => {
        return ++currentId;
    },
...

Custom redactors provide more control over your data. A custom redactor is referenced from your .json file by name in the redactWith property. For example: "redactWith": "customGenerateId"
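A complete custom redactor file might look like the sketch below. Only `customGenerateId` comes from the example; the starting value of `currentId` and the `randomOrderAmount` redactor are assumptions added for illustration. The redactors take no arguments here, since the excerpt does not show whether the engine passes any.

```javascript
// Counter used by customGenerateId; the starting value 0 is an assumption.
let currentId = 0;

const redactors = {
  // From the example: hand out sequential ids.
  customGenerateId: () => {
    return ++currentId;
  },
  // Hypothetical extra redactor: a random amount with two decimals, as a string.
  randomOrderAmount: () => {
    return (Math.random() * 1000).toFixed(2);
  },
};

module.exports = redactors;
```

A column would then select the second redactor with "redactWith": "randomOrderAmount" in the .json file.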

De-identification Examples

The following redactWith values replace common types of PII:

  • First Name: "redactWith": "name.firstName"
  • Last Name: "redactWith": "name.lastName"
  • Full Name: "redactWith": "name.findName"
  • Social Security number: "redactWith": "{{datatype.number({\"min\":100,\"max\":999})}}-{{datatype.number({\"min\":10,\"max\":99})}}-{{datatype.number({\"min\":1000,\"max\":9999})}}"
  • Email Address: "redactWith": "internet.email"
  • Phone / Fax number: "redactWith": "phone.phoneNumber"
  • Street Address: "redactWith": "address.streetAddress"
  • City: "redactWith": "address.city"
  • Zip Code: "redactWith": "address.zipCode"
  • City, State, Zip: "redactWith": "{{address.city()}}, {{address.stateAbbr()}} {{address.zipCode()}}"
  • Full Address: "redactWith": "{{address.streetAddress(true)}}"
  • IP: "redactWith": "internet.ip"
  • IP v6: "redactWith": "internet.ipv6"
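To make the template syntax concrete, the sketch below expands `{{namespace.method(args)}}` placeholders using a small table of stand-in generators. It is neither faker's parser nor redact-PHI's code: `generators` and `expandTemplate` are hypothetical names, and the sketch assumes the arguments inside the parentheses are valid JSON, as they are in the Social Security number template.

```javascript
// Stand-in generators keyed by "namespace.method"; a real run would call faker instead.
const generators = {
  'datatype.number': ({ min, max }) => Math.floor(Math.random() * (max - min + 1)) + min,
  'address.city': () => 'Springfield',
};

// Replace each {{name}} or {{name(jsonArgs)}} placeholder with a generated value.
function expandTemplate(template) {
  const placeholder = /\{\{\s*([\w.]+)(?:\((.*?)\))?\s*\}\}/g;
  return template.replace(placeholder, (match, name, rawArgs) => {
    const generate = generators[name];
    if (!generate) {
      throw new Error(`Unknown generator: ${name}`);
    }
    // Empty or missing parentheses mean "no arguments".
    const args = rawArgs ? JSON.parse(rawArgs) : undefined;
    return String(generate(args));
  });
}

const ssnTemplate =
  '{{datatype.number({"min":100,"max":999})}}-' +
  '{{datatype.number({"min":10,"max":99})}}-' +
  '{{datatype.number({"min":1000,"max":9999})}}';

// Prints three random digit groups joined by hyphens, shaped like an SSN.
console.log(expandTemplate(ssnTemplate));
```

The min and max bounds keep each group at a fixed width (three, two and four digits), so the result always has the shape of a Social Security number.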

Full De-identification example.json:

{
  "columns": [
    {
      "redactWith": "name.firstName",
      "columnNum": "",
      "columnKey": "first name",
      "tracked": false
    },
    {
      "redactWith": "name.lastName",
      "columnNum": "",
      "columnKey": "last name",
      "tracked": false
    },
    {
      "redactWith": "name.findName",
      "columnNum": "",
      "columnKey": "full name",
      "tracked": false
    },
    {
      "redactWith": "{{datatype.number({\"min\":100,\"max\":999})}}-{{datatype.number({\"min\":10,\"max\":99})}}-{{datatype.number({\"min\":1000,\"max\":9999})}}",
      "columnNum": "",
      "columnKey": "social security number",
      "tracked": false
    },
    {
      "redactWith": "internet.email",
      "columnNum": "",
      "columnKey": "email",
      "tracked": false
    },
    {
      "redactWith": "phone.phoneNumber",
      "columnNum": "",
      "columnKey": "phone",
      "tracked": false
    },
    {
      "redactWith": "address.city",
      "columnNum": "",
      "columnKey": "city",
      "tracked": false
    },
    {
      "redactWith": "address.zipCode",
      "columnNum": "",
      "columnKey": "zip",
      "tracked": false
    },
    {
      "redactWith": "address.streetAddress",
      "columnNum": "",
      "columnKey": "street address",
      "tracked": false
    },
    {
      "redactWith": "{{address.city()}}, {{address.stateAbbr()}} {{address.zipCode()}}",
      "columnNum": "",
      "columnKey": "city state zip",
      "tracked": false
    },
    {
      "redactWith": "{{address.streetAddress(true)}}",
      "columnNum": "",
      "columnKey": "full address",
      "tracked": false
    },
    {
      "redactWith": "internet.ip",
      "columnNum": "",
      "columnKey": "ip",
      "tracked": false
    },
    {
      "redactWith": "internet.ipv6",
      "columnNum": "",
      "columnKey": "ipv6",
      "tracked": false
    }
  ]
}
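Before running a redaction, it can help to sanity-check a spec against the rules described earlier (columnNum or columnKey must be set, redactWith is a string, tracked is a boolean). The checker below is a minimal sketch with a hypothetical name, `findSpecProblems`; treating `tracked` as optional and rejecting an empty `redactWith` are assumptions. The JSON Schema linked in the next section is the authoritative definition.

```javascript
// Hypothetical checker: returns a list of problems found in a redaction spec object.
function findSpecProblems(spec) {
  if (!spec || !Array.isArray(spec.columns)) {
    return ['spec.columns must be an array'];
  }
  const problems = [];
  spec.columns.forEach((column, i) => {
    // An empty string ("") counts as "not set", matching the examples above.
    const hasNum = Number.isInteger(column.columnNum) && column.columnNum >= 0;
    const hasKey = typeof column.columnKey === 'string' && column.columnKey !== '';
    if (!hasNum && !hasKey) {
      problems.push(`columns[${i}]: set columnNum or columnKey`);
    }
    if (typeof column.redactWith !== 'string' || column.redactWith === '') {
      problems.push(`columns[${i}]: redactWith must be a non-empty string`);
    }
    // Assumption: tracked may be omitted, but if present it must be a boolean.
    if (column.tracked !== undefined && typeof column.tracked !== 'boolean') {
      problems.push(`columns[${i}]: tracked must be true or false`);
    }
  });
  return problems;
}

const spec = {
  columns: [
    { redactWith: 'name.firstName', columnNum: '', columnKey: 'first name', tracked: false },
    { redactWith: 'internet.email', columnNum: '', columnKey: '', tracked: false },
  ],
};

console.log(findSpecProblems(spec)); // [ 'columns[1]: set columnNum or columnKey' ]
```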

Redaction Strategy JSON Schema

JSON Schema for creating a spec

License

Apache-2.0

Collaborators

  • bearjaws
  • frizzled