    n-health

    7.0.0 • Public • Published

    Collection of healthcheck classes to use in your nodejs application.
    To create a new health check, follow the current standard.

    Usage

    n-health exports a function that loads healthcheck configuration files from a folder:

    const nHealth = require('n-health');
    
    const healthChecks = nHealth(
    	'path/to/healthchecks' // by default, `/healthchecks` or `/config` in the root of your application
    )

    It returns an object with an asArray method. If you're using n-express, pass this array as the healthChecks option:

    const nExpress = require('@financial-times/n-express')
    
    nExpress({
    	healthChecks: healthChecks.asArray()
    })

    If you're not using n-express, you should create a /__health endpoint which returns the following JSON structure (see the specification for details):

    {
    	"schemaVersion": 1,
    	"name": "app name",
    	"systemCode": "biz-ops system code",
    	"description": "human readable description",
    	"checks": []
    }

    checks should be an array of check status objects. You can get this by calling getStatus on each item in the array, for example with healthChecks.asArray().map(check => check.getStatus()).
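
    The payload described above can be assembled with a small helper. This is a minimal sketch rather than part of n-health: buildHealthResponse and the fakeChecks stand-ins are hypothetical names, and the fields inside the stand-in status object are placeholders. Whatever getStatus returns is passed through untouched.

    ```javascript
    // Hypothetical helper (not part of n-health): builds the /__health payload
    // from any array of objects exposing getStatus(), such as healthChecks.asArray().
    function buildHealthResponse(checks, { name, systemCode, description }) {
    	return {
    		schemaVersion: 1,
    		name,
    		systemCode,
    		description,
    		checks: checks.map(check => check.getStatus())
    	};
    }

    // Stand-in checks so the sketch runs without n-health installed.
    const fakeChecks = [
    	{ getStatus: () => ({ name: 'example check', ok: true }) }
    ];

    const body = buildHealthResponse(fakeChecks, {
    	name: 'app name',
    	systemCode: 'biz-ops system code',
    	description: 'human readable description'
    });

    console.log(JSON.stringify(body, null, 2));
    ```

    In an Express-style app, the same helper could be called from a GET /__health route handler with healthChecks.asArray() as the first argument.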

    Custom healthchecks

    If you require a healthcheck not provided by n-health, you can pass a second argument to nHealth: a path to a folder of files exporting custom healthcheck classes. Each module should export a class that extends n-health's Check class and implements the tick method, which is called periodically to update the check's status. The class can also implement an init method to do something when the check is first run. Both methods can be async if you need to do something like make a request.

    const {Check, status} = require('n-health');
    
    class RandomCheck extends Check {
    	tick() {
    		this.status = Math.random() < 0.5 ? status.PASSED : status.FAILED;
    	}
    }
    
    module.exports = RandomCheck;

    See the src/checks folder for some examples.

    Healthcheck configuration

    A healthcheck config is a JavaScript file that exports an object with these properties:

    • name: A name for the healthcheck; ideally this matches a name in biz-ops
    • description: Test description for the checks - for reference only
    • checks: Array of check objects

    Check objects

    Common options

    • type: The type of check, which should be one of the types below. That check type's options should also be included in the object as required.
    • name, severity, businessImpact, technicalSummary and panicGuide are all required. See the specification for details
    • interval: time between checks in milliseconds or any string compatible with ms [default: 1minute]
    • officeHoursOnly: [default: false] For queries that will probably fail out of hours (e.g. Internet Explorer usage, B2B stuff), set this to true and the check will pass on weekends and outside office hours (defined as 8am-6pm UTC). Use sparingly.
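
    The office hours rule above can be pictured with a short function. This is an illustrative sketch only, not n-health's implementation: isOfficeHours is a hypothetical name, and treating 8am as inside and 6pm as outside the window is an assumption.

    ```javascript
    // Illustrative only: returns true on weekdays between 08:00 and 18:00 UTC.
    // The boundary handling (08:00 inclusive, 18:00 exclusive) is an assumption.
    function isOfficeHours(date = new Date()) {
    	const day = date.getUTCDay(); // 0 = Sunday, 6 = Saturday
    	const hour = date.getUTCHours();
    	const isWeekday = day >= 1 && day <= 5;
    	return isWeekday && hour >= 8 && hour < 18;
    }

    // A check with officeHoursOnly: true would be reported as passing
    // whenever a function like this returns false.
    console.log(isOfficeHours(new Date('2024-01-03T09:00:00Z'))); // a Wednesday morning
    console.log(isOfficeHours(new Date('2024-01-06T09:00:00Z'))); // a Saturday morning
    ```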

    responseCompare

    Fetches from multiple urls and compares the responses. Useful to check that replication is working

    • urls: An array of urls to call
    • comparison: Type of comparison to apply to the responses:
      • 'equal' the check succeeds if all the responses have the same status
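
    Putting the pieces together, a complete config file containing a single responseCompare check might look like the sketch below. The file name, urls, names and text values are placeholders rather than real endpoints, and the severity value is an assumption; see the specification for valid values.

    ```javascript
    // healthchecks/replication.js (hypothetical file name)
    const config = {
    	name: 'example-app',
    	description: 'Checks that replication is working',
    	checks: [
    		{
    			type: 'responseCompare',
    			name: 'Replicas return the same response',
    			severity: 2, // placeholder value
    			businessImpact: 'Users may see stale content',
    			technicalSummary: 'Compares responses from two replicas',
    			panicGuide: 'Check the replication job',
    			interval: '1m', // any string compatible with ms
    			urls: [
    				'https://replica-a.example.com/status',
    				'https://replica-b.example.com/status'
    			],
    			comparison: 'equal'
    		}
    	]
    };

    module.exports = config;
    ```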

    json

    Calls a url, gets some json and runs a callback to check its form

    • url: url to call and get the json
    • fetchOptions: Object to pass to fetch, see https://www.npmjs.com/package/node-fetch#options for more information.
    • callback: A function to run on the response. Accepts the parsed json as an argument and should return true or false
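
    As a rough illustration, a json check object could be written as follows. The url, the accept header and the status field inspected by the callback are all hypothetical; the callback only needs to return true or false for whatever shape your endpoint returns.

    ```javascript
    const check = {
    	type: 'json',
    	name: 'Status endpoint reports ok',
    	severity: 3, // placeholder value
    	businessImpact: 'Placeholder business impact',
    	technicalSummary: 'Fetches a status document and inspects one field',
    	panicGuide: 'Placeholder panic guide',
    	url: 'https://api.example.com/status', // placeholder url
    	fetchOptions: { headers: { accept: 'application/json' } },
    	// Receives the parsed json; `status` is a hypothetical field name.
    	callback: json => json.status === 'ok'
    };

    module.exports = { name: 'example-app', description: 'json check example', checks: [check] };
    ```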

    aggregate

    Reports on the status of other checks. Useful for a multi-region service, where one check failing is not as bad as ALL the checks failing.

    • watch: Array of names of checks to aggregate
    • mode: Aggregate mode:
      • 'atLeastOne' the check succeeds if at least one of its subchecks succeeds
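
    A sketch of how an aggregate check might sit alongside the checks it watches. The region checks, urls and severities are placeholders, and defining the watched checks in the same config file is an assumption of this example.

    ```javascript
    // Required fields shared by the placeholder checks below.
    const common = {
    	severity: 3,
    	businessImpact: 'Placeholder business impact',
    	technicalSummary: 'Placeholder technical summary',
    	panicGuide: 'Placeholder panic guide'
    };

    const checks = [
    	{ ...common, type: 'json', name: 'EU region is up', url: 'https://eu.example.com/status', callback: json => json.ok === true },
    	{ ...common, type: 'json', name: 'US region is up', url: 'https://us.example.com/status', callback: json => json.ok === true },
    	{
    		...common,
    		type: 'aggregate',
    		name: 'At least one region is up',
    		severity: 1, // losing every region is worse than losing one
    		watch: ['EU region is up', 'US region is up'], // names of the checks above
    		mode: 'atLeastOne'
    	}
    ];

    module.exports = { name: 'example-app', description: 'aggregate check example', checks };
    ```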

    graphiteSpike

    Compares current and historical graphite metrics to see if there is a spike

    • numerator: [required] Name of main graphite metric to count (may contain wildcards)
    • divisor: [optional] Name of graphite metric to divide by (may contain wildcards)
    • normalize: [optional] Boolean indicating whether to normalize to adjust for difference in size between sample and baseline timescales. Default is true if no divisor specified, false otherwise.
    • samplePeriod: [default: '10min'] Length of time to count metrics for a sample of current behaviour
    • baselinePeriod: [default: '7d'] Length of time to count metrics for to establish baseline behaviour
    • direction: [default: 'up'] Direction in which to look for spikes; 'up' = sharp increase in activity, 'down' = sharp decrease in activity
    • threshold: [default: 3] Amount of difference between current and baseline activity which registers as a spike e.g. 5 means current activity must be 5 times greater/less than the baseline activity
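
    To make the threshold and normalize options more concrete, here is one way the comparison could be pictured. This is an illustration of the description above, not n-health's actual code: isSpike and its parameter names are hypothetical, and the exact comparison the library performs may differ.

    ```javascript
    const MINUTE = 60 * 1000;
    const DAY = 24 * 60 * MINUTE;

    // Illustrative only. Totals are metric counts over each period.
    function isSpike({
    	sampleTotal,
    	baselineTotal,
    	samplePeriodMs = 10 * MINUTE,
    	baselinePeriodMs = 7 * DAY,
    	normalize = true,
    	direction = 'up',
    	threshold = 3
    }) {
    	// Normalizing scales the baseline total down to the length of the sample window.
    	const baseline = normalize
    		? baselineTotal * (samplePeriodMs / baselinePeriodMs)
    		: baselineTotal;
    	const ratio = sampleTotal / baseline;
    	// 'up' looks for a sharp increase, 'down' for a sharp decrease.
    	return direction === 'up' ? ratio > threshold : ratio < 1 / threshold;
    }

    // 100800 events over 7 days is roughly 100 per 10 minute window.
    console.log(isSpike({ sampleTotal: 400, baselineTotal: 100800 }));
    console.log(isSpike({ sampleTotal: 20, baselineTotal: 100800, direction: 'down' }));
    ```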

    graphiteThreshold

    Checks whether the value of a graphite metric has crossed a threshold

    • metric: [required] Name of graphite metric to count (may contain wildcards)
    • threshold: [required] Value to check the metrics against
    • samplePeriod: [default: '10min'] Length of time to count metrics for a sample of current behaviour
    • direction: [default: 'above'] Direction on which to trigger the healthcheck:
      • 'above' = alert if value goes above the threshold
      • 'below' = alert if value goes below the threshold
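
    For example, a graphiteThreshold check object might look like this. The metric name and threshold are placeholders chosen for illustration.

    ```javascript
    const check = {
    	type: 'graphiteThreshold',
    	name: 'Error count is below 100',
    	severity: 2, // placeholder value
    	businessImpact: 'Placeholder business impact',
    	technicalSummary: 'Alerts when the error metric goes above the threshold',
    	panicGuide: 'Placeholder panic guide',
    	metric: 'next.heroku.example-app.*.errors', // hypothetical metric, wildcards allowed
    	threshold: 100,
    	samplePeriod: '10min',
    	direction: 'above'
    };

    module.exports = { name: 'example-app', description: 'graphiteThreshold example', checks: [check] };
    ```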

    graphiteWorking

    Checks whether a graphite metric has received data recently.

    • metric: [required] Name of graphite metric to count (may contain wildcards)
      • Use summarize if the metric receives data infrequently, e.g. summarize(next.heroku.next-article.some-infrequent-periodic-metric, '30mins', 'sum', true)
    • time: [default: '-5minutes'] Length of time to count metrics
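
    A sketch of a graphiteWorking check for an infrequent metric, reusing the summarize expression from the bullet above. The widened time value is an assumption, chosen so the window is longer than the 30 minute summarize interval.

    ```javascript
    const check = {
    	type: 'graphiteWorking',
    	name: 'Periodic metric is still reporting',
    	severity: 3, // placeholder value
    	businessImpact: 'Placeholder business impact',
    	technicalSummary: 'Checks the metric has received data recently',
    	panicGuide: 'Placeholder panic guide',
    	metric: "summarize(next.heroku.next-article.some-infrequent-periodic-metric, '30mins', 'sum', true)",
    	time: '-60minutes' // assumed value; the default is '-5minutes'
    };

    module.exports = { name: 'example-app', description: 'graphiteWorking example', checks: [check] };
    ```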

    herokuLogDrain

    Checks whether a Heroku log drain is configured correctly for the app. We determine the validity of a log drain based on the "Logging to Splunk from Heroku" guide here.

    • herokuAuthToken = [default process.env.HEROKU_AUTH_TOKEN] an auth token with read access on the Heroku app
    • herokuAppId = [default process.env.HEROKU_APP_ID] the Heroku UUID for the Heroku app, which should be provided by the Heroku runtime-dyno-metadata feature.

    cloudWatchThreshold

    Checks whether the value of a CloudWatch metric has crossed a threshold

    Note: this assumes that AWS_ACCESS_KEY & AWS_SECRET_ACCESS_KEY are implicitly available as environment variables on process.env

    • cloudWatchRegion = [default 'eu-west-1'] AWS region where the metrics are stored
    • cloudWatchMetricName = [required] Name of the CloudWatch metric to count
    • cloudWatchNamespace = [required] Namespace the metric resides in
    • cloudWatchStatistic = [default 'Sum'] Data aggregation type to return
    • cloudWatchDimensions = Optional array of metric data to query
    • samplePeriod: [default: 300] Length of time in seconds to count metrics for a sample of current behaviour
    • threshold: [required] Value to check the metrics against
    • direction: [default: 'above'] Direction on which to trigger the healthcheck:
      • 'above' = alert if value goes above the threshold
      • 'below' = alert if value goes below the threshold
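
    A cloudWatchThreshold check object could be sketched as below. The metric, namespace and function name are placeholders, and the Name/Value shape used for cloudWatchDimensions is an assumption based on the AWS CloudWatch dimension format.

    ```javascript
    const check = {
    	type: 'cloudWatchThreshold',
    	name: 'Lambda errors are low',
    	severity: 2, // placeholder value
    	businessImpact: 'Placeholder business impact',
    	technicalSummary: 'Alerts when the summed error count goes above the threshold',
    	panicGuide: 'Placeholder panic guide',
    	cloudWatchRegion: 'eu-west-1',
    	cloudWatchMetricName: 'Errors',
    	cloudWatchNamespace: 'AWS/Lambda',
    	cloudWatchStatistic: 'Sum',
    	// Assumed shape: AWS-style dimensions; the function name is hypothetical.
    	cloudWatchDimensions: [{ Name: 'FunctionName', Value: 'example-function' }],
    	samplePeriod: 300, // seconds
    	threshold: 5,
    	direction: 'above'
    };

    module.exports = { name: 'example-app', description: 'cloudWatchThreshold example', checks: [check] };
    ```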

    cloudWatchAlarm

    Checks whether the state of a CloudWatch alarm is healthy

    Note: this assumes that AWS_ACCESS_KEY & AWS_SECRET_ACCESS_KEY are implicitly available as environment variables on process.env

    • cloudWatchRegion = [default 'eu-west-1'] AWS region where the metrics are stored
    • cloudWatchAlarmName = [required] Name of the CloudWatch alarm to check

    fastlyKeyExpiration

    Checks whether a Fastly key is due to expire within the next 2 weeks

    Note: some properties have default values:

    • panicGuide: 'Contact the Slack channel #fastly-support to rotate the keys https://financialtimes.slack.com/archives/C2GFE1C9X'
    • technicalSummary: 'Check the Fastly key in the api token information endpoint to obtain the expiration date'
    • severity: 2

    • fastlyKey = The value of the fastly key to check

    Note: if the expiration date has passed, the severity level is 1
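
    Because panicGuide, technicalSummary and severity have defaults, a fastlyKeyExpiration check object can be fairly short. In this sketch the environment variable name FASTLY_API_KEY is hypothetical; read the key from wherever your app stores it.

    ```javascript
    const check = {
    	type: 'fastlyKeyExpiration',
    	name: 'Fastly key is not about to expire',
    	businessImpact: 'Placeholder business impact',
    	// Hypothetical env var name; avoid hard-coding the key in the config file.
    	fastlyKey: process.env.FASTLY_API_KEY
    };

    module.exports = { name: 'example-app', description: 'fastlyKeyExpiration example', checks: [check] };
    ```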

    Install

    npm i n-health

    Version

    7.0.0

    License

    ISC

    Unpacked Size

    1 MB

    Total Files

    81

    Collaborators

    • robgodfrey
    • hamza.samih
    • nikita.lohia
    • notlee
    • efinlay24
    • emmalewis
    • aendra
    • the-ft
    • rowanmanning
    • chee
    • alexwilson