node package manager

lassie

A watchdog service

Lassie is a simple watchdog service written in CoffeeScript. It sports a basic modular architecture that can host multiple types of service checks and alerts.

Lassie doesn't do any graphing or statistical collection. You probably want statsd+Graphite or Munin if you're looking for something like that. Lassie will notify you if a service goes down (and comes back up), nothing more.

A few standard checks and alerts are included with Lassie. If you know Javascript or CoffeeScript, then it is easy to create your own checks and alerts.

  • ping: Basic ICMP echo request. A cheap and not-foolproof way of checking if a server is alive on a network. Note that a server can respond to pings but still be "down" in some sense, so this rarely a sufficient check by itself.
  • web: Retrieve a remote URL and look in the resulting payload for a specific fragment string. Can be useful to assess whether the payload is the expected body or whether it's an error message.
  • tcp: Similar to the web check, but uses a raw TCP connection.
  • rest: Retrieve a RESTful API endpoint and check if the HTTP status code was 200. If not, then it will be considered to be in a fail state.
  • email: 'nuff said.
  • sms: Uses Twilio, so you will need an account with them.
  • slack: You'll need to create a Bot Integration within your Slack account and use the provided API token.
  • pushover: A SaaS product that sends push alerts to your phone. See their website for more details.
options:
  # Check every X seconds
  check_frequency: 60

  # Run as a daemon
  daemon: true
  log:    lassie.log
  pid:    lassie.pid

  # Twilio API credentials
  twilio:
    sid:   TWILIO_SID
    token: TWILIO_TOKEN
    phnum: TWILIO_PHNUM   # outgoing phone number

  # Slack API token and target channels/users
  slack:
    token: SLACK_TOKEN
    channels:
      - 'monitoring'
    users:
      - 'judd'

#
# ALERTS LEVELS + CONTACTS
#
alerts:
  notify:
    admin-email:
      type:  email
      to: admin@example.com
    team-chat:
      type: slack
      channels: ['monitoring']
  emerg:
    admin-sms:
      type:  sms
      phone: "+18001234567"
    admin-slack:
      type: slack
      users: ['judd']

#
# CHECKS
#
checks:
  server1:
    type:   ping
    host:   server1.example.com
    alerts: [emerg, notify]
  server2:
    type:   ping
    host:   server2.example.com
    alerts: [emerg]

  site-web:
    type:     web
    url:      http://www.example.com
    fragment: "This is an example site"
    # This check must fail twice in a row before we consider it down.
    failures: 2
    alerts:   [emerg, notify]

  site-api:
    type: rest
    alerts: [emerg]
    url: 'https://api.example.com/test_endpoint'
    headers:
      accept: application/json
      x-api-key: abc123456