"watchmen", a service monitor for node.js
- monitor service health (outages, uptime, response time warnings, avg. response time, etc) on your servers
- use the database of your choice. Data storages are pluggable. At this time, only redis storage is available, but it is pretty easy to create and plug in your own. There are plans to support mongodb in the short term.
- ping types are pluggable. At this time, http and smtp (tcp connection check) are available.
- watchmen provides customizable notifications if a service is down, the response time is over a predefined limit, etc.
- the code base aims to be small, simple and easy to understand and modify.
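The smtp ping type is described above as a plain tcp connection check. A minimal sketch of that idea, using only Node's built-in `net` module, could look like the following. The helper name `checkTcp` and its parameters are assumptions made for illustration; this is not watchmen's actual ping service code.

```javascript
const net = require('net');

// Hypothetical helper (not part of watchmen): resolves with the time in
// milliseconds it took to open a tcp connection, or rejects if the port
// cannot be reached before the timeout.
function checkTcp(host, port, timeoutMs) {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    const socket = net.connect({ host: host, port: port });
    socket.setTimeout(timeoutMs);

    socket.once('connect', () => {
      socket.destroy(); // only reachability matters, no data is exchanged
      resolve(Date.now() - started);
    });
    socket.once('timeout', () => {
      socket.destroy();
      reject(new Error('timeout after ' + timeoutMs + ' ms'));
    });
    socket.once('error', (err) => {
      socket.destroy();
      reject(err);
    });
  });
}

// Usage (commented out so the sketch does not touch the network when loaded):
// checkTcp('mydomain.com', 25, 5000)
//   .then((ms) => console.log('port reachable in ' + ms + ' ms'))
//   .catch((err) => console.error('service down: ' + err.message));
```

A check like this only proves that the port accepts connections; it does not speak the smtp protocol.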
There is a related blog post about watchmen here.
Check the web interface in action here.
Watchmen depends on the following modules:
Make sure you install those dependencies:
$ npm install
a) Define hosts and services to be monitored:
You need at least one service for each host. Define the ping service type for each host or service.
Most of the properties can be defined at either host or service level. When a property is defined at both levels, the service level value takes precedence.
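To illustrate the precedence rule, here is a small sketch of how host level and service level properties might be combined. The function `effectiveSettings` is hypothetical and the numeric values are made up for the example; watchmen's own resolution logic may differ.

```javascript
// Hypothetical helper: start from the host's properties and let any
// property defined on the service override them.
function effectiveSettings(host, service) {
  const inherited = Object.assign({}, host);
  delete inherited.services; // the service list itself is not inherited
  return Object.assign(inherited, service);
}

const host = {
  name: 'letsnode blog',
  host: 'letsnode.com',
  port: 80,
  ping_interval: 60,                 // seconds
  warning_if_takes_more_than: 700,   // milliseconds, host level default
  services: [
    { name: 'home', url: '/' },
    // this service overrides the host level warning threshold
    { name: 'contact page', url: '/contact', warning_if_takes_more_than: 1500 }
  ]
};

const resolved = host.services.map((service) => effectiveSettings(host, service));
console.log(resolved);
```

In this sketch the 'home' service inherits the 700 ms threshold from the host, while 'contact page' uses its own 1500 ms value.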
//-------------------
//config/hosts.js
//-------------------

//example of http ping for a host with 2 url's
{
    name: 'letsnode blog',
    host: 'letsnode.com',
    port: 80,
    ping_interval: one_minute, //set ping interval (in seconds)
    ping_service_name: 'http', //if ping_service_name is not defined, 'http' is used by default
    failed_ping_interval: one_minute, //set ping interval if site is down (in seconds)
    enabled: true, //enables/disables this host
    alert_to: 'email@example.com', //emails to alert if site goes down.
    warning_if_takes_more_than: 700, //milliseconds. alert if request takes more than this
    services : [
        {
            name : 'home',
            method: 'get',
            url : '/',
            //expected status code and expected string to be found in the response (otherwise will fail)
            expected: {statuscode: 200, contains: 'A blog about node.js and express.js'}
        },
        {
            name : 'contact page',
            method: 'get',
            url : '/contact',
            expected: {statuscode: 200, contains: 'Contact page'}
        }
    ]
}

//example of smtp ping
{
    name: 'mydomain',
    host: 'mydomain.com',
    port: 25,
    ping_interval: one_minute, //set ping interval (in seconds)
    ping_service_name: 'smtp',
    failed_ping_interval: one_minute,
    enabled: true,
    alert_to: 'firstname.lastname@example.org', //emails to alert if site goes down.
    warning_if_takes_more_than: 700, //milliseconds. alert if request takes more than this
    services : [
        {
            name : 'my smtp server'
        }
    ]
}
Using the http ping service, you can also check for a) a certain http status code or b) a certain text in the response stream. The relevant properties are:
- host.ping_service_name (use 'http'; it is also the default value)
- service.expected (expected status code and expected text to be found in the response)
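As a rough illustration of how an `expected` block could be evaluated against a response, consider the sketch below. The function `checkExpected` and its return convention are assumptions for this example, not watchmen's actual http ping service; only the `statuscode` and `contains` property names come from the configuration shown above.

```javascript
// Hypothetical helper: returns null when the response satisfies the
// expectations, or a short reason string describing the first mismatch.
function checkExpected(expected, statusCode, body) {
  if (!expected) {
    return null; // nothing to verify
  }
  if (expected.statuscode !== undefined && statusCode !== expected.statuscode) {
    return 'expected status ' + expected.statuscode + ' but got ' + statusCode;
  }
  if (expected.contains && body.indexOf(expected.contains) === -1) {
    return "response does not contain '" + expected.contains + "'";
  }
  return null;
}

const expected = { statuscode: 200, contains: 'Contact page' };
console.log(checkExpected(expected, 200, '<h1>Contact page</h1>'));
console.log(checkExpected(expected, 500, 'Internal error'));
```

Returning a reason string rather than a boolean makes it easy to include the cause of the failure in a notification.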
b) Define Postmark and notifications settings:
//-------------------
// config/general.js
//-------------------
module.exports.notifications = {
    enabled: false, //if disabled, no email will be sent (just console messages)
    to: 'email@example.com',
    postmark : {
        from: 'firstname.lastname@example.org',
        api_key : 'your-postmark-key-here'
    }
}
c) Configure the storage provider
//-------------------
// config/storage.js
//-------------------
module.exports = {
    //---------------------------
    // Select storage provider.
    // Supported providers: 'redis' (only redis at this time)
    //---------------------------
    provider : 'redis',
    options : {
        //---------------------------
        // redis configuration
        //---------------------------
        'redis' : {
            port: 1216,
            host: '127.0.0.1',
            db: 1
        }
    }
};
Installation and configuration
- get redis from redis.io
- launch the server:
$ redis-server redis.conf
d) Add custom logic
Example: log in the console and send email if there is an outage:
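A sketch of what such custom logic might look like is shown below. The changelog states that the watchmen daemon inherits from events.EventEmitter and emits events such as service_error and service_back, so a plain EventEmitter stands in for the daemon here. The handler arguments (`service`, `state`) and the `sendEmail` placeholder are assumptions for illustration; see server.js for the real wiring.

```javascript
const EventEmitter = require('events');

// Stand-in for the watchmen daemon (assumption: the real daemon is created
// differently, but it is also an EventEmitter).
const watchmen = new EventEmitter();

// Placeholder for a real mail call; it only records what would be sent.
const sentEmails = [];
function sendEmail(to, subject) {
  sentEmails.push({ to: to, subject: subject });
}

// Log to the console and send an email when a service goes down.
watchmen.on('service_error', (service, state) => {
  console.error(service.name + ' is down: ' + state.error);
  sendEmail(service.alert_to, service.name + ' is down');
});

// Log when the service recovers.
watchmen.on('service_back', (service) => {
  console.log(service.name + ' is back');
});

// Simulate an outage followed by a recovery to exercise the handlers.
const service = { name: 'home', alert_to: 'email@example.com' };
watchmen.emit('service_error', service, { error: 'timeout' });
watchmen.emit('service_back', service);
```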
Run the monitor server
$ node server.js
or, more likely, use forever to run it in the background:
$ forever start server.js
Run the web app
$ forever start webserver/app.js 3000 #(where 3000 is the port you want to use).
Run the tests with mocha:
$ npm test
1.0.alpha1 Major changes and improvements
- Storages are now pluggable. redis storage is used by default but you can create your own: mongodb, text file, etc (see lib/storage).
- Ping services are also pluggable now. So far you can use http and smtp (smtp is just checking the tcp connection right now). You can create your own or improve the existing ones easily.
- Watchmen daemon now inherits from events.EventEmitter, so you can instantiate it and subscribe to the events of your choice (service_error, service_back, etc) to implement your custom logic (see server.js).
- Knockout.js has been removed. Watchmen now uses handlebars instead. Faster, simpler code, and avoids some client side memory leaks.
- Client side is using moment.js for rendering dates.
- Express.js routes are now handled in /routes
- Mocha is used for unit testing. Mocked storages and ping services are used.
- Configuration is now split into separate files, under the /config directory
- Better reporting web interface. Uptime statistics. Outages count, warnings count.
- Major refactor, improved performance.
- Added tests and mocked objects for testing.
- Separate files for request, utils and watchmen library.
- Removed logging to file.
- Fixed a bug when storing events: the port had to be added to the redis key to make it unique.
- Added a callback when sending email to register problems in delivery.
- Targets node 0.6.x
- Added knockoutjs for view model binding.
- Auto async refresh main page.
- Filter by name in main page.
- Added counter (hosts up and down).
- UI Improvements.
- Tablesorter sorts status and time tags.
- Added Google Analytics.
- Added current status info (site is up or down) to database.
- Added icons to display status (disable, error or ok).
- TableSorter jQuery plugin orders by status by default.
- Added expiration time to event records.
- Stores avg response time for each url.
- Warns if response time > limit.
- Multiple recipients in notifications.
- Removed "retry_in" option. Watchmen works in a smarter way now.
- REDIS backend.
- Web UI to display reports (express.js app using REDIS backend).
- Be able to disable entries in config file at url level
- When site is back, displays and logs information about how long the site has been down.
- Logs "site down" and "site back up" messages to a file (logs in a different file per host)
- Fix bug when reading url_conf.attempts on site back.
- Allow POST method (for testing forms).
- Added Marak/colors.js to output success and error messages.
- Displays request duration time.
- First release.
- Iván Loire (@ivanloire)
- Odenius (https://github.com/Odenius)
- Nibbler999 (https://github.com/Nibbler999)
- Event pagination in service details
- Twitter integration (pipe events to a twitter account)
- Security (authentication for accessing the web UI and/or editing stuff)
- Google charts
- Change configuration from control panel
- Reset stats from control panel
- Regular expressions support
Copyright (c) 2012 Iván Loire
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.