manticore

Mythical multi-process worker pool

Fork node.js multi-core workers and crunch legendary workloads.

The core concept: you have a function that does some heavy work and you want to run it many times, making full use of your multi-core CPU and without the overhead of re-spawning piles of single-use sub-processes.

:warning: Early release :sunglasses:

The worker modules I found on npm all have their problems: they either lack functionality, use external dependencies or make all kinds of weird assumptions that get in the way.

Instead of trying to wrangle my app to fit those unsatisfactory modules, I built Manticore to be simple and effective, with the features you need to get big things done at hyperspeed without jumping through crazy hoops.

You put your code in functions that each accept a single parameter and collect them in a worker module. In this module you register the functions to expose them as tasks.

In your main app you set up a pool for that module and execute the tasks via the pool with your data parameter. Manticore will spawn (and despawn) workers as needed, distribute the work and return the result as a Promise.

You can use a function that returns a value synchronously, or go asynchronous and either use the node.js-style callback or return a Promise.

By default each worker works on only one job at a time, but there is an option to let workers process multiple jobs simultaneously, which gives an extra boost for IO-bound tasks (assuming, of course, that you use async IO).
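To see why more than one job per worker helps IO-bound tasks, here is a minimal sketch that does not use Manticore at all. The `fakeIoJob` function and its `setTimeout` delay are hypothetical stand-ins for real async IO; the only point is that a single process can have several such jobs in flight at once.

```javascript
// Hypothetical stand-in for an async IO-bound job (not part of Manticore).
var inFlight = 0;
var maxInFlight = 0;

function fakeIoJob(ms) {
    inFlight++;
    maxInFlight = Math.max(maxInFlight, inFlight);
    return new Promise(function(resolve) {
        // the process is idle while it waits, so other jobs can start
        setTimeout(function() {
            inFlight--;
            resolve(ms);
        }, ms);
    });
}

// Start three jobs without waiting for each other: the waits overlap.
var overlapped = Promise.all([
    fakeIoJob(20),
    fakeIoJob(20),
    fakeIoJob(20)
]).then(function(results) {
    return { results: results, maxInFlight: maxInFlight };
});

overlapped.then(function(summary) {
    console.log('jobs in flight at peak:', summary.maxInFlight);
});
```

A CPU-bound function would not benefit in the same way, since it keeps the process busy until it returns.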

The return value of the pool is always an ES6-style Promise, so you can easily use fancy logic like Promise.all() or Promise.race().
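As one example of such logic, Promise.race() can put a time limit on a job. This is a sketch only: `slowJob` is a hypothetical stand-in for a Promise coming back from the pool, and `withTimeout` is a helper written for this example, not something Manticore provides.

```javascript
// Hypothetical stand-in for a Promise returned by the pool.
function slowJob(ms, value) {
    return new Promise(function(resolve) {
        setTimeout(function() { resolve(value); }, ms);
    });
}

// Reject if the given Promise does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
    var timer;
    var timeout = new Promise(function(resolve, reject) {
        timer = setTimeout(function() {
            reject(new Error('timed out after ' + ms + 'ms'));
        }, ms);
    });
    return Promise.race([promise, timeout]).then(function(value) {
        clearTimeout(timer);
        return value;
    }, function(err) {
        clearTimeout(timer);
        throw err;
    });
}

withTimeout(slowJob(10, 'done'), 200).then(function(value) {
    console.log('result:', value);
});

withTimeout(slowJob(200, 'late'), 10).catch(function(err) {
    console.log('failed:', err.message);
});
```

Note that the race only stops you waiting; the underlying job keeps running until it finishes.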

For some next-level setups you can leverage Promise-glue helpers from modules like Q or Bluebird, or get creative and pass the Promises into more exotic modules like React, Baconjs, Lazy.js, Highland and all the other cool utility modules with Promise support.

Keep in mind that the parameter object and return value are serialised, so you cannot pass functions or prototype-based objects, only simple JSON-like data.
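A quick way to check what survives is to round-trip a value through JSON yourself. This sketch assumes the transfer behaves like `JSON.stringify` followed by `JSON.parse`; the `roundTrip` helper and the `Point` type are made up for illustration.

```javascript
// Round-trip a value through JSON, as a rough model of what happens to
// parameters and results when they cross the process boundary.
function roundTrip(value) {
    return JSON.parse(JSON.stringify(value));
}

// Hypothetical prototype-based type, only here to show what gets lost.
function Point(x, y) {
    this.x = x;
    this.y = y;
}
Point.prototype.length = function() {
    return Math.sqrt(this.x * this.x + this.y * this.y);
};

var input = {
    name: 'job-1',
    sizes: [1, 2, 3],
    point: new Point(3, 4),
    onDone: function() {},
    when: new Date(0)
};

var output = roundTrip(input);

console.log(output.name, output.sizes);      // plain data survives
console.log(typeof output.onDone);           // 'undefined': functions are dropped
console.log(output.point instanceof Point);  // false: the prototype is lost
console.log(typeof output.when);             // 'string': Dates become ISO strings
```

If a task needs a richer object, pass the plain fields and rebuild the object inside the worker function.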

  • Returns an ES6 Promise.
  • Transfers data between processes using pipes (i.e. non-blocking).
  • Data gets serialised so only primitive JSON-like data can be transferred.
  • Make sure you configure concurrent/paralel to suit your app for best performance.
  • Swap JSON serialisation for something that supports Buffers.
  • Separate settings per function.

Put the worker methods in their own module where they are registered to Manticore:

var mc = require('manticore');
 
// directly add named function 
function myFunc1(params) {
    return heavyStuff(params);
}
mc.registerTask(myFunc1);
 
// add anonymous function 
mc.registerTask('myFunc2', function(params) {
    return heavyStuff(params);
});

There are different ways to return values:

// does it run synchronously? 
function myFunc1(params) {
    return heavyStuff(params);
}
 
// maybe use the node-style callback? 
function myFunc2(params, callback) {
    heavyStuff(params, function(err, result) {
        callback(err, result);
    });
}
 
// or return a Promise? 
function myFunc3(params) {
    return heavyStuff(params).then(function(res) {
        return someMoreWork(res);
    });
}
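For intuition, here is one way the three styles could be funnelled into a single Promise. This is a sketch, not Manticore's actual implementation: the `callTask` helper is hypothetical, and it assumes a callback-style task is one that declares a second parameter.

```javascript
// Hypothetical helper: run a task and always get a Promise back.
// Assumption: a task declaring two parameters expects a node-style callback.
function callTask(task, params) {
    return new Promise(function(resolve, reject) {
        if (task.length >= 2) {
            task(params, function(err, result) {
                if (err) {
                    reject(err);
                } else {
                    resolve(result);
                }
            });
        } else {
            // resolve() adopts a returned Promise and passes plain values through;
            // a synchronous throw inside this executor becomes a rejection
            resolve(task(params));
        }
    });
}

function syncTask(n) { return n * 2; }
function callbackTask(n, callback) {
    setTimeout(function() { callback(null, n * 2); }, 1);
}
function promiseTask(n) { return Promise.resolve(n * 2); }

Promise.all([
    callTask(syncTask, 21),
    callTask(callbackTask, 21),
    callTask(promiseTask, 21)
]).then(function(results) {
    console.log(results); // [ 42, 42, 42 ]
});
```

Whichever style you pick, the caller on the pool side sees the same thing: a Promise that resolves with the result or rejects with the error.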

Register in bulk:

// add named functions as array 
mc.registerTasks([
    myFunc1,
    myFunc2,
    myFunc3
]);
 
// register the methods as an object to redefine the name 
// - protip: use the module.exports object 
mc.registerTasks({
    myFuncA: myFunc1,
    myFuncB: myFunc2,
    myFuncC: myFunc3
});

Create a pool in the main app:

var mc = require('manticore');
 
var pool = mc.createPool({
    worker: require.resolve('./worker'),
    concurrent: 4
});

Then run the methods by name, pass a parameter value and get a Promise:

pool.run('myFunc1', myParams).then(function(res) {
    // got results 
}, function(err) {
    // oops 
});

For convenience get a curried function:

var func1 = pool.curried('myFunc1');
 
func1(params).then(function(res) {
    // got results 
});

Pro-tip: for serious bulk processing use Promise.all() (in Bluebird this is fun with Promise.map() etc).

Promise.all(myArray.map(pool.curried('myFunc1'))).then(function(results) {
    // got all the results 
});

That's it! :+1:

var pool = mc.createPool({
    // path to the worker module. pro-tip: use require.resolve()
    worker: string;
    
    // maximum amount of worker processes
    // - default: require('os').cpus().length
    // tip: when running on many cores leave 1 core free for the main process: require('os').cpus().length - 1
    concurrent?: number;
    // maximum amount of jobs to pass to each worker
    // set this to a higher value if your jobs are async and IO-bound
    // - default: 1
    paralel?: number;
    // maximum retries if a worker fails
    attempts?: number;
 
    // worker idle timeout in milliseconds, shuts down workers that are idling
    idleTimeout?: number;
    
    // emit 'status' events, handy for debugging
    emit?: boolean;
    // console.log status events for debugging
    log?: boolean;
});
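
As a concrete illustration of the tip about leaving one core free, the options object could be assembled like this. Only `worker`, `concurrent`, `paralel` and `idleTimeout` from the listing above are used; the `./worker` path and the chosen values are placeholders, and the object is only built here, not passed to Manticore.

```javascript
var os = require('os');

// Leave one core for the main process, but never drop below one worker.
var cores = os.cpus().length;
var workers = Math.max(1, cores - 1);

var options = {
    worker: './worker',   // placeholder path; use require.resolve() in a real app
    concurrent: workers,
    paralel: 1,           // option name spelled as in the listing above
    idleTimeout: 5000     // placeholder value, in milliseconds
};

console.log(options);
```

The `Math.max` guard matters on single-core machines, where subtracting one would otherwise leave no workers at all.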

Manticore is written in TypeScript and compiled with Grunt.

For TypeScript users there is a .d.ts file both in the repo and bundled in the npm package (also exported in package.json).

Install development dependencies in your git checkout:

$ npm install

Build and run tests using grunt:

$ grunt test

See the Gruntfile.js for additional commands.

Contributions are welcome, but please discuss in the issues before you commit to large changes. If you send a PR, make sure your code is idiomatic and linted.

  • 0.2.0 - Transfer data over non-blocking pipes, renamed modulePath option to worker.
  • 0.1.0 - First release.

Copyright (c) 2014 Bart van der Schoor @ Bartvds

Licensed under the MIT license.