dynapack

A module bundler/loader for static and dynamic dependencies.

Dynapack is a javascript module bundler and client-side loader that solves the following problem. Given a dependency graph of static and dynamic dependencies, construct a set of module bundles such that

  • the number of bundles is minimized,
  • each module exists in only one bundle,
  • a client request for a dynamic dependency, D, returns only the static dependencies of D (recursively),
  • a module (bundle) is sent to a client only once (per session), and
  • bundles connected by static dependencies are sent in parallel.

Here is a complete working example of dynapack.

This project and document are a work in progress. Consider it alpha-stage. Please contribute!

I couldn't find a bundler/loader that satisfies all the requirements listed above! Specifically, other bundlers ignore what I call the dynamic dependency diamond. What the heck is that? With dotted lines as dynamic dependencies, solid lines as static dependencies, and arrows pointing from dependent module to dependency, consider the dependency graph in which module a depends dynamically on b and c, each of which depends statically on d:

This situation should lead to 4 bundles, one for each module. If a client possesses module a and requests b, it should receive b and d. However, if it instead requests c, it should receive c and d. Moreover, if a client requests b then c, the request for c should return only c.
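The grouping described above can be sketched in a few lines. This is a minimal illustration, not dynapack's actual algorithm: the graph literal, `staticClosure`, and `groupIntoBundles` are hypothetical names, and the edges are an assumption read off the prose (a depends dynamically on b and c; b and c depend statically on d). Modules are grouped by the set of "roots" (entry points and dynamic dependency targets) that reach them through static edges only.

```javascript
// Hypothetical diamond graph; the edges are assumptions taken from the prose.
const graph = {
  a: { static: [], dynamic: ['b', 'c'] },
  b: { static: ['d'], dynamic: [] },
  c: { static: ['d'], dynamic: [] },
  d: { static: [], dynamic: [] }
};

// Collect a module plus everything reachable through static edges only.
function staticClosure(graph, start) {
  const seen = new Set();
  const stack = [start];
  while (stack.length > 0) {
    const id = stack.pop();
    if (seen.has(id)) continue;
    seen.add(id);
    graph[id].static.forEach((dep) => stack.push(dep));
  }
  return seen;
}

// Group modules by the set of roots whose static closure contains them.
// Modules with identical root sets are always needed together, so they can
// share one bundle.
function groupIntoBundles(graph, entries) {
  const roots = new Set(entries);
  Object.keys(graph).forEach((id) => {
    graph[id].dynamic.forEach((dep) => roots.add(dep));
  });

  const rootsOf = {}; // module id -> sorted list of roots that need it
  Array.from(roots).sort().forEach((root) => {
    staticClosure(graph, root).forEach((id) => {
      (rootsOf[id] = rootsOf[id] || []).push(root);
    });
  });

  const bundles = {}; // root-set key -> module ids
  Object.keys(rootsOf).sort().forEach((id) => {
    const key = rootsOf[id].join('+');
    (bundles[key] = bundles[key] || []).push(id);
  });
  return bundles;
}

const bundles = groupIntoBundles(graph, ['a']);
console.log(bundles);
```

For the diamond this yields four groups, one per module, with d keyed by both b and c because either request needs it.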

This example is simplified for explanatory purposes. The RequireJS loader, in fact, does this, but on the module level and not in parallel for static dependencies (AFAIK). Dynapack handles the dynamic dependency diamond in the general case on the bundle level.

Audience

Dynapack was built with Node.js and isomorphism in mind; modules written in dynapack syntax will just run under Node.js. The following technique (introduced by example) was devised to fake asynchronous module loading in Node.js without breaking things; code like this will appear in modules written for dynapack:

var ensure = require('node-ensure');
 
// This is a dynamic dependency declaration, which tells dynapack to load this 
// module on demand in the browser. More on this later. 
var __m = './big-module' /*js*/;
 
// This resembles the CommonJS Modules/Async/A proposal. 
ensure([__m], function(err) {
  // Now we can require the module. 
  var m = require(__m);
});

See https://github.com/bauerca/node-ensure for documentation on node-ensure.

Installation

> npm install dynapack

There is also a command-line interface; installing it globally (npm install -g dynapack) will get you the dynapack command.

Usage

The current version of dynapack supports only Node.js-style syntax for static dependencies and only dynamic dependency declarations with node-ensure for dynamic dependencies. The following is a brief overview.

A static dependency is a dependency that exists at all times. It implies synchronous loading, and it suggests that the dependency should be delivered to a client along with the dependent module.

The CommonJS style is supported, in which a static dependency is a simple require statement.

var m = require('module');

If your app uses only this type of dependency, dynapack will function exactly like Browserify, and output a single bundle for your app. In this case, just use Browserify.

Dynamic dependencies are dependencies that should be loaded/executed only on demand. This type of dependency is relevant in the browser, where the downloading and execution of modules takes up precious user time.

Dynapack supports a new syntax called a dynamic dependency declaration (d3):

var __bm = './big-module' /*js*/;

where the important part of the above is the <string> /*js*/ combination. A d3 informs dynapack that the decorated string literal is in fact a path to a module whose source (plus the sources of all of its static dependencies) should be downloaded on demand at the discretion of the app (using node-ensure).

The advantage of a d3 is that it is recognized anywhere in module code (whereas other bundlers recognize dependency strings only as arguments to require calls or similar, which binds the dependency name to the function call).
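To make the `<string> /*js*/` pattern concrete, here is a rough sketch of how such declarations could be located in source text. It assumes a simple regular expression is good enough for illustration; `findDynamicDeclarations` is a hypothetical helper, not part of dynapack, and a real bundler would more likely work from a parsed syntax tree.

```javascript
// Illustrative only: a regex is easily fooled (for example by text inside
// comments), so treat this as a sketch of the idea rather than a parser.
function findDynamicDeclarations(source, label) {
  label = label || 'js'; // assumed to be a plain word with no regex characters
  // A quoted string literal, optional whitespace, then the /*label*/ comment.
  const pattern = new RegExp(
    '([\'"])((?:\\\\.|(?!\\1).)*)\\1\\s*/\\*' + label + '\\*/',
    'g'
  );
  const found = [];
  let match;
  while ((match = pattern.exec(source)) !== null) {
    found.push(match[2]); // the text between the quotes
  }
  return found;
}

const source = [
  "var ensure = require('node-ensure');",
  "var __bm = './big-module' /*js*/;",
  "var __superagent = 'superagent' /*js*/;"
].join('\n');

const declared = findDynamicDeclarations(source);
console.log(declared);
```

Note that the string passed to `require` is not reported, since it is not followed by the comment.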

(I like to use the double underscore for a d3 variable because of its similarity to __filename and __dirname in Node.js, but you can use whatever you like.)

Dynapack relies on node-ensure, a dead simple library that provides an asynchronous dependency loading protocol similar to the CommonJS Modules/Async/A spec proposal.

It differs from the spec, though, so that things just work in Node.js.

var ensure = require('node-ensure');
var __superagent = 'superagent' /*js*/;
 
ensure([__superagent], function(err) {
  var request = require(__superagent);
 
  // ... 
});
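The idea of faking asynchronous loading under Node.js can be illustrated with a tiny stand-in that has the same call shape as the `ensure` calls above. This is not node-ensure's source; `fakeEnsure` is a hypothetical name, and deferring with `setImmediate` is an assumption made for the sketch.

```javascript
// Hypothetical stand-in with the same call shape as ensure(ids, callback).
function fakeEnsure(moduleIds, callback) {
  // Under Node.js the modules are already on disk, so there is nothing to
  // fetch; defer the callback so callers still see asynchronous behavior.
  setImmediate(function () {
    callback(null);
  });
}

let loaded = false;
fakeEnsure(['./big-module'], function (err) {
  if (err) throw err;
  loaded = true; // a real module would call require() here
});
```

Because the callback is deferred, `loaded` is still false immediately after the call, which mirrors how the code behaves in the browser.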

Configuration options and defaults are:

var packer = new Dynapack({
  entries: undefined, // required! 
  output: './bundles',
  prefix: '/',
  bundle: true,
  dynamicLabels: 'js',
  builtins: require('browserify/lib/builtins'),
  globalTransforms: []
});

Use the instance like this:

packer.run(function(err, bundles) {
  // Inspect the bundle contents. 
  console.log(bundles);
  
  packer.write(function(err, entries) {
    // Use the entry-point/bundles maps to serve web-pages and 
    // bundles. 
    console.log(entries);
  });
});

The following options are passed to the Dynapack constructor.

The entries option is the only required option and must identify the (client-side) entry point(s) of the app to bundle. The different allowed formats affect only the output of the write() method.

A string is given when there is only one app entry point (e.g. "single-page" apps); an array is given when there are many. Alternatively, a mapping between custom ids and entry point paths will replace the auto-generated ids used by default by write().

For example, when entries is './client.js', the write method would return something like:

{
  "./client.js": [
    "./bundles/entry.0.js",
    "./bundles/a.js"
  ]
}

whereas an entries like {app: './client.js'} would produce:

{
  "app": [
    "./bundles/entry.0.js",
    "./bundles/a.js"
  ]
}
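The three accepted formats can be thought of as reducing to one id-to-path mapping. The sketch below shows that reduction; `normalizeEntries` is a hypothetical helper, and using the path itself as the default id is an assumption based on the single-string example above.

```javascript
// Hypothetical helper: turn any accepted `entries` format into an
// id -> path mapping.
function normalizeEntries(entries) {
  if (typeof entries === 'string') {
    entries = [entries]; // a single entry point
  }
  if (Array.isArray(entries)) {
    const byId = {};
    entries.forEach(function (entryPath) {
      byId[entryPath] = entryPath; // assumed default id: the path itself
    });
    return byId;
  }
  return Object.assign({}, entries); // already a custom id -> path mapping
}

console.log(normalizeEntries('./client.js'));
console.log(normalizeEntries({ app: './client.js' }));
```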

The output option defaults to "./bundles". This is where bundle files are saved.

The prefix option is the prefix under which javascript files will be served. This does not have to be the same host that serves the app; it could be a CDN somewhere, in which case, include the hostname in the prefix.

Should end with a slash. The default is '/'.

The bundle option defaults to true.

Set to false when developing and debugging. If false, Dynapack will treat each module as a "bundle" but otherwise act the same (as far as async loading in the browser is concerned). Browser loading of js will be much slower, however, since the number of "bundles" will skyrocket. Worth it.

The dynamicLabels option defaults to 'js'. This option permits changing the syntax of the comment part of a d3.

The builtins option defaults to require('browserify/lib/builtins'). See Browserify docs.

The globalTransforms option defaults to an empty array. See Browserify docs.

The following methods are available on a Dynapack instance.

run(callback): The callback will receive either an error as first argument or the (fairly low-level) results of the bundling process as second.

write(callback): This should be called after a successful run(). The callback will receive either an error as first argument or, as second argument, information regarding the bundles that should be embedded in the webpages (HTML) that deliver each app entry point.
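As a rough idea of how that information might be used when rendering a page, the sketch below turns a list of bundle files into script tags. It assumes the result has the shape shown in the entries examples above and that each file is served under the prefix by its base name; `scriptTags` is a hypothetical helper, not part of dynapack.

```javascript
const path = require('path');

// Hypothetical helper: build <script> tags for one entry point, assuming each
// file is served under `prefix` by its base name.
function scriptTags(bundleFiles, prefix) {
  return bundleFiles.map(function (file) {
    return '<script src="' + prefix + path.posix.basename(file) + '"></script>';
  }).join('\n');
}

// Shape borrowed from the write() examples above.
const entryInfo = { app: ['./bundles/entry.0.js', './bundles/a.js'] };
const tags = scriptTags(entryInfo.app, '/scripts/');
console.log(tags);
```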

All command-line options are optional.

> dynapack \
    (-o|--output) <output-directory> \
    (-p|--prefix) <string> \
    (-d|--debug) \
    <entry-point> <entry-point> ...

For example

dynapack -d -o ./test-bundles -p /scripts/ app.js

where app.js is the sole entry point to the client-side version of your app. The bundles will be installed in the directory specified by the (-o|--output) option (defaults to ./bundles).

The command prints lots of stuff, including the groups of bundles that should be included in the webpage served for each entry point.

What follows is some general discussion on javascript-heavy web apps and module bundling to prepare for understanding the purpose of dynapack.

A modern javascript-heavy web app consists of a bunch of javascript files (called modules) all tied together by some kind of module-loading scheme like CommonJS (Node.js, Browserify) or AMD (RequireJS). Such an app can be modeled as a dependency tree, which we'll represent as this big triangle:

All the modules needed by the app are contained within the triangle. At the root of the tree is the entry point to the app, probably a router of some kind; down at the bottom are core modules that have no dependencies.

Single-page apps often have the client download the entire dependency tree (the whole triangle) before the app is displayed in the browser. RequireJS (for AMD-style modules) and Browserify (for CommonJS-style modules) are the standard methods for packaging up a single bundle like this.

Oftentimes, a module and its dependencies are a pretty big chunk of javascript and are used only in a few pages of an app. In this case, the developer might wish to exclude them from the main app bundle and have the client download them only when needed. Using the triangle, we might illustrate this as such:

Now the isolated triangle is not downloaded with the main bundle, which has decreased in size to the trapezoid thingy.

At this point, we must realize that the app is not accurately represented by a dependency tree; it is, in fact, a directed dependency graph. This is because dependency branches may rejoin; for example, the root module may depend on two custom modules, each of which depends on jQuery. To visualize, the following two paths through the dependency graph could get to the excluded bundle in the above figure:

Great. Now let's look at bundle-splitting and minimizing initial page load times (we should mention that the necessity for snappy initial page loads may not apply to all apps). The basic principle is this: force the client to download only those modules it needs to display the target page.

Suppose that the following green triangle at the root of the graph comprises those modules common to all pages of the app. This group of modules should always be downloaded.

Now suppose that page 1 of the app uses the modules enclosed by both sub-triangles in the following figure

where the entry point to the page-1-specific modules is the root of the red (lower-left) sub-triangle. Page 2 has its own similarly-visualized set of modules.

At this point, it might seem reasonable to make three bundles, one for each triangle. However, we see that after a client visits both pages, they have traversed the following modules

which is the union of the modules needed to display pages 1 and 2, individually. Surely, we shouldn't download the same modules twice!

A fourth bundle is suggested by the intersection of the modules specific to pages 1 and 2 (the small purple triangle). Now, when a client initially visits page 1, it downloads the green (top) triangle, red (left) trapezoid, and small (purple) triangle; and when it initially visits page 2, it downloads the green (top) triangle, blue (right) trapezoid, and small (purple) triangle. If a client has already visited page 1, it needs to download only the blue (right) trapezoid to display page 2.
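The download behavior in this example can be sketched with a small per-session tracker. The bundle names follow the colors used in the discussion; `bundlesForPage` and `createSession` are hypothetical names for illustration and do not reflect how dynapack tracks bundles internally.

```javascript
// Bundles needed by each page, named after the colors in the discussion.
const bundlesForPage = {
  page1: ['green', 'red', 'purple'],
  page2: ['green', 'blue', 'purple']
};

// Hypothetical per-session tracker: remembers which bundles the client
// already holds and returns only the missing ones for each page visit.
function createSession() {
  const have = new Set();
  return function visit(page) {
    const missing = bundlesForPage[page].filter((name) => !have.has(name));
    missing.forEach((name) => have.add(name));
    return missing;
  };
}

const visit = createSession();
const first = visit('page1');  // everything page 1 needs
const second = visit('page2'); // only what page 1 did not already deliver
console.log(first, second);
```

Visiting page 1 first downloads three bundles; the later visit to page 2 downloads only the blue one.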

But that's one more download! What about latency? Sure, but these requests for bundles can be made in parallel; the requests for all the bundles needed to display a page share the same latency (depending on the number of concurrent requests allowed by the browser).

The purple (small) triangle is sometimes called the "commons" bundle and, in RequireJS or webpack, is formed by hand by the developer via a config file. Dynapack strives to assemble these common bundles (and manage the parallel downloading of them on the client) for you, because in reality, your dependency graph has many "common" bundles (small purple triangles).

License

MIT