    weblas

    0.9.1 • Public • Published

    GPU-accelerated JavaScript. Numerical computing in your browser with performance comparable to native.

    Currently includes hundreds of unit tests, which verify correctness on hundreds of millions of data points.

    Operations

    Our focus is on numerical operations useful for neural networks and machine learning. So far, we've got 32-bit versions of each of these:

    • sscal - Matrix (and Vector) Scale (with addition)
    • sgemm - Matrix Multiply
    • sdwns - Matrix (and Image) Downsample (for Max Pooling)
    • sclmp - Matrix clamp (for ReLU)
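For readers unfamiliar with these operations, here is a minimal CPU sketch of the math that the three element-wise and pooling operations are named after. The function names, parameters, and row-major layout are assumptions made for illustration; they are not the weblas signatures, and the GPU versions may differ in details such as edge handling.

```javascript
// CPU sketches of the math behind the operations listed above.
// Names and signatures are hypothetical, NOT the weblas API.

// Scale with addition: y[i] = a * x[i] + b
function scaleAdd(a, b, x) {
  var y = new Float32Array(x.length);
  for (var i = 0; i < x.length; i++) {
    y[i] = a * x[i] + b;
  }
  return y;
}

// Clamp: limit each element to the range [lo, hi].
// ReLU is the special case lo = 0, hi = Infinity.
function clamp(lo, hi, x) {
  var y = new Float32Array(x.length);
  for (var i = 0; i < x.length; i++) {
    y[i] = Math.min(hi, Math.max(lo, x[i]));
  }
  return y;
}

// Max-pooling downsample of a row-major M x N matrix, using a
// size x size window moved by `stride` in both directions.
// Assumes the windows tile the matrix without padding.
function maxPool(M, N, size, stride, x) {
  var outM = Math.floor((M - size) / stride) + 1;
  var outN = Math.floor((N - size) / stride) + 1;
  var y = new Float32Array(outM * outN);
  for (var i = 0; i < outM; i++) {
    for (var j = 0; j < outN; j++) {
      var best = -Infinity;
      for (var di = 0; di < size; di++) {
        for (var dj = 0; dj < size; dj++) {
          var v = x[(i * stride + di) * N + (j * stride + dj)];
          if (v > best) best = v;
        }
      }
      y[i * outN + j] = best;
    }
  }
  return y;
}

// ReLU on a small vector, and 2x2 pooling on a 4x4 matrix
var relu = clamp(0, Infinity, new Float32Array([-2, 0.5, 3]));
var pooled = maxPool(4, 4, 2, 2, new Float32Array([
   1,  2,  3,  4,
   5,  6,  7,  8,
   9, 10, 11, 12,
  13, 14, 15, 16
]));
```

A reference like this is mainly useful for checking GPU results on small inputs; it is far too slow for the matrix sizes weblas targets.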

    Don't see what you need? Give a 👍 to an existing issue or create a new one!

    Usage

    First, include the weblas.js file (from a release or the dist directory).

    <script type="text/javascript" src="weblas.js"></script>

    Then use it like this.

    <script>
     
     
    var h1 = 1024, w1 = 1024,
        h2 = 1024, w2 = 1024;
     
    var A = new Float32Array(h1 * w1);
    var B = new Float32Array(h2 * w2);
     
    // fill A and B with science
     
    var M = h1,
        N = w2,
        K = h2; // must match w1
     
    var alpha = 1.0;
    var beta = 0.0;
    var C = new Float32Array(w2);     // specialized for neural net bias calculation
     
    // result will contain matrix multiply of A x B (times alpha)
    var result = weblas.sgemm(M, N, K, alpha, A, B, beta, C);
     
    </script> 
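To sanity-check a result on small inputs, a plain CPU reference can help. The sketch below is not part of weblas: `sgemmReference` and `maxAbsDiff` are hypothetical helpers, and the code assumes row-major storage and that C holds one value per output column (a bias row added to every row of the product), which is how the usage example above appears to treat it.

```javascript
// Hypothetical CPU reference, NOT part of weblas.
// Assumes row-major storage and that C has N entries, one per output
// column, added (scaled by beta) to every row of alpha * (A x B).
function sgemmReference(M, N, K, alpha, A, B, beta, C) {
  var out = new Float32Array(M * N);
  for (var i = 0; i < M; i++) {
    for (var j = 0; j < N; j++) {
      var sum = 0;
      for (var k = 0; k < K; k++) {
        sum += A[i * K + k] * B[k * N + j];
      }
      out[i * N + j] = alpha * sum + beta * C[j];
    }
  }
  return out;
}

// Largest absolute difference between two equally sized arrays,
// useful because GPU and CPU float results rarely match bit for bit.
function maxAbsDiff(x, y) {
  var worst = 0;
  for (var i = 0; i < x.length; i++) {
    worst = Math.max(worst, Math.abs(x[i] - y[i]));
  }
  return worst;
}

var smallA = new Float32Array([1, 2, 3,
                               4, 5, 6]);   // 2 x 3
var smallB = new Float32Array([ 7,  8,
                                9, 10,
                               11, 12]);    // 3 x 2
var bias = new Float32Array([1, -1]);       // one value per column

var expected = sgemmReference(2, 2, 3, 1.0, smallA, smallB, 0.5, bias);
```

In a browser with weblas loaded, the same arguments could be passed to `weblas.sgemm` and the two results compared with `maxAbsDiff` against a small tolerance.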

    Pipeline Mode

    Pipeline mode gives (sometimes very large) increases in performance by leaving data in GPU memory. A demo illustrating performance on a deep neural net can be found here.

    Here's a basic example:

    // create Tensor containers for interacting directly with GPU memory
    var t0 = new weblas.pipeline.Tensor([M, K], data0);
    // second matrix must be transposed
    var t1 = new weblas.pipeline.Tensor([N, K], weblas.util.transpose(K, N, data1));
    var t2 = new weblas.pipeline.Tensor([1, N], data2);
    var alpha = 1.0;
    var beta = 0.5;
     
    /* NOTE: pipeline.sgemm takes a transposed matrix in the
      second slot (t1 here)
      (this requirement allows for improved performance)
     */
    var t3 = weblas.pipeline.sgemm(alpha, t0, t1, beta, t2);
     
    // result is a Float32Array
    var result = t3.transfer();

    More information can be found on the wiki Pipeline page.
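The transposed layout is easy to get wrong, so here is a small CPU sketch of what it means. Both functions are hypothetical helpers written for illustration, not weblas calls; the code assumes row-major storage and a bias row of shape [1, N] like t2 above. One plausible reading of the performance note is that with the second matrix stored as N x K, every output element becomes a dot product of two contiguous rows.

```javascript
// Hypothetical helper, NOT weblas.util.transpose: transpose a
// row-major rows x cols matrix into a row-major cols x rows matrix.
function transposeRowMajor(rows, cols, data) {
  var out = new Float32Array(rows * cols);
  for (var r = 0; r < rows; r++) {
    for (var c = 0; c < cols; c++) {
      out[c * rows + r] = data[r * cols + c];
    }
  }
  return out;
}

// Hypothetical CPU reference for a multiply whose second operand is
// already transposed (Bt is N x K). Row i of A and row j of Bt are
// both contiguous, so the inner loop walks memory sequentially.
function sgemmTransposedReference(M, N, K, alpha, A, Bt, beta, bias) {
  var out = new Float32Array(M * N);
  for (var i = 0; i < M; i++) {
    for (var j = 0; j < N; j++) {
      var sum = 0;
      for (var k = 0; k < K; k++) {
        sum += A[i * K + k] * Bt[j * K + k];
      }
      out[i * N + j] = alpha * sum + beta * bias[j];
    }
  }
  return out;
}

var a = new Float32Array([1, 2, 3,
                          4, 5, 6]);        // M x K = 2 x 3
var b = new Float32Array([ 7,  8,
                           9, 10,
                          11, 12]);         // K x N = 3 x 2
var bt = transposeRowMajor(3, 2, b);        // N x K = 2 x 3
var product = sgemmTransposedReference(2, 2, 3, 1.0, a, bt, 0.5,
                                       new Float32Array([1, -1]));
```

Note the argument order in the sketch: the transpose helper takes the dimensions of the input matrix (K, N), matching the order used in the `weblas.util.transpose(K, N, data1)` call above.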

    Testing

    Unit tests and benchmarks both require browserify and testling.

    Install with:

    npm install -g browserify
    npm install -g testling
    

    Unit Tests

    All operations have unit test coverage. Unit tests use data generated outside the browser (to verify correctness). Generating the data requires python and the modules in requirements.txt.

    With pip installed, run:

    pip install -r requirements.txt
    

    Then, to generate the data, run:

    npm run data
    

    Then, run the unit tests with:

    npm test
    

    OS Setup

    If the tests won't run, try the command for your OS below. It restores the default npm browser setting.

    OSX

    npm config set browser open
    

    Linux

    npm config set browser xdg-open
    

    Windows

    npm config set browser start
    

    Benchmarks

    After installing browserify and testling, run the benchmarks with:

    npm run benchmark
    

    results

    weblas@0.6.0

    TAP version 13
    ok 1 128x128 . 128x128
    # 316 ops/sec  ±4.80%  n = 51 µ = 3ms
    ok 2 128x256 . 256x128
    # 280 ops/sec  ±6.15%  n = 40 µ = 4ms
    ok 3 256x256 . 256x256
    # 171 ops/sec  ±14.79%  n = 47 µ = 6ms
    ok 4 512x256 . 256x512
    # 101 ops/sec  ±6.68%  n = 50 µ = 10ms
    ok 5 256x512 . 512x256
    # 139 ops/sec  ±3.64%  n = 49 µ = 7ms
    ok 6 512x512 . 512x512
    # 61.61 ops/sec  ±3.14%  n = 42 µ = 16ms
    ok 7 513x513 . 513x513
    # 52.92 ops/sec  ±8.82%  n = 49 µ = 19ms
    ok 8 1024x512 . 512x1024
    # 34.99 ops/sec  ±4.86%  n = 38 µ = 29ms
    ok 9 512x1024 . 1024x512
    # 52.03 ops/sec  ±2.66%  n = 47 µ = 19ms
    ok 10 1024x1024 . 1024x1024
    # 23.27 ops/sec  ±12.70%  n = 34 µ = 43ms
    ok 11 2048x2048 . 2048x2048
    # 4.89 ops/sec  ±1.82%  n = 17 µ = 204ms
    
    1..11
    # tests 11
    # pass  11
    
    # ok
    

    More information about benchmarks (including test configuration) can be found on the wiki.
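The ops/sec figures can be turned into an approximate throughput for comparison across sizes. The sketch below uses the conventional count of 2 * M * N * K floating point operations for an (M x K) . (K x N) multiply; that convention and the `gflops` helper are assumptions for illustration, not something the benchmark itself reports.

```javascript
// Hypothetical helper: approximate GFLOPS for a matrix multiply
// benchmark line, counting one multiply and one add per inner-loop
// step (2 * M * N * K operations per multiply).
function gflops(M, N, K, opsPerSec) {
  return (2 * M * N * K * opsPerSec) / 1e9;
}

// Two rows taken from the TAP output above
var rate1024 = gflops(1024, 1024, 1024, 23.27); // 1024x1024 . 1024x1024
var rate2048 = gflops(2048, 2048, 2048, 4.89);  // 2048x2048 . 2048x2048

console.log(rate1024.toFixed(2) + " GFLOPS at 1024");
console.log(rate2048.toFixed(2) + " GFLOPS at 2048");
```

Under that counting convention, the 1024 case works out to roughly 50 GFLOPS and the 2048 case to roughly 84 GFLOPS, which suggests the larger multiply makes better use of the GPU on the tested configuration.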

    Donations

    Want to see more happen here? Contribute on

    Patreon

    Install

    npm i weblas

    Version

    0.9.1

    License

    MIT
