    daff: data diff

    This is a library for comparing tables, producing a summary of their differences, and using such a summary as a patch file. It is optimized for comparing tables that share a common origin, in other words multiple versions of the "same" table.

    For a live demo, see:

    http://paulfitz.github.com/daff/

    Install the library for your favorite language:

    npm install daff -g  # node/javascript 
    pip install daff     # python 
    gem install daff     # ruby 
    composer require paulfitz/daff-php  # php 
    install.packages('daff') # R wrapper by Edwin de Jonge 
    bower install daff   # web/javascript 

    Other translations are available here:

    https://github.com/paulfitz/daff/releases

    Or use the library to view CSV diffs on GitHub via a Chrome extension:

    https://github.com/theodi/csvhub

    The diff format used by daff is specified here:

    http://dataprotocols.org/tabular-diff-format/

    This library is a stripped-down version of the coopy toolbox (see http://share.find.coop). To compare tables from different origins, or with automatically generated IDs, or with other complications, check out the coopy toolbox.

    The program

    You can run daff/daff.py/daff.rb as a utility program:

    $ daff
    daff can produce and apply tabular diffs.
    Call as:
      daff [--output OUTPUT.csv] a.csv b.csv
      daff [--output OUTPUT.csv] parent.csv a.csv b.csv
      daff [--output OUTPUT.ndjson] a.ndjson b.ndjson
      daff patch [--inplace] [--output OUTPUT.csv] a.csv patch.csv
      daff merge [--inplace] [--output OUTPUT.csv] parent.csv a.csv b.csv
      daff trim [--output OUTPUT.csv] source.csv
      daff render [--output OUTPUT.html] diff.csv
      daff git
      daff version
    
    The --inplace option to patch and merge will result in modification of a.csv.
    
    If you need more control, here is the full list of flags:
      daff diff [--output OUTPUT.csv] [--context NUM] [--all] [--act ACT] a.csv b.csv
         --context NUM: show NUM rows of context
         --all:         do not prune unchanged rows
         --act ACT:     show only a certain kind of change (update, insert, delete)
    
      daff diff --git path old-file old-hex old-mode new-file new-hex new-mode
         --git:         process arguments provided by git to diff drivers
    
      daff render [--output OUTPUT.html] [--css CSS.css] [--fragment] [--plain] diff.csv
         --css CSS.css: generate a suitable css file to go with the html
         --fragment:    generate just a html fragment rather than a page
         --plain:       do not use fancy utf8 characters to make arrows prettier
    

    Formats supported are CSV, TSV, and ndjson.

    Using with git

    Run daff git csv to install daff as a diff and merge handler for *.csv files in your repository. Run daff git for instructions on doing this manually. Your CSV diffs and merges will get smarter, since git will then understand rows and columns, not just lines.

    The library

    You can use daff as a library from any supported language. The examples here use JavaScript. To use daff on a webpage, first include daff.js:

    <script src="daff.js"></script>

    Or if using node outside the browser:

    var daff = require('daff');

    For concreteness, assume we have two versions of a table, data1 and data2:

    var data1 = [
        ['Country','Capital'],
        ['Ireland','Dublin'],
        ['France','Paris'],
        ['Spain','Barcelona']
    ];
    var data2 = [
        ['Country','Code','Capital'],
        ['Ireland','ie','Dublin'],
        ['France','fr','Paris'],
        ['Spain','es','Madrid'],
        ['Germany','de','Berlin']
    ];

    To make those tables accessible to the library, we wrap them in daff.TableView:

    var table1 = new daff.TableView(data1);
    var table2 = new daff.TableView(data2);
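
    If your data starts out as an array of record objects (for example, parsed from JSON) rather than an array of rows, it needs the same header-row-plus-data-rows layout as data1 and data2 before it can be wrapped. The following is a minimal sketch in plain JavaScript. recordsToRows is a hypothetical helper, not part of daff, and it assumes the first record's keys define the columns:

```javascript
// Hypothetical helper (not part of daff): convert an array of record
// objects into the array-of-arrays layout used by data1 and data2,
// with the column names in the first row.
function recordsToRows(records) {
    if (records.length === 0) return [];
    // Assumption: the first record's keys define the columns and their order.
    var columns = Object.keys(records[0]);
    var rows = [columns];
    records.forEach(function (record) {
        rows.push(columns.map(function (name) {
            // Missing fields become empty strings so every row has the same width.
            return name in record ? record[name] : '';
        }));
    });
    return rows;
}

var records = [
    { Country: 'Ireland', Capital: 'Dublin' },
    { Country: 'France', Capital: 'Paris' }
];
var rows = recordsToRows(records);
console.log(rows);
```

    The resulting rows array has the same shape as data1, so it can be handed to daff.TableView in the same way.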

    We can now compute the alignment between the rows and columns in the two tables:

    var alignment = daff.compareTables(table1,table2).align();

    To produce a diff from the alignment, we first need a table for the output:

    var data_diff = [];
    var table_diff = new daff.TableView(data_diff);

    Using default options for the diff:

    var flags = new daff.CompareFlags();
    var highlighter = new daff.TableDiff(alignment,flags);
    highlighter.hilite(table_diff);

    The diff is now in data_diff in highlighter format. The format is specified here:

    http://share.find.coop/doc/spec_hilite.html

    [ [ '!', '', '+++', '' ],
      [ '@@', 'Country', 'Code', 'Capital' ],
      [ '+', 'Ireland', 'ie', 'Dublin' ],
      [ '+', 'France', 'fr', 'Paris' ],
      [ '->', 'Spain', 'es', 'Barcelona->Madrid' ],
      [ '+++', 'Germany', 'de', 'Berlin' ] ]
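
    Because the diff is itself a plain table, it can be inspected with ordinary JavaScript. The sketch below tallies rows by the tag in the first column and pulls out the changed cells. countByTag and listCellUpdates are hypothetical helpers, not part of daff. They assume, based on the example output above, that the '@@' row holds the column names and that a tag ending in '->' marks a row with modified cells and also acts as the separator between old and new values. Check the specification before relying on this for other inputs:

```javascript
// The diff from the example above, in highlighter format.
var data_diff = [
    ['!', '', '+++', ''],
    ['@@', 'Country', 'Code', 'Capital'],
    ['+', 'Ireland', 'ie', 'Dublin'],
    ['+', 'France', 'fr', 'Paris'],
    ['->', 'Spain', 'es', 'Barcelona->Madrid'],
    ['+++', 'Germany', 'de', 'Berlin']
];

// Hypothetical helper (not part of daff): count rows by action tag.
function countByTag(diff) {
    var counts = {};
    diff.forEach(function (row) {
        counts[row[0]] = (counts[row[0]] || 0) + 1;
    });
    return counts;
}

// Hypothetical helper (not part of daff): list cells whose value changed.
function listCellUpdates(diff) {
    // Assumption: the row tagged '@@' holds the column names.
    var header = diff.filter(function (row) { return row[0] === '@@'; })[0] || [];
    var updates = [];
    diff.forEach(function (row) {
        var tag = row[0];
        // Assumption: a tag ending in '->' marks a row with modified cells,
        // and the tag itself separates the old value from the new one.
        if (typeof tag !== 'string' || tag.slice(-2) !== '->') return;
        for (var i = 1; i < row.length; i++) {
            var cell = String(row[i]);
            var at = cell.indexOf(tag);
            if (at < 0) continue; // this cell did not change
            updates.push({
                column: header[i],
                before: cell.slice(0, at),
                after: cell.slice(at + tag.length)
            });
        }
    });
    return updates;
}

var counts = countByTag(data_diff);
var updates = listCellUpdates(data_diff);
console.log(counts);
console.log(updates);
```

    For the example diff, the only update found is in the Capital column, from Barcelona to Madrid.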

    For visualization, you may want to convert this to an HTML table with appropriate classes on cells so you can color-code inserts, deletes, updates, etc. You can do this with:

    var diff2html = new daff.DiffRender();
    diff2html.render(table_diff);
    var table_diff_html = diff2html.html();

    For 3-way differences (that is, comparing two tables given knowledge of a common ancestor), use daff.compareTables3, passing the ancestor table as the first argument.

    Here is how to apply the diff in table_diff as a patch to table1:

    var patcher = new daff.HighlightPatch(table1,table_diff);
    patcher.apply();
    // table1 should now equal table2

    For other languages, you should find sample code in the packages on the Releases page.

    Supported languages

    The daff library is written in Haxe, which can be translated reasonably well into at least the following languages: JavaScript, Python, Ruby, PHP, Java, C#, and C++.

    Some translations are done for you on the Releases page. To make another translation, or to compile from source, first follow the Haxe getting started tutorial for the language you care about. At the time of writing, if you are on OSX, you should install haxe using brew install haxe --HEAD. Then do one of:

    make js
    make php
    make py
    make java
    make cs
    make cpp
    

    For each language, the daff library expects to be handed an interface to tables you create, rather than creating them itself. This is to avoid inefficient copies from one format to another. You'll find a SimpleTable class you can use if you find this awkward.
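
    To illustrate the idea of handing over an interface rather than a copy, here is a minimal sketch of a table-like wrapper around an existing array of record objects. RecordTable and its method names (getWidth, getHeight, getCell, setCell) are assumptions made up for this sketch and are not daff's actual table interface; consult the API documentation for the methods daff really expects:

```javascript
// Illustrative only: the class and method names are assumptions chosen
// for this sketch, not daff's actual table interface.
function RecordTable(columns, records) {
    this.columns = columns;   // column names, in display order
    this.records = records;   // array of objects, referenced rather than copied
}

// Row 0 is the header, so the height is one more than the number of records.
RecordTable.prototype.getWidth = function () { return this.columns.length; };
RecordTable.prototype.getHeight = function () { return this.records.length + 1; };

// Cells are addressed as (column x, row y), reading straight from the records.
RecordTable.prototype.getCell = function (x, y) {
    if (y === 0) return this.columns[x];
    return this.records[y - 1][this.columns[x]];
};

// Writes go through to the underlying records, so no copy is ever made.
RecordTable.prototype.setCell = function (x, y, value) {
    if (y === 0) throw new Error('header is read-only in this sketch');
    this.records[y - 1][this.columns[x]] = value;
};

var store = [{ Country: 'Spain', Capital: 'Barcelona' }];
var table = new RecordTable(['Country', 'Capital'], store);
table.setCell(1, 1, 'Madrid');
console.log(store[0].Capital); // Madrid
```

    The point of the design is that edits made through the wrapper land directly in the original data structure, which is what avoids converting from one format to another.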

    Other possibilities:

    API documentation

    Sponsors

    The Data Commons Co-op (http://datacommons.coop), "perhaps the geekiest of all cooperative organizations on the planet," has given great moral support during the development of daff. Donate a multiple of 42.42 in your currency to let them know you care: http://datacommons.coop/donate/

    Reading material

    License

    daff is distributed under the MIT License.
