

    0.1.13 • Public • Published


    Duplicating data across multiple collections (denormalization) is common in MongoDB. It makes searching, sorting and field projection efficient.

    Handling duplicate data is a pain: you have to create jobs to sync the data, or update in place every collection that holds the duplicated data.

    mongodb-data-sync solves this problem. With mongodb-data-sync you declare the dependencies in a logical place (for instance, with the schemas), and mongodb-data-sync takes care of syncing the data in almost real time.

    It uses the native MongoDB Change Streams to keep track of changes.
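
    To make the change-stream idea concrete, here is a minimal sketch of how an update event could be matched against declared dependencies in memory. The event shape follows MongoDB's change event format; the dependency records and the affectedDependencies helper are hypothetical illustrations, not the engine's actual code.

```javascript
// Hypothetical in-memory dependency records. The field names mirror the
// options documented for addDependency, but this storage format is assumed.
const dependencies = [
  {
    dependentCollection: 'orders',
    refCollection: 'users',
    localField: 'user_id',
    foreignField: '_id',
    fieldsToSync: { user_name: 'name', user_email: 'email' },
  },
];

// Return the dependencies touched by an "update" change event:
// same referenced collection, and at least one synced field was modified.
function affectedDependencies(changeEvent, deps) {
  if (changeEvent.operationType !== 'update') return [];
  const updated = Object.keys(changeEvent.updateDescription.updatedFields);
  return deps.filter(
    (dep) =>
      dep.refCollection === changeEvent.ns.coll &&
      Object.values(dep.fieldsToSync).some((refField) => updated.includes(refField))
  );
}

// Shape of an update event as emitted by a MongoDB change stream.
const event = {
  operationType: 'update',
  ns: { db: 'mydb', coll: 'users' },
  documentKey: { _id: 1 },
  updateDescription: { updatedFields: { name: 'Dana' }, removedFields: [] },
};

console.log(affectedDependencies(event, dependencies).map((d) => d.dependentCollection));
```

    Events that modify only fields outside fieldsToSync match nothing, so no database work is needed for them.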

    Core Features

    1. It is designed to do all the synchronization with minimal overhead on the database. Most of the checks are done in memory.

    2. It uses the native MongoDB Change Streams to keep track of changes.

    3. It has a plan A and a plan B to recover after a crash.

    4. It gives you an easy way to declare dependencies without having to handle them yourself.

    5. After declaring your dependencies, you can retroactively sync your data.

    6. From version 0.0.25 you can add a MySQL dependency. This is a one-way dependency: the refCollection must be a MongoDB collection.

    7. From version 0.0.29 you can create triggers for update, insert, replace and delete.
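
    The crash recovery mentioned in the list can be pictured with the sketch below. It assumes that plan A means resuming the change stream from a stored resume token, and plan B means re-syncing when the oplog no longer holds that token, optionally narrowed by an update-date field. The function and all of its parameter names are hypothetical; the engine's real logic may differ.

```javascript
// Decide how to recover after a restart. Every name here is hypothetical.
function chooseRecoveryPlan({ resumeToken, tokenStillInOplog, updateField, lastProcessedAt }) {
  // Plan A: the token is still available, so continue the change stream
  // from it (the Node.js driver accepts a `resumeAfter` option in watch()).
  if (resumeToken && tokenStillInOplog) {
    return { plan: 'resume', resumeAfter: resumeToken };
  }
  // Plan B, narrowed: only documents changed since the last processed time.
  if (updateField && lastProcessedAt) {
    return { plan: 'resync', filter: { [updateField]: { $gte: lastProcessedAt } } };
  }
  // Plan B, full: no update field is known, so re-sync everything.
  return { plan: 'resync', filter: {} };
}

console.log(chooseRecoveryPlan({ resumeToken: { _data: 'abc' }, tokenStillInOplog: true }));
console.log(
  chooseRecoveryPlan({
    resumeToken: { _data: 'abc' },
    tokenStillInOplog: false,
    updateField: 'last_update',
    lastProcessedAt: new Date('2020-01-01T00:00:00Z'),
  })
);
```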


    mongodb-data-sync is still experimental

    Pros and cons of having duplicate data in multiple collections

    Pros

    1. No need for joins.
    2. You can index all fields.
    3. Faster and easier searching and sorting.

    Cons

    1. More storage usage.
    2. Hard to maintain: you need to keep track of all the connections (this is what mongodb-data-sync sets out to solve).
    3. More write operations: every update has to update multiple collections.
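
    The trade-off can be illustrated with plain in-memory arrays standing in for collections. This is only a sketch with made-up data: reading an order needs no lookup in users, but renaming a user has to touch every copy of the name.

```javascript
// Plain arrays standing in for two collections (illustrative data only).
const users = [{ _id: 1, name: 'Dana' }];
const orders = [
  { _id: 10, user_id: 1, user_name: 'Dana', total: 30 },
  { _id: 11, user_id: 1, user_name: 'Dana', total: 12 },
];

// Pro: the user's name is already on each order, so no join is needed.
const namesOnOrders = orders.map((order) => order.user_name);

// Con: a rename must update the user and every order that copied the name.
function renameUser(userId, newName) {
  let writes = 0;
  for (const user of users) {
    if (user._id === userId) {
      user.name = newName;
      writes += 1;
    }
  }
  for (const order of orders) {
    if (order.user_id === userId) {
      order.user_name = newName;
      writes += 1;
    }
  }
  return writes;
}

console.log(namesOnOrders);
console.log(renameUser(1, 'Dana K.')); // 3 writes: one user plus two orders
```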


    Requirements

    • MongoDB v4 or higher, running as a replica set
    • Node.js 7.6 or higher


    mongodb-data-sync is built from 2 separate parts.

    1. The engine (there should only be one) - a Node.js server application that you have to run on your machine (you will see how to do it in the next steps). The engine runs all the update and recovery logic. It was designed to work as a single process, and it knows where to continue after a restart or crash. Don't try auto-scaling or running 2 containers for high availability. In short, don't use more than 1 engine.

    2. The SDK - responsible for managing the database dependencies of your application. It connects your app with the engine.


    The instructions address the 2 parts separately: the engine and the SDK.

    The engine


    npm install mongodb-data-sync -g

    Then, on the command line, run:

    mongodb-data-sync --key "some key" --url "mongodb connection url"
      --debug                console log important information
      -p, --port <port>      server port. (default: 6500)
      -d, --dbname <dbname>  the database name for the package. (default: "mongodb_data_sync_db")
      -k, --key <key>        API key to use for authentication of the SDK requests, required
      -u, --url <url>        MongoDB connection url, required
      -h, --help             output usage information

    That's it for running the server. Let's jump to the SDK.

    The SDK

    You can look at the example on GitHub.

    npm install mongodb-data-sync --save


    First, initialize the client. Do it as early as possible in your app.

    const SynchronizerClient = require('mongodb-data-sync');
    // Set up the communication between your app and the engine.
    // Call this method once for each database you want to work on.
        // the name of the database the package should synchronize (required)
        dbName: 'mydb', 
        // the URL of the engine you are running (required)
        engineUrl: 'http://localhost:6500',
        // the authentication key you declared when starting the engine (required)
        apiKey: 'my cat is brown', 

    Returns a Promise.


    const synchronizerClientInstance = SynchronizerClient.getInstance({
        // the name of the database you want to work on
        dbName: 'mydb', 
    });

    Returns an instance tied to your database (it is not a MongoDB db instance) for dependency operations.


    // 'addDependency' lets you declare a dependency between 2 collections
       // the dependent collection is the collection that needs to be updated automatically (required)
       // if the dependent collection is a MySQL table, it should be written like this: mysql.dbname.tablename
       dependentCollection: 'orders',
       // the referenced collection is the collection that gets updated by your application (required)
       refCollection: 'users',
       // the dependent collection field to connect with (required)
       localField: 'user_id',
       // the referenced collection field to connect with; defaults to _id. Using a field other than _id will cause an extra join for each check (optional)
       foreignField: "_id", // default
       // an object describing the fields that need to be updated.
       // the keys are the fields you want to be updated
       // the values are the fields you want to take the value from (required)
       fieldsToSync: {
        // the engine uses a resume token to know where to continue the change stream from.
        // if the engine was down for a long time and the oplog no longer holds this token, the engine will start updating all the dependencies from the beginning.
        // it is recommended to supply an update field (if you have one) so the engine syncs only documents dated after the crash

    Returns a Promise with the id of the dependency.
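
    To show which way fieldsToSync maps, the sketch below turns a dependency plus a changed referenced document into a filter and a $set for the dependent collection. The buildDependentUpdate helper and the sample field names (user_name, name and so on) are hypothetical; this illustrates the mapping and is not the engine's code.

```javascript
// Build the update a dependent collection would need after a referenced
// document changes. Keys of fieldsToSync are fields on the dependent
// document; values are the fields to read from the referenced document.
function buildDependentUpdate(dependency, refDoc) {
  const foreignField = dependency.foreignField || '_id'; // documented default
  const $set = {};
  for (const [dependentField, refField] of Object.entries(dependency.fieldsToSync)) {
    $set[dependentField] = refDoc[refField];
  }
  return {
    collection: dependency.dependentCollection,
    filter: { [dependency.localField]: refDoc[foreignField] },
    update: { $set },
  };
}

// Hypothetical dependency and referenced document.
const dependency = {
  dependentCollection: 'orders',
  refCollection: 'users',
  localField: 'user_id',
  fieldsToSync: { user_name: 'name', user_email: 'email' },
};
const changedUser = { _id: 7, name: 'Dana', email: 'dana@example.com' };

console.log(JSON.stringify(buildDependentUpdate(dependency, changedUser)));
```

    The resulting filter and update have the shape accepted by updateMany in the MongoDB Node.js driver.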


    // deletes a dependency by id

    Returns a Promise.


    // gets the database dependencies

    Returns a Promise with all your database dependencies.


    // syncs all the data in your database according to your dependencies.
    // most of the time this function needs to be called only when you add a new dependency on existing data

    Returns a Promise.


        // the dependent collection to subscribe triggers on (required)
        dependentCollection : "orders",
        // the type of the trigger: insert, update, replace or delete (required)
        // when triggerType is update, defines which fields should trigger the update
        triggerFields : [],
        // when knowledge is set to true, the engine retries firing the event until it gets an OK HTTP status
        knowledge : false, // default
        // the URL the trigger will call

    Returns a Promise with the id of the trigger.
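
    The trigger options can be read as a matching rule against change events. The sketch below is one plausible interpretation: the shouldFire helper is hypothetical, and treating an empty triggerFields list as "any field" is an assumption rather than documented behavior.

```javascript
// Decide whether a trigger should fire for a change event.
// Hypothetical helper, not part of the SDK.
function shouldFire(trigger, changeEvent) {
  if (trigger.dependentCollection !== changeEvent.ns.coll) return false;
  if (trigger.triggerType !== changeEvent.operationType) return false;
  if (trigger.triggerType !== 'update') return true;

  const fields = trigger.triggerFields || [];
  // Assumption: an empty list means every updated field is of interest.
  if (fields.length === 0) return true;

  const updated = Object.keys(changeEvent.updateDescription.updatedFields);
  return fields.some((field) => updated.includes(field));
}

const statusTrigger = {
  dependentCollection: 'orders',
  triggerType: 'update',
  triggerFields: ['status'],
};

const statusChanged = {
  operationType: 'update',
  ns: { db: 'mydb', coll: 'orders' },
  updateDescription: { updatedFields: { status: 'shipped' }, removedFields: [] },
};

console.log(shouldFire(statusTrigger, statusChanged));
```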


    // deletes a trigger by id

    Returns a Promise.



    • amit21