o-snapshot

Snapshot

An object serializer for Node.js and browser JavaScript

Among other features, o-snapshot

  • preserves the identity of serialized objects, after deserialization

  • preserves the class of serialized objects, after deserialization

  • gives the programmer complete control over which classes can be serialized, and which classes can't

  • gives the programmer complete control over which attributes to serialize, and how

  • supports a simple mechanism to upgrade a snapshot from a previous version, saved using a different format

  • when used with classes rather than plain Objects, produces snapshots about an order of magnitude smaller than the same data in plain JSON

  • supports circular references

  • supports the serialization of object surrogates

  • runs on both Node.js >= v10 and browsers supporting ES9 or later

Additionally, when used as an in-memory object database, whether in the browser or in Node.js, o-snapshot

  • supports in-memory, atomic transactions

  • implements an extremely naive, and not entirely correct, garbage collection strategy

Installation

npm install o-snapshot

Documentation

Basic usage

o-snapshot has 2 main concepts

The first one is the Serializer

A Serializer converts back and forth between instances of a single class and JSON objects

The second one is the ObjectsDatabase

ObjectsDatabase takes an object, finds a serializer for it, and serializes or deserializes it, along with the objects it references

InMemoryDatabase is a concrete ObjectsDatabase

Example of use

You can browse the complete, functional example, in here

Say there are 2 classes

const { SequentialCollection } = require('o-toolbox')

class Book {
  constructor({ author, title, isbn }) {
    this.author = author
    this.title = title
    this.isbn = isbn
    this.editions = new SequentialCollection()
  }

  addBookEdition({ bookEdition }) {
    bookEdition._setBook({ book: this })

    this.editions.add( bookEdition )
  }

  addAllBookEditions({ bookEditions }) {
    bookEditions.forEach( (bookEdition) => {
      this.addBookEdition({ bookEdition })
    })
  }

  getAuthor() {
    return this.author
  }

  getAllEditions() {
    return this.editions.copy()
  }

  getIsbn() {
    return this.isbn
  }

  getTitle() {
    return this.title
  }
}

module.exports = Book

and

const { SequentialCollection } = require('o-toolbox')

class BookEdition {
  constructor({ publisher, dateOfPublishing, numberOfPages }) {
    this.publisher = publisher
    this.dateOfPublishing = dateOfPublishing
    this.numberOfPages = numberOfPages
    this.copies = new SequentialCollection()
    this.book = null
  }

  addBookCopy({ bookCopy }) {
    bookCopy._setBookEdition({ bookEdition: this })

    this.copies.add( bookCopy )
  }

  addAllBookCopies({ bookCopies }) {
    bookCopies.forEach( (bookCopy) => {
      this.addBookCopy({ bookCopy })
    })
  }

  getAuthor() {
    return this.book.getAuthor()
  }

  _setBook({ book }) {
    this.book = book
  }
}

module.exports = BookEdition

Create a serializer for Book objects

const Book = require('../models/Book')

class BookSerializer {
  getClass () {
    return Book
  }

  getClassId () {
    return 'Book'
  }

  getClassVersion () {
    return 1
  }

  serialize ( book ) {
    const { title, author, isbn, editions } = book

    return [ title, author, isbn, editions ]
  }

  deserialize ({ attributes }) {
    const [ title, author, isbn, editions ] = attributes

    const book = new Book({ title, author, isbn })

    book.addAllBookEditions({ bookEditions: editions })

    return book
  }
}

module.exports = BookSerializer

and a serializer for BookEdition objects

const BookEdition = require('../models/BookEdition')

class BookEditionSerializer {
  getClass () {
    return BookEdition
  }

  getClassId () {
    return 'BookEdition'
  }

  getClassVersion () {
    return 1
  }

  serialize ( bookEdition ) {
    const { publisher, dateOfPublishing, numberOfPages, copies } = bookEdition
 
    return [ publisher, dateOfPublishing, numberOfPages, copies ]
  }

  deserialize ({ attributes }) {
    const [ publisher, dateOfPublishing, numberOfPages, copies ] = attributes

    const bookEdition = new BookEdition({ publisher, dateOfPublishing, numberOfPages })
    bookEdition.addAllBookCopies({ bookCopies: copies })

    return bookEdition
  }
}

module.exports = BookEditionSerializer

Serializers

A Serializer can be any object, as long as it defines the methods

getClass

Returns the class of the objects it serializes

getClassId

Returns the name of the class it serializes, or any other unique id

In a regular context, it may seem redundant to have two different methods, one to get a class and another to get its name

There are use cases that need both, though

One is when the serializer runs in an environment that has no access to the original class names, like a deployment with its source code minified or obfuscated. That's a common case in Single Page Applications

Another is when a refactor changes the class name, but the application must preserve backwards compatibility with objects already serialized
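As a small sketch of the rename case, assume a hypothetical refactor in which the Book class becomes Publication. The serializer keeps returning the old id, so a snapshot written before the rename can still be matched to it. The names Publication and PublicationSerializer are illustrative only

```javascript
// Hypothetical refactor: the class formerly named Book is now Publication
class Publication {
  constructor({ title }) {
    this.title = title
  }
}

class PublicationSerializer {
  getClass() {
    return Publication
  }

  // Keeps the id that existing snapshots were written with
  getClassId() {
    return 'Book'
  }

  getClassVersion() {
    return 1
  }

  serialize(publication) {
    return [ publication.title ]
  }

  deserialize({ attributes }) {
    const [ title ] = attributes
    return new Publication({ title })
  }
}

const serializer = new PublicationSerializer()

console.log( serializer.getClass().name ) // 'Publication'
console.log( serializer.getClassId() )    // 'Book'
```

The class name and the class id can now evolve independently of each other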

getClassVersion

Returns the current version of the class

The class version is an integer greater than or equal to 1

Later in this document, there's an example that shows what to use the class version for

serialize

Takes an instance of the class, and returns either an array, or a json object, with the attributes to serialize

serialize() is a regular method, and can get the attributes using the object's public interface. It does not need to know the object's internals

It's possible to write serializers for objects defined in a third party library

serialize() can add, change or remove any attribute of the object being serialized

The attributes returned by serialize() may include primitive values, like Number and String, null, undefined, and instances of other classes

o-snapshot takes care of serializing referred instances too, if they have a serializer

deserialize

It does the opposite of the serialize method

It takes the serialized attributes, and creates and returns an instance of the class

The attributes it receives are already deserialized to their class instance form

deserialize() also receives, next to attributes, the class version the object was serialized with, as classVersion, so that it can upgrade the serialized object if it needs to
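A minimal sketch of a serializer for a class the application does not own, using the built-in Date as a stand-in for a third party class. It relies only on the public interface of Date. The name DateSerializer is an assumption made for this example, it is not part of o-snapshot, and the serializer is called directly here, outside any database, only to show the round trip

```javascript
class DateSerializer {
  getClass() {
    return Date
  }

  getClassId() {
    return 'Date'
  }

  getClassVersion() {
    return 1
  }

  // Uses only the public interface of Date
  serialize(date) {
    return [ date.getTime() ]
  }

  deserialize({ attributes, classVersion }) {
    const [ milliseconds ] = attributes
    return new Date(milliseconds)
  }
}

const serializer = new DateSerializer()
const original = new Date(0)

const attributes = serializer.serialize(original)
const restored = serializer.deserialize({
  attributes,
  classVersion: serializer.getClassVersion()
})

console.log( attributes )             // [ 0 ]
console.log( restored.toISOString() ) // '1970-01-01T00:00:00.000Z'
```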

InMemoryDatabase

The entry point to an object serialization, or deserialization, is an InMemoryDatabase

Before using it, configure it with the serializers of choice:

const { InMemoryDatabase } = require('o-snapshot')

const inMemoryDatabase = new InMemoryDatabase()
      .register({ serializer: new BookSerializer() })
      .register({ serializer: new BookEditionSerializer() })
      .register({ serializer: new SequentialCollectionSerializer() })

Then, add an object, and take a snapshot

const book = new Book({
  title: 'Hamlet',
  author: 'William Shakespeare',
  isbn: '112233'
})

const snapshot = inMemoryDatabase
      .setAt({ key: 'root', value: book })
      .takeSnapshot()

It's possible to serialize more than one root object in the same snapshot, if it makes sense for the program

const snapshot = inMemoryDatabase
      .setAt({ key: 'users', value: users })
      .setAt({ key: 'products', value: products })
      .takeSnapshot()

To restore a snapshot

inMemoryDatabase.restoreSnapshot({ snapshot })

const book = inMemoryDatabase.at({ key: 'root' })

How does it work

When InMemoryDatabase adds a Book object, it looks up the object's class (specifically, object.constructor) among its registered serializers

If it finds one, it calls serializer.serialize( object ) to get the list of attributes to serialize

It does the same with each one of the returned attributes

Once it has traversed the whole reference graph, starting at the given object, it flattens the object graph into an array of JSON attributes
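The steps above can be sketched in a few lines. This is not o-snapshot's code, nor its snapshot format: the flatten function, the { ref } shape and the Author class are assumptions made for illustration, and only the lookup by constructor and the flattening idea come from the description

```javascript
class Author {
  constructor({ name }) {
    this.name = name
  }
}

class Book {
  constructor({ title, author }) {
    this.title = title
    this.author = author
  }
}

// Trimmed-down serializers, with only the methods this sketch needs
const serializers = [
  {
    getClass: () => Author,
    getClassId: () => 'Author',
    serialize: (author) => [ author.name ]
  },
  {
    getClass: () => Book,
    getClassId: () => 'Book',
    serialize: (book) => [ book.title, book.author ]
  }
]

function flatten(root) {
  const indexes = new Map() // object -> position in the flat array
  const flat = []

  const visit = (value) => {
    // Primitive values are stored as they are
    if (value === null || typeof value !== 'object') { return value }

    // An object already visited is replaced by a reference to its position
    if (indexes.has(value)) { return { ref: indexes.get(value) } }

    // Look the serializer up by the object constructor
    const serializer = serializers.find( (each) => each.getClass() === value.constructor )
    if (serializer === undefined) {
      throw new Error(`No serializer registered for ${value.constructor.name}`)
    }

    // Reserve the slot before visiting the attributes, so cycles terminate
    const index = flat.length
    indexes.set(value, index)
    flat.push(null)

    flat[index] = {
      classId: serializer.getClassId(),
      attributes: serializer.serialize(value).map( (attribute) => visit(attribute) )
    }

    return { ref: index }
  }

  visit(root)
  return flat
}

const author = new Author({ name: 'William Shakespeare' })
const hamlet = new Book({ title: 'Hamlet', author })

const flat = flatten(hamlet)

console.log( JSON.stringify(flat) )
```

Each object lands in the flat array once, and every other mention of it becomes a reference to its position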

Objects identity

The identity of each object is preserved after its deserialization

Serialized objects don't need to implement an identity surrogate. o-snapshot uses the identity of the object in memory

E.g.

Before the serialization of

const book = new Book({
  title: 'Hamlet',
  author: 'William Shakespeare',
  isbn: '112233'
})

const books = [ book, book, book ]

Array books contains 3 occurrences of the identical object. Each element in the array points to the same object, book

After its serialization

const { InMemoryDatabase } = require('o-snapshot')

const inMemoryDatabase = new InMemoryDatabase()
      .register({ serializer: new BookSerializer() })
      .register({ serializer: new BookEditionSerializer() })
      .register({ serializer: new SequentialCollectionSerializer() })

const snapshot = inMemoryDatabase
      .setAt({ key: 'root', value: books })
      .takeSnapshot()

and deserialization

const { InMemoryDatabase } = require('o-snapshot')

const inMemoryDatabase = new InMemoryDatabase()
      .register({ serializer: new BookSerializer() })
      .register({ serializer: new BookEditionSerializer() })
      .register({ serializer: new SequentialCollectionSerializer() })

inMemoryDatabase.restoreSnapshot({ snapshot })

const books = inMemoryDatabase.at({ key: 'root' })

the identity of the objects is preserved, and the new books Array also contains 3 occurrences of a single object, rather than 3 equal objects with different identities

When serializing a complex graph of objects, preserving the identity of objects can be significant. Otherwise, the program may fail in unexpected ways, like an object not emitting events as expected

In addition, it can significantly reduce the size of a snapshot, if it holds many references to identical objects

E.g., a snapshot of a collection of product orders

{
  orders : [
    {
      user,
      products: [
        { product }, ...
      ]
    },
    ...
  ]
}

may have multiple references to identical products and users

Within the snapshot, each object is serialized, with its attributes, only once

References to other objects are serialized as ObjectReferences instead, using an Integer
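For contrast, a plain JSON round trip neither preserves identity nor avoids repeating the attributes. The following sketch uses only the standard JSON.stringify and JSON.parse, not o-snapshot

```javascript
const book = { title: 'Hamlet' }
const books = [ book, book, book ]

// Before: every element is the same object
console.log( books[0] === books[1] ) // true

// After a plain JSON round trip, each element is a separate copy
const restored = JSON.parse( JSON.stringify(books) )
console.log( restored[0] === restored[1] ) // false

// The serialized text repeats the attributes once per occurrence
console.log( JSON.stringify(books) )
// [{"title":"Hamlet"},{"title":"Hamlet"},{"title":"Hamlet"}]
```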

Built-in serializers

Although it's quite possible, and even quite simple, to make a serializer capable of serializing any object at all, rather than requiring the programmer to declare each serializer explicitly, that proved to be a troubled design

Declaring serializers explicitly makes the program more secure, simpler to debug, and easier to understand

In addition, it allows the use of different serializers, in different contexts, for the same classes, and enables features like class versioning and surrogate references

Some built-in types come with serializers out of the box

Objects of type

  • undefined
  • null
  • String
  • Number
  • Array
  • Object
  • Boolean

don't need to declare a custom serializer

Any other class does

Built-in serializers can be overridden, as well

Surrogate references

Say the application uses a global reference

// global variable, reachable later as global.localClock
global.localClock = new Clock()

user.setClock( global.localClock )

const snapshot = inMemoryDatabase
  .setAt({ key: 'root', value: user })
  .takeSnapshot()

Restoring that snapshot wouldn't preserve the identity of the localClock object

It would create a new instance of Clock, different from localClock

To preserve the reference to the localClock object, the serializer must replace the reference with a surrogate, and restore it back at the time of deserialization

const User = require('./User')

class UserSerializer {
  getClass() {
    return User
  }

  getClassId() {
    return 'User'
  }

  getClassVersion() {
    return 1
  }

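  serialize(user) {
    // Serialize a surrogate id in place of the global Clock instance
    const clock = 'localClock'
    return [ clock ]
  }

  deserialize({ attributes, classVersion }) {
    // The surrogate id stands for the global localClock object
    const [ clock ] = attributes
    const user = new User()

    // Restore the reference to the existing global object
    user.setClock( global.localClock )

    return user
  }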
}

module.exports = UserSerializer

The same technique can be applied to serialize references to resources that may not persist between the time of object serialization, and the time of object deserialization, such as database connections, file handles, network sockets or GUI elements

E.g.

const User = require('./User')

class UserSerializer {
  constructor({ connectionPool }) {
    this.connectionPool = connectionPool
  }

  getClass() {
    return User
  }

  getClassId() {
    return 'User'
  }

  getClassVersion() {
    return 1
  }

  serialize( user ) {
    const name = user.getName()
    const databaseConfig = user.getDatabase().getConfig()

    return [ name, databaseConfig ]
  }

  deserialize({ attributes, classVersion }) {
    const [ name, databaseConfig ] = attributes

    const user = new User({ name })
    const databaseConnection = this.connectionPool.getDatabaseConnection( databaseConfig )

    user.setDatabase(databaseConnection)

    return user
  }  
}

module.exports = UserSerializer

Class versions and migrations

Once an application persists at least one object, a programmer needs to address a whole new set of problems

Say there's a class

class User {
  constructor({ lastName }) {
    this.lastName = lastName
  }

  ...
}

with its serializer

const User = require('./User')

class UserSerializer {
  getClass() {
    return User
  }

  getClassId() {
    return 'User'
  }

  getClassVersion() {
    return 1
  }

  serialize( user ) {
    const lastName = user.lastName

    return { lastName }
  }

  deserialize({ attributes, classVersion }) {
    const { lastName } = attributes

    return new User({ lastName })
  }  
}

module.exports = UserSerializer

A snapshot is taken at time t1

const snapshotAtTime1 = inMemoryDatabase
  .setAt({ key: 'root', value: user })
  .takeSnapshot()

then, at time t2, a refactor is made to User class

class User {
  constructor({ lastname }) {
    this.lastname = lastname
  }

  ...
}

Even though it may not seem like much of a change, renaming the instance variable from lastName to lastname will make any snapshot of the User class taken before t2 fail the next time it's restored

Issues appear because the application, or the environment where the application runs, may change between the time at which the snapshot is taken, and the time at which the snapshot is restored

Serializers use the getClassVersion method to deal with this kind of error

To properly handle the upgrade, every time the User class changes its attributes, the serializer also increases its version number

const { ImportMethods } = require('o-toolbox')
const { SerializerVersionUpgrades } = require('o-snapshot')
const User = require('./User')

class UserSerializer {
  getClass() {
    return User
  }

  getClassId() {
    return 'User'
  }

  getClassVersion() {
    return 2            // <-- increased by 1 
  }

  serialize( user ) {
    const lastname = user.lastname

    return { lastname }
  }

  deserialize({ attributes, classVersion }) {
    if ( classVersion === 1 ) {
      const { lastName } = attributes
      const lastname = lastName
      return new User({ lastname })
    }

    const { lastname } = attributes

    return new User({ lastname })
  }
}

module.exports = UserSerializer

Note that getClassVersion now returns 2, one more than the previous version
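The upgrade path can be exercised by calling deserialize directly with attributes in both formats. This is a self-contained sketch with a trimmed-down User and serializer. The calls are made by hand here for illustration, while in real use the attributes and the class version come from the snapshot being restored

```javascript
class User {
  constructor({ lastname }) {
    this.lastname = lastname
  }
}

class UserSerializer {
  getClass() {
    return User
  }

  getClassId() {
    return 'User'
  }

  getClassVersion() {
    return 2
  }

  serialize(user) {
    return { lastname: user.lastname }
  }

  deserialize({ attributes, classVersion }) {
    // Version 1 snapshots stored the attribute as lastName
    const lastname = classVersion === 1 ? attributes.lastName : attributes.lastname

    return new User({ lastname })
  }
}

const serializer = new UserSerializer()

// Attributes as a version 1 snapshot would hold them
const fromVersion1 = serializer.deserialize({
  attributes: { lastName: 'Shakespeare' },
  classVersion: 1
})

// Attributes in the current format
const fromVersion2 = serializer.deserialize({
  attributes: { lastname: 'Shakespeare' },
  classVersion: 2
})

console.log( fromVersion1.lastname ) // 'Shakespeare'
console.log( fromVersion2.lastname ) // 'Shakespeare'
```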

Validations

It's a good idea to validate each attribute before each object serialization/deserialization

It may save a lot of debugging time

o-snapshot does not include any validation utility

Any assertion or validation library can be used within the serialize and deserialize methods for that purpose

The author of o-snapshot, that would be me, recommends o-check-list, for its simplicity, ease of use, and adaptability to different contexts, since it works in both Node.js >= 10 and browsers supporting ES9 or later

DoMe commands

DoMe commands are intended to be self-documented. Please take a look at the files in DoMe/forDevelopment/inWindows, or in DoMe/forDevelopment/inDocker

Version

2.0.0

License

ISC

Unpacked Size

125 kB

Total Files

57

Collaborators

  • haijindev