Nutritious Pomegranate Muffins

    @thi.ng/egf
    TypeScript icon, indicating that this package has built-in type declarations

    0.6.5 • Public • Published

    egf

    npm version npm downloads Twitter Follow

    This project is part of the @thi.ng/umbrella monorepo.

    About

    Extensible Graph Format.

    Striving to be both easily readable & writable for humans and machines, this line based, plain text data format and package supports:

    • Definition of different types of graph based data (e.g. RDF-style or Labeled Property Graph topologies)
    • Full support for cyclic references, arbitrary order (automatic forward declarations)
    • Choice of inlining referenced nodes for direct access or via special node ref values
    • Arbitrary property values (extensible via tagged literals and custom tag parsers a la EDN)
    • Optionally prefixed node and property IDs with (also optional) auto-expansion via declared prefixes (for Linked Data use cases)
    • Inclusion of sub-graphs from external files
    • Loading of individual property values from referenced file paths
    • Optionally GPG encrypted property values (where needed)
    • Multi-line values
    • Line comments
    • Configurable parser behavior & syntax feature flags
    • Hand-optimized parser, largely regexp free
    • Configurable GraphViz DOT export

    example graph

    (Source for this example graph is further below)

    Built-in tag parsers

    The following parsers for tagged property values are available by default. Custom parsers can be provided via config options.

    Tag Description Result
    #base64 Base64 encoded binary data Uint8Array
    #date Date.parse() compatible string (e.g. ISO8601) Date
    #file File path to read value from string
    #gpg Calls gpg to decrypt given armored string string
    #hex hex 32bit int (no prefix) number
    #json Arbitrary JSON value any
    #list Whitespace separated list string[]
    #num Floating point value (IEEE754) number

    Note: In this reference implementation, the #file and #gpg tag parsers are only available in NodeJS.

    Status

    ALPHA - bleeding edge / work-in-progress

    Search or submit any issues for this package

    You're strongly encouraged to update to at least v0.4.0 to avoid the potential of arbitrary code execution in older versions when decrypting #gpg-tagged property values. A security advisory will be published ASAP. A fix has been deployed already.

    Feature ideas

    (Non-exhaustive list)

    • [x] VSCode syntax highlighting
    • [x] JSON -> EGF conversion
    • [ ] Async tag parsing
    • [ ] URL support for #file tag
    • [ ] Tag declarations & tag parser import from URL (needs trust config opts)
    • [ ] #md tag parser for markdown content
    • [ ] #gpg fallback behavior options

    Installation

    yarn add @thi.ng/egf

    ES module import:

    <script type="module" src="https://cdn.skypack.dev/@thi.ng/egf"></script>

    Skypack documentation

    For Node.js REPL:

    # with flag only for < v16
    node --experimental-repl-await
    
    > const egf = await import("@thi.ng/egf");
    

    Package sizes (gzipped, pre-treeshake): ESM: 2.85 KB

    Dependencies

    API

    Generated API docs

    TODO - Full docs forthcoming...

    Basic example

    ; file: readme.egf
    
    ; prefix declaration (optional feature)
    @prefix thi: thi.ng/
    
    ; a single node/subject definition
    ; properties are indented
    ; `thi:` prefix will be expanded
    thi:egf
        type project
        ; tagged value property (here: node ref)
        part-of -> thi:umbrella
        status alpha
        description Extensible Graph Format
        url https://thi.ng/egf
        creator -> toxi
        ; multi-line value
        ; read as whitespace separated list/array (via #list)
        tag #list >>>
    graph
    extensible
    format
    linked-data
    <<<
    
    thi:umbrella
        type project
        url https://thi.ng/umbrella
        creator -> toxi
    
    toxi
        type person
        name Karsten Schmidt
        location London
        account -> toxi@twitter
        account -> postspectacular@gh
    
    toxi@twitter
        type account
        name @toxi
        url http://twitter.com/toxi
    
    postspectacular@gh
        type account
        name @postspectacular
        url http://github.com/postspectacular
    
    import { parseFile } from "@thi.ng/egf";
    
    // enable prefix expansion in parser
    const graph = parseFile("readme.egf", { opts: { prefixes: true } }).nodes;
    
    console.log(Object.keys(graph));
    // [
    //  'thi.ng/egf',
    //  'thi.ng/umbrella',
    //  'toxi',
    //  'toxi@twitter',
    //  'postspectacular@gh'
    // ]
    
    console.log(graph.toxi);
    // {
    //   '$id': 'toxi',
    //   type: 'person',
    //   name: 'Karsten Schmidt',
    //   location: 'London',
    //   account: [
    //     {
    //       '$ref': 'toxi@twitter',
    //       deref: [Function: deref],
    //       equiv: [Function: equiv]
    //     },
    //     {
    //       '$ref': 'postspectacular@gh',
    //       deref: [Function: deref],
    //       equiv: [Function: equiv]
    //     }
    //   ]
    // }
    
    // in this example inlining of referenced nodes is disabled (default)
    // therefore refs are encoded as objects implementing the `IDeref` interface
    // to obtain the referenced node
    console.log(graph.toxi.account[0].deref());
    // {
    //   '$id': 'toxi@twitter',
    //   type: 'account',
    //   name: '@toxi',
    //   url: 'http://twitter.com/toxi'
    // }

    Syntax

    EGF is a UTF-8 plain text format and largely line based, though supports multi-line values. An EGF file consists of node definitions, each with zero or more properties and their (optionally tagged) values. EGF does not prescribe any other schema or structure and it's entirely up to the user to e.g. allow properties themselves to be defined as nodes with their own properties, thus allowing the definition of LPG (Labeled Property Graph) topologies as well.

    ; Comment line
    
    ; First node definition
    node1
        ; property with string value
        prop1 value
        ; property with reference to another node
        prop2 -> node2
        ; property with tagged value
        prop3 #tag value
        prop4 <<< long, potentially
    multiline
    value >>>
        prop5 #tag <<< tagged multi-line value >>>
    
    node2
        ; property comment
        prop1 value
    ...
    

    Grammar

    A full grammar definition is forthcoming. In the meantime, please see a somewhat outdated older version and related comments in #234 for more details.

    Node references

    Properties with reference values to another node constitute edges in the graph. References are encoded via property -> nodeid.

    The following graph defines two nodes with circular references between them. Each node has a literal (string, by default) property name and a reference property knows to another node (via its ID). The order of references is arbitrary and the parser will automatically produce forward declarations for nodes not yet known.

    alice
        name Alice
        knows -> bob
    
    bob
        name Robert
        knows -> alice
    

    Using default parser options, this produces an object as follows. Note, the references are encoded as objects with a $ref property and implement the IDeref and IEquiv interfaces defined in the @thi.ng/api package.

    {
      alice: {
        '$id': 'alice',
        name: 'Alice',
        knows: {
          '$ref': 'bob',
          deref: [Function: deref],
          equiv: [Function: equiv]
        }
      },
      bob: {
        '$id': 'bob',
        name: 'Robert',
        knows: {
          '$ref': 'alice',
          deref: [Function: deref],
          equiv: [Function: equiv]
        }
      }
    }
    // access bob's name via alice
    graph.alice.knows.deref().name
    // "Robert"

    If node resolution is enabled (via the resolve option) in the parser, the referenced nodes will be inlined directly and produce circular references in the JS result object. In many cases this more desirable and fine, however will stop the graph from being serializable to JSON (for example).

    {
      alice: <ref *1> {
        '$id': 'alice',
        name: 'Alice',
        knows: { '$id': 'bob', name: 'Robert', knows: [Circular *1] }
      },
      bob: <ref *2> {
        '$id': 'bob',
        name: 'Robert',
        knows: <ref *1> {
          '$id': 'alice',
          name: 'Alice',
          knows: [Circular *2]
        }
      }
    }
    

    Prefixed IDs

    To enable namespacing and simplify re-use of existing data vocabularies, we're borrowing from existing Linked Data formats & tooling to allow node and property IDs to be defined in a prefix:name format alongside @prefix declarations. Such prefix IDs will be expanded during parsing and usually form complete URIs, but could expand to any string. The various (50+) commonly used Linked Data vocabulary prefixes bundled in @thi.ng/prefixes are available by default, though can be overridden, of course...

    ; prefix declaration
    @prefix thi: http://thi.ng/
    
    thi:toxi
        rdf:type -> foaf:person
    

    Result:

    {
      'thi.ng/toxi': {
        '$id': 'thi.ng/toxi',
        'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': {
          '$id': 'http://xmlns.com/foaf/0.1/person'
        }
      },
      'http://xmlns.com/foaf/0.1/person': {
        '$id': 'http://xmlns.com/foaf/0.1/person'
      }
    }

    Includes

    Currently in NodeJS only, external graph definitions can be included in the main graph via the @include directive. Any @prefix declarations in the included file will only be available in that file, however will inherit any pre-existing prefixes declared in the main file.

    Relative file paths will be relative to the path of the currently processed file:

     |- include
     |  |- sub1.egf
     |  |- sub2.egf
     |- main.egf
    

    (These examples make use of the schema.org ontology)

    ; main.egf
    ; declare an empty prefix
    @prefix : http://thi.ng/
    
    @include include/sub1.egf
    
    ; use empty prefix for this node
    :toxi
        rdf:type -> schema:Person
    
    ; sub1.egf
    @include sub2.egf
    
    :sub1.egf
        rdf:type -> schema:Dataset
        schema:dateCreated #date 2020-07-19
    
    ; sub2.egf
    
    :sub2.egf
        rdf:type -> schema:Dataset
        schema:creator -> :toxi
    

    Parsing the main.egf file (with node resolution/inlining and pruning) produces:

    {
      'http://thi.ng/sub2.egf': {
        '$id': 'http://thi.ng/sub2.egf',
        'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Dataset' },
        'http://schema.org/creator': {
          '$id': 'http://thi.ng/toxi',
          'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Person' }
        }
      },
      'http://thi.ng/toxi': {
        '$id': 'http://thi.ng/toxi',
        'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Person' }
      },
      'http://thi.ng/sub1.egf': {
        '$id': 'http://thi.ng/sub1.egf',
        'http://www.w3.org/1999/02/22-rdf-syntax-ns#type': { '$id': 'http://schema.org/Dataset' },
        'http://schema.org/dateCreated': 2020-07-19T00:00:00.000Z
      }
    }

    EGF generation / serialization

    Complying JS objects can be converted to EGF using the toEGF() function. This function takes an iterable of Node objects, optional prefix mappings and an optional property serialization function to deal with custom tagged values. The default property formatter (toEGFProp()) handles various values for built-in tags and can be used in combination with any additional user provided logic.

    import { rdf, schema } from "@thi.ng/prefixes";
    
    const res = toEGF([
        {
          $id: "thi:egf",
          "rdf:type": { $ref: "schema:SoftwareSourceCode" },
          "schema:isPartOf": { $id: "http://thi.ng/umbrella" },
          "schema:dateCreated": new Date("2020-02-16")
        },
        {
          $id: "thi:umbrella",
          "rdf:type": { $ref: "schema:SoftwareSourceCode" },
          "schema:programmingLanguage": "TypeScript"
        }
      ],
      // prefix mappings (optional)
      {
        thi: "http://thi.ng/",
        schema,
        rdf
      }
      // property serializer (optional)
      toEGFProp
    );
    @prefix thi: http://thi.ng/
    @prefix schema: http://schema.org/
    @prefix rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
    
    thi:egf
        rdf:type -> schema:SoftwareSourceCode
        schema:isPartOf -> thi:umbrella
        schema:dateCreated #date 2020-02-16T00:00:00.000Z
    
    thi:umbrella
        rdf:type -> schema:SoftwareSourceCode
        schema:programmingLanguage TypeScript
    

    Authors

    Karsten Schmidt

    If this project contributes to an academic publication, please cite it as:

    @misc{thing-egf,
      title = "@thi.ng/egf",
      author = "Karsten Schmidt",
      note = "https://thi.ng/egf",
      year = 2020
    }

    License

    © 2020 - 2021 Karsten Schmidt // Apache Software License 2.0

    Install

    npm i @thi.ng/egf

    DownloadsWeekly Downloads

    3

    Version

    0.6.5

    License

    Apache-2.0

    Unpacked Size

    53.2 kB

    Total Files

    20

    Last publish

    Collaborators

    • thi.ng