    ECMAScript RDFa Parser

    A fairly modular, cross-platform, extensible package to extract RDF information from RDFa-enabled documents.


    • Sometimes extracts RDF statements
    • Passes all compliant XML tests
    • Walks DOM tree to generate an RDF graph
    • Extensible design for parsing supersets of RDFa and supporting new media types


    • Consume DOM nodes or SAX events
    • Output a complete graph or "Statement" events
    • Support custom types of nodes (e.g. variables)
    • Query for DOM nodes by data they contain
    • Linting of RDFa documents for:
      • property order
      • reverse/forward incompatibility
      • shadowed definitions
      • defining over registered URI schemes
      • non-normalized IRI references
      • invalid IRIs
      • invalid CURIEs
      • terms used without an active vocabulary


    Parse a DOM tree into an RDF graph

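    const rdfa = require('rdfa');
    const fs = require('fs');
    // xmldom supplies the DOM implementation; this package does not include one.
    const DOMParser = require('xmldom').DOMParser;
    // Path of the RDFa document to read.
    const filepath = './test/';
    const document = new DOMParser().parseFromString(fs.readFileSync(filepath, 'UTF-8'), 'text/xml');
    // Arguments: parser class, base URI, DOM document.
    const result = rdfa.parseDOM(rdfa.RDFaXMLParser, '', document);
    // Print each extracted statement as Turtle.
    result.outputGraph.forEach(function(n){ console.log(n.toTurtle()); });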
    <> <> "Mark Birbeck" .
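
    The loop above prints one statement at a time. When the output is needed as a single string, a small helper can collect it. This is a minimal sketch that relies only on the forEach and toTurtle calls already shown; the name graphToTurtle is hypothetical and not part of the package.

```javascript
// Hypothetical helper, not part of the rdfa package.
// Assumes `graph` exposes forEach() and that each statement has toTurtle(),
// as in the example above.
function graphToTurtle(graph) {
  const lines = [];
  graph.forEach(function (statement) {
    lines.push(statement.toTurtle());
  });
  // One Turtle statement per line.
  return lines.join('\n');
}
```

    With the result from the example above, graphToTurtle(result.outputGraph) would return the same statements joined by newlines.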

    Support for standards-based XML parsers

    No document parser is included. Instead, the package reads from a DOM (W3C or compatible) or from a SAX-like event stream.

    Tests are written for the following packages:

    Support for different media types

    Supports plain XML, HTML, and SVG out of the box, and is extensible for new host languages.

    Supported host languages:

    • RDFaXMLParser - for plain XML documents
    • RDFaXHTMLParser - for HTML documents delivered as application/xhtml+xml
    • RDFaHTMLParser - for HTML documents delivered as text/html
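
    Picking a parser from a document's media type can be sketched as a simple lookup. The table and the function name parserNameFor are hypothetical; only the three class names come from the list above, and mapping application/xml to RDFaXMLParser is an assumption.

```javascript
// Hypothetical lookup table: media type -> name of a parser class listed above.
const PARSER_BY_MEDIA_TYPE = {
  'text/xml': 'RDFaXMLParser',
  'application/xml': 'RDFaXMLParser', // assumption: treated like text/xml
  'application/xhtml+xml': 'RDFaXHTMLParser',
  'text/html': 'RDFaHTMLParser',
};

// Returns the parser class name for a Content-Type value, or null if unknown.
function parserNameFor(contentType) {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mediaType = String(contentType).split(';')[0].trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(PARSER_BY_MEDIA_TYPE, mediaType)
    ? PARSER_BY_MEDIA_TYPE[mediaType]
    : null;
}
```

    Assuming the three classes are exported the same way rdfa.RDFaXMLParser is in the example, rdfa[parserNameFor('text/html')] could then be passed as the Processor argument.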

    File index

    • bin/httpd.js - an HTTP server for use with the RDFa Test Suite
    • bin/rdfa-turtle.js - read a file and print extracted triples
    • defaults.json - list of RDFa Initial Context vocabulary & terms
    • index.js - the bulk of the logic
    • test/suite.js - runs the library against a local copy of the RDFa test suite


    parseDOM(Processor, base, document, options)

    Extracts RDF statements from the DOM document given as document, using base as the base URI (any URI recorded in the DOM node is always ignored).

    Returns an RDFaParser instance (see below).

    • Processor: a reference to a subclass of RDFaParser
    • base: the URI base for the document (where the document was downloaded from)
    • document: DOM document
    • options: object with additional configuration
      • forceVersion: RDFa version to use. This is normally detected.
      • defaultLanguage: default language for when no language is specified by the document (use the language specified in the Content-Language header, if any)
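
    As a rough illustration of the defaultLanguage option, the sketch below builds an options object from HTTP response headers. The helper name optionsFromHeaders is hypothetical, and taking the first tag when Content-Language lists several is a choice made for this sketch, not documented behavior.

```javascript
// Hypothetical helper: derive parseDOM options from HTTP response headers.
// `headers` is assumed to be a plain object with lower-case header names,
// as Node's http module provides.
function optionsFromHeaders(headers) {
  const options = {};
  const contentLanguage = headers['content-language'];
  if (contentLanguage) {
    // Content-Language may list several tags; this sketch uses the first.
    options.defaultLanguage = contentLanguage.split(',')[0].trim();
  }
  return options;
}

// Intended use (not run here; needs the rdfa package and a parsed DOM document):
// const result = rdfa.parseDOM(rdfa.RDFaXMLParser, base, document, optionsFromHeaders(res.headers));
```

    Leaving forceVersion unset keeps the normal version detection described above.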


    RDFaParser

    Maintains state during processing of a document. Pass a subclass of this to parseDOM as the first argument, Processor.


    RDFaXMLParser

    RDFaParser with the default RDF context loaded.


    RDFaXHTMLParser

    RDFaParser extended with the parsing rules for application/xhtml+xml documents.


    RDFaHTMLParser

    RDFaParser extended with the parsing rules for text/html documents.


    Represents an RDFa processing context that's generated during processing of an element and passed to child elements.



