    rdfa

    0.0.10 • Public • Published

    ECMAScript RDFa Parser

    A fairly modular, cross-platform, extensible package to extract RDF information from RDFa-enabled documents.

    Features:

    • Sometimes extracts RDF statements
    • Passes all compliant XML tests
    • Walks DOM tree to generate an RDF graph
    • Extensible design for parsing supersets of RDFa and supporting new media types

    Goals:

    • Consume DOM nodes or SAX events
    • Output a complete graph or "Statement" events
    • Support custom types of nodes (e.g. variables)
    • Query for DOM nodes by data they contain
    • Linting of RDFa documents for:
      • property order
      • reverse/forward incompatibility
      • shadowed definitions
      • definitions that override registered URI schemes
      • non-normalized IRI references
      • invalid IRIs
      • invalid CURIEs
      • terms used without an active vocabulary
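
    As an illustration of the registered-URI-scheme check in the list above, here is a minimal sketch. It is not part of the package: the function name lintPrefixAttribute and the short scheme list are assumptions made for this example, and the input is assumed to be the value of an RDFa prefix attribute (whitespace-separated "name: IRI" pairs).

```javascript
// Hypothetical lint helper; not part of the rdfa package.
// A small, non-exhaustive sample of registered URI schemes.
const REGISTERED_SCHEMES = new Set([
  'http', 'https', 'ftp', 'mailto', 'urn', 'file', 'data', 'tel',
]);

// Take the value of an RDFa `prefix` attribute ("name: iri name: iri ...")
// and return the prefix names that collide with a registered URI scheme.
function lintPrefixAttribute(value) {
  const tokens = String(value).trim().split(/\s+/).filter(Boolean);
  const collisions = [];
  // Tokens come in pairs: the prefix name (with a trailing colon), then the IRI.
  for (let i = 0; i + 1 < tokens.length; i += 2) {
    const name = tokens[i].replace(/:$/, '').toLowerCase();
    if (REGISTERED_SCHEMES.has(name)) {
      collisions.push(name);
    }
  }
  return collisions;
}
```

    A prefix named "http" is worth flagging because a value such as "http://example.com/" could then be read as a CURIE rather than as an absolute IRI.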

    Features

    Parse a DOM tree into an RDF graph

    const rdfa = require('rdfa');
    const fs = require('fs');
    const DOMParser = require('xmldom').DOMParser;

    // Read a test case from the local copy of the RDFa test suite and parse it into a DOM.
    const filepath = './test/rdfa.github.io/test-suite/test-cases/rdfa1.1-lite/xhtml1/0021.xhtml';
    const document = new DOMParser().parseFromString(fs.readFileSync(filepath, 'utf8'), 'text/xml');

    // Walk the DOM and print each extracted statement as Turtle.
    const result = rdfa.parseDOM(rdfa.RDFaXMLParser, 'http://example.com/', document);
    result.outputGraph.forEach(function(n){ console.log(n.toTurtle()); });

    // Output:
    // <http://example.com/> <http://purl.org/dc/elements/1.1/creator> "Mark Birbeck" .

    Support for standards-based XML parsers

    No document parser is included. Instead, the package reads from a DOM (W3C or compatible) or from a SAX-like event stream.

    Tests are written for the following packages:

    Support for different media types

    Supports plain XML, HTML, and SVG out of the box, and is extensible to new host languages.

    Supported host languages:

    • RDFaXMLParser - for plain XML documents
    • RDFaXHTMLParser - for HTML documents delivered in application/xhtml+xml
    • RDFaHTMLParser - for HTML documents delivered in text/html
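
    One way to pick a parser class from the list above is to key it on the media type the document was delivered with. The sketch below is an assumption-laden example, not part of the package: parserNameForMediaType is a hypothetical helper, and falling back to RDFaXMLParser for every other media type is a choice made here for illustration.

```javascript
// Hypothetical helper; not part of the rdfa package.
// Map a Content-Type value to the name of one of the parser classes listed above.
function parserNameForMediaType(contentType) {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mediaType = String(contentType).split(';')[0].trim().toLowerCase();
  switch (mediaType) {
    case 'text/html':
      return 'RDFaHTMLParser';
    case 'application/xhtml+xml':
      return 'RDFaXHTMLParser';
    default:
      // Assumption: treat anything else as plain XML.
      return 'RDFaXMLParser';
  }
}

// Usage sketch (requires the rdfa package and an already-parsed DOM `document`):
//   const rdfa = require('rdfa');
//   const Processor = rdfa[parserNameForMediaType('text/html; charset=utf-8')];
//   const result = rdfa.parseDOM(Processor, 'http://example.com/', document);
```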

    File index

    • bin/httpd.js - an HTTP server for use with the RDFa Test Suite
    • bin/rdfa-turtle.js - read a file and print extracted triples
    • defaults.json - list of RDFa Initial Context vocabulary & terms
    • index.js - the bulk of the logic
    • test/suite.js - runs the library against a local copy of the RDFa test suite

    API

    parseDOM(Processor, base, document, options)

    Extracts RDF statements from the DOM document passed as document, using base as the URI base. Any URI recorded in the DOM node itself is always ignored.

    Returns an RDFaParser instance (see below).

    • Processor: a reference to a subclass of RDFaParser
    • base: the URI base for the document (where the document was downloaded from)
    • document: DOM document
    • options: object with additional configuration
      • forceVersion: RDFa version to use. This is normally detected.
      • defaultLanguage: default language for when no language is specified by the document (use the language specified in the Content-Language header, if any)
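
    The defaultLanguage option above can be filled in from an HTTP response. The following is a minimal sketch under stated assumptions: optionsFromHeaders is a hypothetical helper (not part of the package), header names are assumed to be lower-cased as Node.js delivers them, and using only the first tag of a multi-valued Content-Language header is a simplification made here.

```javascript
// Hypothetical helper; not part of the rdfa package.
// Build the `options` argument for parseDOM from HTTP response headers.
function optionsFromHeaders(headers) {
  const options = {};
  const contentLanguage = headers['content-language'];
  if (contentLanguage) {
    // Content-Language may list several tags ("de, en");
    // taking the first one is an assumption of this sketch.
    const first = String(contentLanguage).split(',')[0].trim();
    if (first) {
      options.defaultLanguage = first;
    }
  }
  return options;
}

// Usage sketch (requires the rdfa package, a response `res`, and a parsed DOM `document`):
//   const result = rdfa.parseDOM(rdfa.RDFaXHTMLParser, base, document, optionsFromHeaders(res.headers));
```

    When the header is absent, the helper returns an empty object, so no default language is imposed.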

    RDFaParser

    Maintains state during processing of a document. Pass a subclass of this to parseDOM as the first argument Processor.

    RDFaXMLParser

    RDFaParser with the default RDF context loaded.

    RDFaXHTMLParser

    RDFaParser extended with the parsing rules for application/xhtml+xml documents.

    RDFaHTMLParser

    RDFaParser extended with the parsing rules for text/html documents.

    RDFaContext

    Represents an RDFa processing context that's generated during processing of an element and passed to child elements.
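
    The idea of a context that is created for an element and handed down to its children can be sketched as follows. This is illustrative only: SketchContext and its fields (base, language, vocabulary, prefixes) are hypothetical names chosen for the example and do not describe the library's actual RDFaContext.

```javascript
// Illustrative only; field and class names are hypothetical
// and do not mirror the library's RDFaContext.
class SketchContext {
  constructor({ base, language = null, vocabulary = null, prefixes = {} }) {
    this.base = base;
    this.language = language;
    this.vocabulary = vocabulary;
    this.prefixes = prefixes;
  }

  // Create the context for a child element: inherit everything from the
  // parent, then apply whatever the child element declares itself.
  child({ language, vocabulary, prefixes } = {}) {
    return new SketchContext({
      base: this.base,
      language: language !== undefined ? language : this.language,
      vocabulary: vocabulary !== undefined ? vocabulary : this.vocabulary,
      // Copy the prefix map so a child's declarations never leak to the parent.
      prefixes: Object.assign({}, this.prefixes, prefixes),
    });
  }
}

const root = new SketchContext({ base: 'http://example.com/', language: 'en' });
const inner = root.child({ prefixes: { dc: 'http://purl.org/dc/terms/' } });
```

    Here inner keeps the language and base of root and adds the dc prefix, while root itself is left unchanged.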

    Install

    npm i rdfa

    License

    Unlicense

    Collaborators

    • acubed