    This package has been deprecated

    Author message:

    use "html-rewriter-wasm" instead

    @mrbbot/parsed-html-rewriter

    0.1.8 • Public • Published

    This is a fork of @worker-tools/parsed-html-rewriter that implements the onDocument handler and fixes installation on Windows.

    Parsed HTML Rewriter

    A DOM-based implementation of Cloudflare Workers' HTMLRewriter.

    Unlike the original, this implementation parses the entire document into a DOM (provided by linkedom) and runs selectors against that representation. As a result, it is slower, more memory-intensive, and can't process streaming data.

    Note that this approach was chosen to quickly implement the functionality of HTMLRewriter, as there is currently no JS implementation available. A better implementation would replicate the streaming approach of lol-html, or even use a WebAssembly version of it.

    However, this implementation should run in most JS contexts (including Web Workers, Service Workers and Deno) without modification and handle many, if not most, use cases of HTMLRewriter. It should be good enough for testing and offline Workers development.

    Usage

    This module can be used in two ways.

    As a standalone module:

    import { ParsedHTMLRewriter } from '@mrbbot/parsed-html-rewriter'
    
    await new ParsedHTMLRewriter()
      .transform(new Response('<body></body>'))
      .text();

    Or as a polyfill:

    import '@mrbbot/parsed-html-rewriter/polyfill'
    
    await new HTMLRewriter() // Will use the native version when running in a Worker
      .transform(new Response('<body></body>'))
      .text();
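
    The comment above says the native version is used when running in a Worker. A minimal sketch of that "install only if missing" pattern follows. The helper installIfMissing and the two placeholder classes are hypothetical names made up for illustration; this is not the package's actual source, and the real polyfill may be wired differently.

```typescript
// Hypothetical sketch of an "install only if missing" polyfill.
// None of these names come from the package itself.
type Ctor = new (...args: unknown[]) => unknown;

function installIfMissing(
  target: Record<string, unknown>,
  name: string,
  fallback: Ctor,
): Ctor {
  // Keep an existing (native) constructor; otherwise register the fallback.
  if (typeof target[name] !== 'function') {
    target[name] = fallback;
  }
  return target[name] as Ctor;
}

// Placeholder classes standing in for the DOM-based and native rewriters.
class FallbackRewriter {}
class NativeRewriter {}

// A scope without HTMLRewriter (e.g. Node) and one that already has it (a Worker).
const plainScope: Record<string, unknown> = {};
const workerScope: Record<string, unknown> = { HTMLRewriter: NativeRewriter };

const usedInPlain = installIfMissing(plainScope, 'HTMLRewriter', FallbackRewriter);
const usedInWorker = installIfMissing(workerScope, 'HTMLRewriter', FallbackRewriter);
```

    In a real polyfill the target would presumably be globalThis, so the same new HTMLRewriter() call works in both environments.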

    innerHTML

    Unlike the current (March 2021) version on CF Workers, this implementation already supports the proposed innerHTML handler. Note that this feature is unstable and will likely change as the real version materializes.

    await new HTMLRewriter()
      .on('body', {
        innerHTML(html) {
          console.log(html) // => '<div id="foo">bar</div>'
        },
      })
      .transform(new Response('<body><div id="foo">bar</div></body>'))
      .text();

    Caveats

    • Because this version isn't based on streaming data, the order in which handlers are called can differ. Some measures have been taken to simulate the order, but differences may occur.
    • Text never arrives in multiple chunks. There is always exactly one chunk containing the full text, followed by an empty one with lastInTextNode set to true.
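
    Because of the second caveat, a text handler that should behave the same here and on the streaming implementation can buffer chunks until lastInTextNode is true. The sketch below assumes only the two chunk fields mentioned above (text and lastInTextNode). collectText is a hypothetical helper, not part of this package.

```typescript
// The two text-chunk fields this sketch relies on.
interface TextChunk {
  text: string;
  lastInTextNode: boolean;
}

// Hypothetical helper: buffers chunks and calls onText once per text node.
function collectText(onText: (full: string) => void): (chunk: TextChunk) => void {
  let buffer = '';
  return (chunk) => {
    buffer += chunk.text;
    if (chunk.lastInTextNode) {
      onText(buffer);
      buffer = ''; // reset for the next text node
    }
  };
}

const seen: string[] = [];
const handler = collectText((full) => seen.push(full));

// Chunk pattern described in the caveat: full text, then an empty final chunk.
handler({ text: 'bar', lastInTextNode: false });
handler({ text: '', lastInTextNode: true });

// Chunk pattern a streaming implementation might produce for one text node.
handler({ text: 'he', lastInTextNode: false });
handler({ text: 'llo', lastInTextNode: false });
handler({ text: '', lastInTextNode: true });
```

    The returned function can be passed as the text handler in .on(selector, { text: handler }), so the callback sees one complete string per text node regardless of how the chunks were split.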

    Install

    npm i @mrbbot/parsed-html-rewriter

    Weekly Downloads

    9

    Version

    0.1.8

    License

    MIT

    Unpacked Size

    405 kB

    Total Files

    25

    Collaborators

    • mrbbot