parse-entities

    4.0.0 • Public • Published

    Parse HTML character references.

    What is this?

    This is a small and powerful decoder of HTML character references (often called entities).

    When should I use this?

    You can use this for spec-compliant decoding of character references. It’s small and fast enough to do that well. You can also use this when making a linter, because it emits distinct warnings, each with a reason explaining why it was raised and positional info on where it happened.

    Install

    This package is ESM only. In Node.js (version 12.20+, 14.14+, or 16.0+), install with npm:

    npm install parse-entities

    In Deno with Skypack:

    import {parseEntities} from 'https://cdn.skypack.dev/parse-entities@4?dts'

    In browsers with Skypack:

    <script type="module">
      import {parseEntities} from 'https://cdn.skypack.dev/parse-entities@4?min'
    </script>

    Use

    import {parseEntities} from 'parse-entities'
    
    console.log(parseEntities('alpha &amp bravo'))
    // => alpha & bravo
    
    console.log(parseEntities('charlie &copycat; delta'))
    // => charlie ©cat; delta
    
    console.log(parseEntities('echo &copy; foxtrot &#8800; golf &#x1D306; hotel'))
    // => echo © foxtrot ≠ golf 𝌆 hotel

    API

    This package exports the following identifier: parseEntities. There is no default export.

    parseEntities(value[, options])

    Parse HTML character references.

    options

    Configuration (optional).

    options.additional

    Additional character to accept (string?, default: ''). This allows other characters, without error, when following an ampersand.

    options.attribute

    Whether to parse value as an attribute value (boolean?, default: false). This results in slightly different behavior.

    options.nonTerminated

    Whether to allow nonterminated references (boolean?, default: true). For example, &copycat is decoded as ©cat. This behavior complies with the spec but can lead to unexpected results.

    options.position

    Starting position of value (Position or Point, optional). Useful when dealing with values nested in some sort of syntax tree. The default is:

    {line: 1, column: 1, offset: 0}

    options.warning

    Error handler (Function?).

    options.text

    Text handler (Function?).

    options.reference

    Reference handler (Function?).

    options.warningContext

    Context used when calling warning ('*', optional).

    options.textContext

    Context used when calling text ('*', optional).

    options.referenceContext

    Context used when calling reference ('*', optional).
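
    The context options only matter for handlers that read `this`. A minimal sketch of how a handler and its context could be paired, assuming the handler is a regular function rather than an arrow function (arrow functions ignore the `this` they are called with). The names `countWarning` and `context` are made up for illustration and are not part of the package.

```javascript
// Hypothetical handler with the `warning(reason, point, code)` signature.
// It is a regular function so that `this` can refer to `warningContext`.
function countWarning(reason, point, code) {
  this.count += 1
  this.codes.push(code)
}

// Hypothetical context object that the handler mutates.
const context = {count: 0, codes: []}

// Options object pairing the handler with its context.
const options = {warning: countWarning, warningContext: context}
```

    Passing `options` as the second argument to `parseEntities` should then update `context` once per warning.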

    Returns

    string — decoded value.

    function warning(reason, point, code)

    Error handler.

    Parameters
    • this (*) — refers to warningContext when given to parseEntities
    • reason (string) — human-readable reason for emitting a parse error
    • point (Point) — place where the error occurred
    • code (number) — machine-readable code for the error

    The following codes are used:

    Code  Example            Note
    1     foo &amp bar       Missing semicolon (named)
    2     foo &#123 bar      Missing semicolon (numeric)
    3     Foo &bar baz       Empty (named)
    4     Foo &#             Empty (numeric)
    5     Foo &bar; baz      Unknown (named)
    6     Foo &#128; baz     Disallowed reference
    7     Foo &#xD800; baz   Prohibited: outside permissible unicode range
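
    When collecting warnings, for example in a linter, the numeric codes can be mapped to short labels. The following is a rough sketch rather than part of the package: `warningLabels` and `createWarningCollector` are hypothetical names, and the labels are copied from the table above.

```javascript
// Labels for the warning codes, copied from the table above.
// `warningLabels` is a hypothetical helper, not an export of parse-entities.
const warningLabels = {
  1: 'Missing semicolon (named)',
  2: 'Missing semicolon (numeric)',
  3: 'Empty (named)',
  4: 'Empty (numeric)',
  5: 'Unknown (named)',
  6: 'Disallowed reference',
  7: 'Prohibited: outside permissible unicode range'
}

// Create a handler with the documented `warning(reason, point, code)`
// signature that stores each call instead of printing it.
function createWarningCollector() {
  const warnings = []

  function warning(reason, point, code) {
    warnings.push({
      code,
      label: warningLabels[code] || 'Unknown code',
      reason,
      line: point.line,
      column: point.column
    })
  }

  return {warning, warnings}
}
```

    To use it, pass the handler as `options.warning`, for example `parseEntities(value, {warning: collector.warning})` where `collector` is the returned object, and read `collector.warnings` afterwards.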

    function text(value, position)

    Text handler.

    Parameters
    • this (*) — refers to textContext when given to parseEntities
    • value (string) — string of content
    • position (Position) — place where value starts and ends

    function reference(value, position, source)

    Character reference handler.

    Parameters
    • this (*) — refers to referenceContext when given to parseEntities
    • value (string) — decoded character reference
    • position (Position) — place where source starts and ends
    • source (string) — raw source of character reference
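
    The text and reference handlers can be combined to split a value into tokens. A small sketch, assuming both handlers are called in document order; `createTokenCollector` and the token shape are hypothetical and not part of the package.

```javascript
// Build a pair of handlers that record every call as a token.
function createTokenCollector() {
  const tokens = []

  // Matches the documented `text(value, position)` signature.
  function text(value, position) {
    tokens.push({type: 'text', value, position})
  }

  // Matches the documented `reference(value, position, source)` signature.
  // `value` is the decoded character, `source` the raw reference.
  function reference(value, position, source) {
    tokens.push({type: 'reference', value, source, position})
  }

  return {text, reference, tokens}
}
```

    Pass `text` and `reference` from the returned object as `options.text` and `options.reference`; `tokens` then holds one entry per handler call.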

    Types

    This package is fully typed with TypeScript. It also exports the types Options, WarningHandler, ReferenceHandler, and TextHandler, which model their respective values.

    Compatibility

    This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 12.20+, 14.14+, and 16.0+. It also works in Deno and modern browsers.

    Security

    This package is safe: it matches the HTML spec to parse character references.

    Contribute

    Yes please! See How to Contribute to Open Source.

    License

    MIT © Titus Wormer
