raw-yaml
Read anything as YAML
A maximally liberal YAML 1.2 parser:
- Will do its best to turn any string input into a YAML-ish AST representation
- Fully supports programmatic handling of YAML comments and multi-document streams
- Does practically no error checking, and should never throw -- but if you feed it garbage, you'll likely get a garbage AST as well
- Has no runtime dependencies and compresses to 11kB
- Minimises memory consumption by being really lazy; doesn't calculate node string values until you specifically ask for them
- Output object overrides toString at all levels to provide an idempotent YAML string representation
- Allows (slightly clumsy) editing with a settable Node#value
- Tested against all examples included in the YAML 1.2 spec
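The lazy behaviour listed above can be pictured as a getter that only slices the source string when it is first read. The following is a minimal sketch of that idea and not raw-yaml's actual code: the LazyNode class, its computed counter and the caching step are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for a parsed node; not raw-yaml's actual class.
class LazyNode {
  constructor(src, start, end) {
    this.src = src       // full original source
    this.start = start   // offset of first character
    this.end = end       // offset after last character
    this.computed = 0    // counts how often the slice is actually taken
    this._cache = undefined
  }

  // The string value is derived from the source only on first access.
  get strValue() {
    if (this._cache === undefined) {
      this.computed += 1
      this._cache = this.src.slice(this.start, this.end)
    }
    return this._cache
  }
}

const lazySrc = 'sky: blue'
const lazyNode = new LazyNode(lazySrc, 5, 9)
const computedBefore = lazyNode.computed // nothing sliced yet
const firstRead = lazyNode.strValue
const secondRead = lazyNode.strValue     // served from the cache
```

Until strValue is read, the node holds only offsets into the shared source string, which is what keeps memory use low.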
Usage & API
To install:
npm install raw-yaml
To use:
import parse from 'raw-yaml'

const str = `
sequence: [ one, two, ]
mapping: { sky: blue, sea: green }
---
- "flow in block"
- >
 Block scalar
- !!map # Block collection
  foo : bar
`

const ast = parse(str)

ast[0]            // first document, containing a map with two keys
  .contents[0]    // document contents (as opposed to directives)
  .items[3].node  // the last item, a flow map
  .items[3]       // the fourth token, parsed as a plain value
  .strValue       // 'blue'

ast[1]            // second document, containing a sequence
  .contents[0]    // document contents (as opposed to directives)
  .items[1].node  // the second item, a block value
  .strValue       // 'Block scalar\n'
parse(string): Array<Document>
The public API of the library is a single function which returns an array of parsed YAML documents (see below for details). The array and its contained nodes override the default toString method, each returning a YAML representation of its contents.
If a node has its value set, that will be used when re-stringifying. Take care when modifying the AST, as no error checks are included to verify that the resulting YAML is valid, or that e.g. indentation levels aren't broken. In other words, this is an engineering tool and you may hurt yourself. If you're looking to generate a brand new YAML document, you should probably not be using this library directly.
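To illustrate how a settable value can coexist with an idempotent toString, here is a minimal sketch under stated assumptions. SketchNode and stringify are hypothetical names and not part of raw-yaml; the only behaviour borrowed from the description above is that a non-null value takes precedence over the original source slice.

```javascript
// Hypothetical model of a node; raw-yaml's real classes are more involved.
class SketchNode {
  constructor(src, start, end) {
    this.src = src
    this.range = { start, end }
    this.value = null // if non-null, overrides the source value
  }

  toString() {
    return this.value != null
      ? this.value
      : this.src.slice(this.range.start, this.range.end)
  }
}

// Rebuild the text: copy the gaps between nodes from the source verbatim
// and let each node decide how to render itself.
function stringify(src, nodes) {
  let out = ''
  let pos = 0
  for (const node of nodes) {
    out += src.slice(pos, node.range.start) + node.toString()
    pos = node.range.end
  }
  return out + src.slice(pos)
}

const editSrc = 'sky: blue\nsea: green\n'
const blue = new SketchNode(editSrc, 5, 9)
const green = new SketchNode(editSrc, 15, 20)

const untouched = stringify(editSrc, [blue, green]) // same as editSrc
blue.value = 'grey'
const edited = stringify(editSrc, [blue, green])
```

Nothing in this sketch validates the new value, which mirrors the warning above: an override that breaks indentation or quoting is written out as-is.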
If using the module in a CommonJS environment, the default-exported function is available at require('raw-yaml').default.
For more usage examples and AST trees, take a look through the __tests__ directory.
AST Structure
For an example, here's what the first few lines of this file look like when parsed by raw-yaml (simplified for clarity):
[ {
  directives: [
    { type: 'COMMENT', comment: ' raw-yaml' },
    { type: 'COMMENT', comment: '## Read anything as YAML' }
  ],
  contents: [ {
    type: 'MAP',
    items: [
      { type: 'PLAIN', strValue: 'A maximally liberal YAML 1.2...' },
      { type: 'MAP_VALUE', node: {
        type: 'SEQ',
        items: [
          { type: 'SEQ_ITEM', node: { strValue: 'Will do its...' } },
          { type: 'SEQ_ITEM', node: { strValue: 'Fully supports...' } }
        ]
      } }
    ]
  } ]
} ]
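A tree of this shape can be traversed with a small recursive helper. The sketch below works on a hand-written plain object that mirrors the simplified example, not on real parser output; collectStrValues is a hypothetical helper and not part of the library. On real nodes, reading strValue may throw for QUOTE_DOUBLE scalars with bad escape sequences, so a try/catch may be needed there.

```javascript
// Collect every string strValue found under a node, in document order.
function collectStrValues(node, out = []) {
  // Flow collections may contain plain strings such as '{' or ','; skip them.
  if (node == null || typeof node !== 'object') return out
  if (typeof node.strValue === 'string') out.push(node.strValue)
  for (const key of ['directives', 'contents', 'items']) {
    if (Array.isArray(node[key])) {
      for (const child of node[key]) collectStrValues(child, out)
    }
  }
  if (node.node) collectStrValues(node.node, out) // MAP_VALUE, SEQ_ITEM, ...
  return out
}

// Hand-written stand-in for one parsed document (simplified).
const sampleDoc = {
  directives: [{ type: 'COMMENT', comment: ' raw-yaml' }],
  contents: [{
    type: 'MAP',
    items: [
      { type: 'PLAIN', strValue: 'A maximally liberal YAML 1.2...' },
      { type: 'MAP_VALUE', node: {
        type: 'SEQ',
        items: [
          { type: 'SEQ_ITEM', node: { strValue: 'Will do its...' } },
          { type: 'SEQ_ITEM', node: { strValue: 'Fully supports...' } }
        ]
      } }
    ]
  }]
}

const strValues = collectStrValues(sampleDoc)
```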
Each node in the AST extends a common ancestor Node (using flow-ish notation, so + as a prefix indicates a read-only property):
class Range {
  start: number,        // offset of first character
  end: number,          // offset after last character
  +isEmpty: boolean     // true if end is not greater than start
}

class Node {
  context: {
    atLineStart: boolean, // is this node the first one on this line
    indent: number,       // current level of indentation (may be -1)
    src: string           // the full original source
  },
  error: ?Error,          // if not null, indicates a parser failure
  props: Array<Range>,    // anchors, tags and comments
  range: Range,           // span of context.src parsed into this node
  type:                   // specific node type
    'ALIAS' | 'BLOCK_FOLDED' | 'BLOCK_LITERAL' | 'COMMENT' |
    'DIRECTIVE' | 'DOCUMENT' | 'FLOW_MAP' | 'FLOW_SEQ' |
    'MAP' | 'MAP_KEY' | 'MAP_VALUE' | 'PLAIN' |
    'QUOTE_DOUBLE' | 'QUOTE_SINGLE' | 'SEQ' | 'SEQ_ITEM',
  value: ?string,         // if non-null, overrides source value
  +anchor: ?string,       // anchor, if set
  +comment: ?string,      // newline-delimited comment(s), if any
  +rawValue: ?string,     // an unprocessed slice of context.src
                          // determining this node's value
  +tag: null |            // this node's tag, if set
    { verbatim: string } | { handle: string, suffix: string },
  toString(): string      // a YAML string representation of this node
}

class Alias extends Node {
  // rawValue will contain the anchor without the * prefix
  type: 'ALIAS'
}

class Scalar extends Node {
  type: 'PLAIN' | 'QUOTE_DOUBLE' | 'QUOTE_SINGLE' |
    'BLOCK_FOLDED' | 'BLOCK_LITERAL',
  +strValue: ?string      // unescaped string value; may throw for
                          // QUOTE_DOUBLE on bad escape sequences
}

class Comment extends Node {
  type: 'COMMENT',        // PLAIN nodes may also be comment-only
  +anchor: null,
  +comment: string,
  +rawValue: null,
  +tag: null
}

class MapItem extends Node {
  node: ContentNode | null,
  type: 'MAP_KEY' | 'MAP_VALUE'
}

class Map extends Node {
  // implicit keys are not wrapped
  items: Array<Comment | Alias | Scalar | MapItem>,
  type: 'MAP'
}

class SeqItem extends Node {
  node: ContentNode | null,
  type: 'SEQ_ITEM'
}

class Seq extends Node {
  items: Array<Comment | SeqItem>,
  type: 'SEQ'
}

type FlowChar = '{' | '}' | '[' | ']' | ',' | '?' | ':'

class FlowCollection extends Node {
  items: Array<FlowChar | Comment | Alias | Scalar | FlowCollection>,
  type: 'FLOW_MAP' | 'FLOW_SEQ'
}

type ContentNode =
  Comment | Alias | Scalar | Map | Seq | FlowCollection

class Directive extends Node {
  name: string,           // for YAML 1.2 should be 'TAG' or 'YAML'
  type: 'DIRECTIVE',
  +anchor: null,
  +parameters: Array<string>,
  +tag: null
}

class Document extends Node {
  directives: Array<Comment | Directive>,
  contents: Array<ContentNode>,
  type: 'DOCUMENT',
  +anchor: null,
  +comment: null,
  +tag: null
}
In actual code, MapItem and SeqItem are implemented as CollectionItem, and correspondingly Map and Seq as Collection. Due to parsing differences, each scalar type is implemented using its own class. Additional undocumented properties are available for Node, but are likely only useful during parsing.
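Because a flow collection's items mix structural characters with nodes, reading key/value pairs out of a FLOW_MAP takes a little bookkeeping. The following is a rough sketch under stated assumptions: flowMapPairs is a hypothetical helper, the input is a hand-written object rather than real parser output, and only scalar keys and values are handled (nested flow collections would come out as undefined).

```javascript
// Pair up scalar keys and values from a FLOW_MAP-shaped items array.
function flowMapPairs(flowMap) {
  const pairs = []
  let key = null
  let afterColon = false
  for (const item of flowMap.items) {
    if (item === ':') {
      afterColon = true
    } else if (item === ',' || item === '}') {
      if (key !== null) pairs.push([key, null]) // key without a value
      key = null
      afterColon = false
    } else if (typeof item === 'string' || item.type === 'COMMENT') {
      continue // other structural characters and comments carry no data
    } else if (afterColon) {
      pairs.push([key, item.strValue])
      key = null
      afterColon = false
    } else {
      key = item.strValue
    }
  }
  return pairs
}

// Stand-in for the parsed form of `{ sky: blue, sea: green }`.
const sampleFlowMap = {
  type: 'FLOW_MAP',
  items: [
    '{',
    { type: 'PLAIN', strValue: 'sky' }, ':',
    { type: 'PLAIN', strValue: 'blue' }, ',',
    { type: 'PLAIN', strValue: 'sea' }, ':',
    { type: 'PLAIN', strValue: 'green' },
    '}'
  ]
}

const flowPairs = flowMapPairs(sampleFlowMap)
```

With this layout, the value blue sits at index 3 of items, which matches the items[3] lookup in the usage example.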