    snapdragon-lexer

    4.0.0 • Public • Published

    Converts a string into an array of tokens, with useful methods for looking ahead and behind, capturing, matching, et cetera.

    Please consider following this project's author, Jon Schlinkert, and consider starring the project to show your ❤️ and support.

    Install

    Install with npm:

    $ npm install --save snapdragon-lexer

    Breaking changes in v2.0!

    Please see the changelog for details!

    Usage

    const Lexer = require('snapdragon-lexer');
    const lexer = new Lexer();
     
    lexer.capture('slash', /^\//);
    lexer.capture('text', /^\w+/);
    lexer.capture('star', /^\*/);
     
    console.log(lexer.tokenize('foo/*'));
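
    To make the capture-then-tokenize flow concrete, the following dependency-free sketch mimics the idea with plain regular expressions. It is an illustration only, not how snapdragon-lexer is implemented; `rules` and `tokenizeSketch` are hypothetical names, and the tokens are plain objects rather than snapdragon-token instances.

```javascript
// Hypothetical sketch; NOT snapdragon-lexer's implementation.
// Each rule pairs a token type with a regex anchored at the start of the string.
const rules = [
  ['slash', /^\//],
  ['text', /^\w+/],
  ['star', /^\*/]
];

function tokenizeSketch(input) {
  const tokens = [];
  let rest = input;
  while (rest.length > 0) {
    let matched = false;
    for (const [type, regex] of rules) {
      const match = regex.exec(rest);
      if (match) {
        tokens.push({ type, value: match[0] });
        rest = rest.slice(match[0].length); // drop the matched prefix
        matched = true;
        break;
      }
    }
    // No rule matched: report the substring that could not be lexed.
    if (!matched) throw new Error(`unrecognized input: "${rest}"`);
  }
  return tokens;
}

console.log(tokenizeSketch('foo/*'));
//=> [ { type: 'text', value: 'foo' },
//     { type: 'slash', value: '/' },
//     { type: 'star', value: '*' } ]
```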

    API

    Lexer

    Create a new Lexer with the given options.

    Params

    • input {string|Object}: (optional) Input string or options. You can also set input directly on lexer.input after initializing.
    • options {object}

    Example

    const Lexer = require('snapdragon-lexer');
    const lexer = new Lexer('foo/bar');

    .bos

    Returns true if we are still at the beginning-of-string, and no part of the string has been consumed.

    • returns {boolean}

    .eos

    Returns true if lexer.string and lexer.queue are empty.

    • returns {boolean}

    .set

    Register a handler function.

    Params

    • type {string}
    • fn {function}: The handler function to register.

    Example

    lexer.set('star', function() {
      // do parser, lexer, or compiler stuff
    });

    .get

    Get a registered handler function.

    Params

    • type {string}
    • returns {function}: The registered handler function.

    Example

    lexer.set('star', function() {
      // do lexer stuff
    });
    const star = lexer.get('star');

    .has

    Returns true if the lexer has a registered handler of the given type.

    Params

    • type {string}
    • returns {boolean}

    Example

    lexer.set('star', function() {});
    console.log(lexer.has('star')); // true

    .token

    Create a new Token with the given type and value.

    Params

    • type {string|Object}: (required) The type of token to create
    • value {string}: (optional) The captured string
    • match {array}: (optional) Match results from String.match() or RegExp.exec()
    • returns {Object}: Returns an instance of snapdragon-token

    Events

    • emits: token

    Example

    console.log(lexer.token({type: 'star', value: '*'}));
    console.log(lexer.token('star', '*'));
    console.log(lexer.token('star'));

    .isToken

    Returns true if the given value is a snapdragon-token instance.

    Params

    • token {object}
    • returns {boolean}

    Example

    const Token = require('snapdragon-token');
    lexer.isToken({}); // false
    lexer.isToken(new Token({type: 'star', value: '*'})); // true

    .consume

    Consume the given length from lexer.string. The consumed value is used to update lexer.state.consumed, as well as the current position.

    Params

    • len {number}
    • value {string}: Optionally pass the value being consumed.
    • returns {String}: Returns the consumed value

    Example

    lexer.consume(1);
    lexer.consume(1, '*');
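
    As a rough illustration of what consuming involves, the sketch below slices a prefix off the remaining string, appends it to the consumed text, and advances a location record. The `state` object and `consumeSketch` function are hypothetical stand-ins; the location conventions (0-indexed index and column, 1-indexed line) are taken from the lexer.loc description in this README, and the update logic itself is an assumption.

```javascript
// Hypothetical stand-in for lexer state; the update logic is an assumption,
// not the library's source.
const state = {
  string: 'a\nbc',
  consumed: '',
  loc: { index: 0, column: 0, line: 1 }
};

function consumeSketch(state, len, value = state.string.slice(0, len)) {
  state.string = state.string.slice(len); // drop the consumed prefix
  state.consumed += value;                // remember what was consumed
  for (let i = 0; i < value.length; i++) {
    state.loc.index++;
    if (value[i] === '\n') {
      state.loc.line++;                   // a newline starts a new line
      state.loc.column = 0;
    } else {
      state.loc.column++;
    }
  }
  return value;
}

const first = consumeSketch(state, 2);       // consumes 'a\n'
const second = consumeSketch(state, 1, 'b'); // value passed explicitly
console.log(state.loc);
//=> { index: 3, column: 1, line: 2 }
```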

    Returns a function for updating a token with lexer location information.

    • returns {function}

    .match

    Use the given regex to match a substring from lexer.string. Also validates the regex to ensure that it starts with ^ since matching should always be against the beginning of the string, and throws if the regex matches an empty string, which can cause catastrophic backtracking.

    Params

    • regex {regExp}: (required)
    • returns {Array|null}: Returns the match array from RegExp.exec or null.

    Example

    const lexer = new Lexer('foo/bar');
    const match = lexer.match(/^\w+/);
    console.log(match);
    //=> [ 'foo', index: 0, input: 'foo/bar' ]
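
    The two validation rules described above (the regex must be anchored with ^, and it must not match an empty string) can be sketched without the library as follows. `matchSketch` is a hypothetical helper, and the error messages are invented for illustration; the real method may check and report these conditions differently.

```javascript
// Hypothetical sketch of the validation described for .match();
// error messages and structure are assumptions.
function matchSketch(regex, string) {
  if (regex.source[0] !== '^') {
    throw new Error('expected regex to start with "^"');
  }
  const match = regex.exec(string);
  if (!match) return null;
  if (match[0] === '') {
    // An empty match consumes nothing, so a lexing loop could never advance.
    throw new Error('regex should not match an empty string');
  }
  return match;
}

console.log(matchSketch(/^\w+/, 'foo/bar')[0]); //=> 'foo'
console.log(matchSketch(/^\d+/, 'foo/bar'));    //=> null
```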

    .scan

    Scan for a matching substring by calling .match() with the given regex. If a match is found, 1) a token of the specified type is created, 2) match[0] is used as token.value, and 3) the length of match[0] is sliced from lexer.string (by calling .consume()).

    Params

    • regex {regExp}
    • type {string}
    • returns {Object}: Returns a token if a match is found, otherwise undefined.

    Events

    • emits: scan

    Example

    lexer.string = '/foo/';
    console.log(lexer.scan(/^\//, 'slash'));
    //=> Token { type: 'slash', value: '/' }
    console.log(lexer.scan(/^\w+/, 'text'));
    //=> Token { type: 'text', value: 'foo' }
    console.log(lexer.scan(/^\//, 'slash'));
    //=> Token { type: 'slash', value: '/' }

    .capture

    Capture a token of the specified type using the provided regex for scanning and matching substrings. Automatically registers a handler when a function is passed as the last argument.

    Params

    • type {string}: (required) The type of token being captured.
    • regex {regExp}: (required) The regex for matching substrings.
    • fn {function}: (optional) If supplied, the function will be called on the token before pushing it onto lexer.tokens.
    • returns {Object}

    Example

    lexer.capture('text', /^\w+/);
    lexer.capture('text', /^\w+/, token => {
      if (token.value === 'foo') {
        // do stuff
      }
      return token;
    });

    .handle

    Calls handler type on lexer.string.

    Params

    • type {string}: The handler type to call on lexer.string
    • returns {Object}: Returns a token of the given type or undefined.

    Events

    • emits: handle

    Example

    const lexer = new Lexer('/a/b');
    lexer.capture('slash', /^\//);
    lexer.capture('text', /^\w+/);
    console.log(lexer.handle('text'));
    //=> undefined
    console.log(lexer.handle('slash'));
    //=> { type: 'slash', value: '/' }
    console.log(lexer.handle('text'));
    //=> { type: 'text', value: 'a' }

    .advance

    Get the next token by iterating over lexer.handlers and calling each handler on lexer.string until a handler returns a token. If no handlers return a token, an error is thrown with the substring that couldn't be lexed.

    • returns {Object}: Returns the first token returned by a handler, or the first character in the remaining string if options.mode is set to character.

    Example

    const token = lexer.advance();

    .lex

    Tokenizes a string and returns an array of tokens.

    Params

    • input {string}: The string to lex.
    • returns {Array}: Returns an array of tokens.

    Example

    let lexer = new Lexer({ handlers: otherLexer.handlers })
    lexer.capture('slash', /^\//);
    lexer.capture('text', /^\w+/);
    const tokens = lexer.lex('a/b/c');
    console.log(tokens);
    // Results in:
    // [ Token { type: 'text', value: 'a' },
    //   Token { type: 'slash', value: '/' },
    //   Token { type: 'text', value: 'b' },
    //   Token { type: 'slash', value: '/' },
    //   Token { type: 'text', value: 'c' } ]

    .enqueue

    Push a token onto the lexer.queue array.

    Params

    • token {object}
    • returns {Object}: Returns the given token with updated token.index.

    Example

    console.log(lexer.queue.length); // 0
    lexer.enqueue(new Token('star', '*'));
    console.log(lexer.queue.length); // 1

    .dequeue

    Shift a token from lexer.queue.

    • returns {Object}: Returns the token that was shifted from lexer.queue.

    Example

    console.log(lexer.queue.length); // 1
    lexer.dequeue();
    console.log(lexer.queue.length); // 0

    .lookbehind

    Lookbehind n tokens.

    Params

    • n {number}
    • returns {Object}

    Example

    const token = lexer.lookbehind(2);

    .prev

    Get the previously lexed token.

    • returns {Object|undefined}: Returns a token or undefined.

    Example

    const token = lexer.prev();

    .lookahead

    Lookahead n tokens and return the last token. Pushes any intermediate tokens onto lexer.queue. To lookahead a single token, use .peek().

    Params

    • n {number}
    • returns {Object}

    Example

    const token = lexer.lookahead(2);

    .peek

    Lookahead a single token.

    • returns {Object}: Returns a token.

    Example

    const token = lexer.peek();

    .next

    Get the next token, either from the queue or by advancing.

    • returns {Object|String}: Returns a token. When options.mode is set to character, returns the next character from lexer.queue, or consumes the next character in the string.

    Example

    const token = lexer.next();

    .skip

    Skip n tokens or characters in the string. Skipped values are not enqueued.

    Params

    • n {number}
    • returns {Array}: Returns an array of skipped tokens.

    Example

    const token = lexer.skip(1);

    .skipWhile

    Skip tokens while the given fn returns true.

    Params

    • fn {function}: Return true if a token should be skipped.
    • returns {Array}: Returns an array of skipped tokens.

    Example

    lexer.skipWhile(tok => tok.type !== 'space');

    .skipType

    Skip the given token types.

    Params

    • types {string|Array}: One or more token types to skip.
    • returns {Array}: Returns an array of skipped tokens.

    Example

    lexer.skipType('space');
    lexer.skipType(['newline', 'space']);

    .push

    Pushes the given token onto lexer.tokens and calls .append() to push token.value onto lexer.stash. Disable pushing onto the stash by setting lexer.options.append or token.append to false.

    Params

    • token {object|String}
    • returns {Object}: Returns the given token.

    Events

    • emits: push

    Example

    console.log(lexer.tokens.length); // 0
    lexer.push(new Token('star', '*'));
    console.log(lexer.tokens.length); // 1
    console.log(lexer.stash) // ['*']

    .append

    Append a string to the last element on lexer.stash, or push the string onto the stash if no elements exist.

    Params

    • value {String}
    • returns {String}: Returns the last value in the array.

    Example

    const stack = new Stack();
    stack.push('a');
    stack.push('b');
    stack.push('c');
    stack.append('_foo');
    stack.append('_bar');
    console.log(stack);
    //=> Stack ['a', 'b', 'c_foo_bar']

    .isInside

    Returns true if a token with the given type is on the stack.

    Params

    • type {string}: The type to check for.
    • returns {boolean}

    Example

    if (lexer.isInside('bracket') || lexer.isInside('brace')) {
      // do stuff
    }

    .error

    Throw a formatted error message with details including the cursor position.

    Params

    • msg {string}: Message to use in the Error.
    • node {object}
    • returns {undefined}

    Example

    lexer.set('foo', function(tok) {
      if (tok.value !== 'foo') {
        throw this.state.error('expected token.value to be "foo"', tok);
      }
    });

    .use

    Call a plugin function on the lexer instance.

    Params

    • fn {function}
    • returns {object}: Returns the lexer instance.

    Example

    lexer.use(function(lexer) {
      // do stuff to lexer
    });

    Lexer#isLexer

    Static method that returns true if the given value is an instance of snapdragon-lexer.

    Params

    • lexer {object}
    • returns {Boolean}

    Example

    const Lexer = require('snapdragon-lexer');
    const lexer = new Lexer();
    console.log(Lexer.isLexer(lexer)); //=> true
    console.log(Lexer.isLexer({})); //=> false

    Lexer#isToken

    Static method that returns true if the given value is an instance of snapdragon-token. This is a proxy to Token#isToken.

    Params

    • token {object}
    • returns {Boolean}

    Example

    const Token = require('snapdragon-token');
    const Lexer = require('snapdragon-lexer');
    console.log(Lexer.isToken(new Token({type: 'foo'}))); //=> true
    console.log(Lexer.isToken({})); //=> false

    Lexer#State

    The State class, exposed as a static property.

    Lexer#Token

    The Token class, exposed as a static property.

    Properties

    lexer.isLexer

    Type: {boolean}

    Default: true (constant)

    This property is defined as a convenience, to make it easy for plugins to check for an instance of Lexer.

    lexer.input

    Type: {string}

    Default: ''

    The unmodified source string provided by the user.

    lexer.string

    Type: {string}

    Default: ''

    The source string minus the part of the string that has already been consumed.

    lexer.consumed

    Type: {string}

    Default: ''

    The part of the source string that has been consumed.

    lexer.tokens

    Type: {array}

    Default: []

    Array of lexed tokens.

    lexer.stash

    Type: {array}

    Default: [''] (instance of snapdragon-stack)

    Array of captured strings. Similar to the lexer.tokens array, but stores strings instead of token objects.

    lexer.stack

    Type: {array}

    Default: []

    LIFO (last in, first out) array. A token is pushed onto the stack when an "opening" character or character sequence needs to be tracked. When the (matching) "closing" character or character sequence is encountered, the (opening) token is popped off of the stack.

    The stack is not used by any lexer methods; it is reserved for the user. Stacks are necessary for creating Abstract Syntax Trees (ASTs), but if you require this functionality it would be better to use a parser such as snapdragon-parser, which has methods and other conveniences for creating an AST.
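
    The open/close bookkeeping described above can be sketched without the library as follows. Everything here is hypothetical: `markInside` and its local `isInside` helper only mirror the descriptions of lexer.stack and .isInside in this README, and the square-bracket syntax is chosen purely for illustration.

```javascript
// Hypothetical sketch of LIFO tracking for opening/closing characters.
function markInside(input) {
  const stack = [];
  // Local helper: true if a token of the given type is on the stack.
  const isInside = type => stack.some(tok => tok.type === type);
  const result = [];
  for (const ch of input) {
    if (ch === '[') {
      stack.push({ type: 'bracket', value: ch }); // track the opener
      continue;
    }
    if (ch === ']') {
      if (stack.length === 0) throw new Error('unmatched "]"');
      stack.pop();                                // closer pops the opener
      continue;
    }
    result.push({ value: ch, insideBracket: isInside('bracket') });
  }
  if (stack.length > 0) throw new Error('unclosed "["');
  return result;
}

console.log(markInside('a[b]c'));
//=> [ { value: 'a', insideBracket: false },
//     { value: 'b', insideBracket: true },
//     { value: 'c', insideBracket: false } ]
```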

    lexer.queue

    Type: {array}

    Default: []

    FIFO (first in, first out) array, for temporarily storing tokens that are created when .lookahead() is called (or a method that calls .lookahead(), such as .peek()).

    Tokens are dequeued when .next() is called.
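
    The interplay between looking ahead and the queue can be sketched as below. `createSource` is a hypothetical helper that serves pre-made tokens from an array instead of lexing a string; the method names echo this README, but the bodies are assumptions rather than the library's code.

```javascript
// Hypothetical sketch: a token source with a FIFO lookahead queue.
function createSource(tokens) {
  const pending = tokens.slice();       // tokens not yet produced
  const queue = [];                     // tokens produced by lookahead
  const advance = () => pending.shift();
  return {
    queue,
    lookahead(n) {
      // Produce tokens until the queue holds n of them (or input runs out).
      while (queue.length < n) {
        const tok = advance();
        if (tok === undefined) break;
        queue.push(tok);
      }
      return queue[n - 1];
    },
    peek() {
      return this.lookahead(1);
    },
    next() {
      // Queued tokens are served first, in the order they were produced.
      return queue.length > 0 ? queue.shift() : advance();
    }
  };
}

const source = createSource(['a', 'b', 'c']);
console.log(source.lookahead(2)); //=> 'b'
console.log(source.queue.length); //=> 2
console.log(source.next());       //=> 'a'
```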

    lexer.loc

    Type: {Object}

    Default: { index: 0, column: 0, line: 1 }

    The updated source string location with the following properties.

    • index - 0-index
    • column - 0-index
    • line - 1-index

    The following plugins are available for automatically updating tokens with the location:

    Options

    options.source

    Type: {string}

    Default: undefined

    The source of the input string. This is typically a filename or file path, but can also be 'string' if a string or buffer is provided directly.

    If lexer.input is undefined, and options.source is a string, the lexer will attempt to set lexer.input by calling fs.readFileSync() on the value provided on options.source.

    options.mode

    Type: {string}

    Default: undefined

    If options.mode is character, instead of calling handlers (which match using regex) the .advance() method will consume and return one character at a time.

    options.value

    Type: {string}

    Default: undefined

    Specify the token property to use when the .push method pushes a value onto lexer.stash. The logic works something like this:

    lexer.append(token[lexer.options.value || 'value']);
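
    Putting the pieces together, a stand-alone sketch of how pushing, appending, and options.value might interact is shown below. `pushSketch` and `lexerLike` are hypothetical; the `raw` property is an invented token field used only to show options.value selecting something other than `value`, and the append rules follow the .push and .append descriptions in this README.

```javascript
// Hypothetical sketch; `lexerLike` is a plain object, not a Lexer instance.
function pushSketch(lexer, token) {
  lexer.tokens.push(token);
  // Skip the stash when appending is disabled on the options or the token.
  if (lexer.options.append !== false && token.append !== false) {
    const value = token[lexer.options.value || 'value'];
    const last = lexer.stash.length - 1;
    if (last < 0) {
      lexer.stash.push(value);    // empty stash: push the string
    } else {
      lexer.stash[last] += value; // otherwise append to the last element
    }
  }
  return token;
}

const lexerLike = { options: { value: 'raw' }, tokens: [], stash: [''] };
pushSketch(lexerLike, { type: 'star', value: '*', raw: '\\*' });
pushSketch(lexerLike, { type: 'text', value: 'a', raw: 'a', append: false });
console.log(lexerLike.tokens.length); //=> 2
console.log(lexerLike.stash);         //=> [ '\\*' ]
```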

    Tokens

    See the snapdragon-token documentation for more details.

    Plugins

    Plugins are registered with the lexer.use() method and use the following conventions.

    Plugin Conventions

    Plugins are functions that take an instance of snapdragon-lexer.

    However, it's recommended that you always wrap your plugin function in another function that takes an options object. This allows users to pass options when using the plugin. Even if your plugin doesn't take options, it's a best practice for users to always be able to use the same signature.

    Example

    function plugin(options) {
      return function(lexer) {
        // do stuff 
      };
    }
     
    lexer.use(plugin());

    About

    Contributing

    Pull requests and stars are always welcome. For bugs and feature requests, please create an issue.

    Please read the contributing guide for advice on opening issues, pull requests, and coding standards.

    Running Tests

    Running and reviewing unit tests is a great way to get familiarized with a library and its API. You can install dependencies and run tests with the following command:

    $ npm install && npm test

    Building docs

    (This project's readme.md is generated by verb, please don't edit the readme directly. Any changes to the readme must be made in the .verb.md readme template.)

    To generate the readme, run the following command:

    $ npm install -g verbose/verb#dev verb-generate-readme && verb

    Related projects

    You might also be interested in these projects:

    Author

    Jon Schlinkert

    License

    Copyright © 2018, Jon Schlinkert. Released under the MIT License.


    This file was generated by verb-generate-readme, v0.8.0, on November 19, 2018.
