tokenize-whitespace

0.0.1 • Public • Published

Tokenize a string into word and whitespace tokens.

Installation

$ npm install --save tokenize-whitespace

Usage

var tokenizeWhitespace = require('tokenize-whitespace');
 
var str = '\tString \nwith \nwhitespace \rchars';
 
// Get an Array of tokens
tokenizeWhitespace(str);

Output:

[ { text: '\t',         type: 'HORIZONTALTAB',  length: 1  },
  { text: 'String',     type: 'WORD',           length: 6  },
  { text: ' ',          type: 'SPACE',          length: 1  },
  { text: '\n',         type: 'LINEFEED',       length: 1  },
  { text: 'with',       type: 'WORD',           length: 4  },
  { text: ' ',          type: 'SPACE',          length: 1  },
  { text: '\n',         type: 'LINEFEED',       length: 1  },
  { text: 'whitespace', type: 'WORD',           length: 10 },
  { text: ' ',          type: 'SPACE',          length: 1  },
  { text: '\r',         type: 'CARRIAGERETURN', length: 1  },
  { text: 'chars',      type: 'WORD',           length: 5  } 
]
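
Because every token keeps its original text, the token array can be post-processed with ordinary Array methods. The sketch below is only an illustration: it hard-codes the tokens from the Output listing above instead of calling the module, and the variable names (tokens, rebuilt, words, totalLength) are made up for this example.

```javascript
// Tokens copied by hand from the Output listing above (not produced by the module here).
var tokens = [
  { text: '\t',         type: 'HORIZONTALTAB',  length: 1  },
  { text: 'String',     type: 'WORD',           length: 6  },
  { text: ' ',          type: 'SPACE',          length: 1  },
  { text: '\n',         type: 'LINEFEED',       length: 1  },
  { text: 'with',       type: 'WORD',           length: 4  },
  { text: ' ',          type: 'SPACE',          length: 1  },
  { text: '\n',         type: 'LINEFEED',       length: 1  },
  { text: 'whitespace', type: 'WORD',           length: 10 },
  { text: ' ',          type: 'SPACE',          length: 1  },
  { text: '\r',         type: 'CARRIAGERETURN', length: 1  },
  { text: 'chars',      type: 'WORD',           length: 5  }
];

// Concatenating every token's text gives back the input string.
var rebuilt = tokens.map(function (t) { return t.text; }).join('');

// Keep only the words, dropping all whitespace tokens.
var words = tokens
  .filter(function (t) { return t.type === 'WORD'; })
  .map(function (t) { return t.text; });

// Summing the length fields gives the length of the input string.
var totalLength = tokens.reduce(function (sum, t) { return sum + t.length; }, 0);

console.log(JSON.stringify(rebuilt)); // "\tString \nwith \nwhitespace \rchars"
console.log(words);                   // [ 'String', 'with', 'whitespace', 'chars' ]
console.log(totalLength);             // 32
```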

Relevant Whitespace

Currently, this module only tokenizes the following whitespace characters:

  • HORIZONTALTAB => '\t'
  • LINEFEED => '\n'
  • VERTICALTAB => '\v'
  • FORMFEED => '\f'
  • CARRIAGERETURN => '\r'
  • SPACE => ' '

Please open an issue or submit a pull request if you would like to see more whitespace character support.
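
To show how the mapping above leads to the token shapes in the Output section, here is a minimal sketch of a tokenizer built on the same character table. It is not the module's actual source: the names WHITESPACE_TYPES and sketchTokenize are hypothetical, and it assumes that every character outside the table belongs to a WORD token.

```javascript
// Hypothetical sketch, NOT the module's real implementation.
// Lookup table taken from the list of supported whitespace characters.
var WHITESPACE_TYPES = {
  '\t': 'HORIZONTALTAB',
  '\n': 'LINEFEED',
  '\v': 'VERTICALTAB',
  '\f': 'FORMFEED',
  '\r': 'CARRIAGERETURN',
  ' ':  'SPACE'
};

function sketchTokenize(str) {
  var tokens = [];
  var word = '';

  // Emit the word collected so far, if any, as a single WORD token.
  function flushWord() {
    if (word.length > 0) {
      tokens.push({ text: word, type: 'WORD', length: word.length });
      word = '';
    }
  }

  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);
    if (Object.prototype.hasOwnProperty.call(WHITESPACE_TYPES, ch)) {
      // A known whitespace character ends the current word
      // and becomes a one-character token of its own.
      flushWord();
      tokens.push({ text: ch, type: WHITESPACE_TYPES[ch], length: 1 });
    } else {
      // Assumption: anything not in the table is part of a word.
      word += ch;
    }
  }
  flushWord();
  return tokens;
}

var sketchTokens = sketchTokenize('\tString \nwith \nwhitespace \rchars');
console.log(sketchTokens.length); // 11
```

Under these assumptions, supporting another whitespace character would only require one more entry in the lookup table.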

License

MIT @ Michael Wuergler
