Search results
460 packages found
A Tokenizer class that converts text documents into sequences of tokens
Wix Restaurants credit-cards tokenizer
TS tokenizer for Mistral-based LLMs
stream-json is a micro-library of Node.js stream components for creating custom JSON processing pipelines with a minimal memory footprint. It can parse JSON files far exceeding available memory, streaming individual primitives using a SAX-inspired API.
JavaScript implementation of a Japanese morphological analyzer
A React Native-compatible JavaScript implementation of a Japanese morphological analyzer
Fastly VCL tokenizer
Isomorphic utilities for GPT-3 tokenization and prompt building.
Tokenizes a string that represents a regular expression.
Simple algorithm to tokenize Chinese texts into words using CC-CEDICT.
A CLI tool to concatenate all text files in your CWD with headers for GPT prompt engineering.
- text
- concatenate
- CLI
- Current Working Directory
- GPT
- tokens
- tokenizer
- ChatGPT
- prompt engineering
- token-count
- text manipulation
- file concatenation
- command line interface
- GPT-3
Parses PHP code from JS and returns its AST
TS tokenizer for Mistral-based LLMs
Simple synchronous string tokenizer using Regex
Simple but powerful lexical scanner, a more minimal implementation of X-Scanner
- y-scanner
- yscanner
- x-scanner
- xscanner
- stringscanner
- scanner
- string
- text
- textscanner
- lex
- lexer
- lexical
- parse
- parser
A streaming JSON tokenizer
A tokenizer for Google-like search queries
A small ECMAScript parser, tokenizer and minifier written in JavaScript.
Time a JavaScript tokenizer