Search results
470 packages found
Fast Uint8Array to UTF-8 code point iterator for streams and array buffers by @okikio & @jonathantneal
- NLP
- PanGuSegment
- PoS tagging
- analyzer
- async
- chinese
- chinese segmentation
- data
- dict
- dictionary
- file
- hanzi
- jieba
- load
A simple generic string tokenizer
React typeahead with Bootstrap styling
- auto complete
- auto suggest
- auto-complete
- auto-suggest
- autocomplete
- autosuggest
- bootstrap
- bootstrap tokenizer
- bootstrap typeahead
- bootstrap-tokenizer
- bootstrap-typeahead
- react
- react autocomplete
- react autosuggest
Tokenizes strings of text, extracting arrays of words and, optionally, numbers, emojis, tags, usernames, and email addresses. For Node.js and the browser. For when you need more than just [a-z] regular expressions.
Forked version of kuromoji with better compatibility for browsers
TypeScript tokenizer for Mistral-based LLMs
Transforms a string into a list of tokens
- NLP
- PanGuSegment
- PoS tagging
- analyzer
- async
- chinese
- chinese segmentation
- data
- dict
- dictionary
- file
- hanzi
- jieba
- load
The format of the original node-segment
- NLP
- PanGuSegment
- PoS tagging
- analyzer
- async
- chinese
- chinese segmentation
- data
- dict
- dictionary
- file
- hanzi
- jieba
- load
Llama 2 tokenizer for Node.js and the browser
Node-based tokenizers for open-source models hosted on Hugging Face.
TextMate token-based language service for Visual Studio Code.
- vscode
- vscode-extension
- textmate
- grammar
- language-features
- language-service
- language
- lsp
- parse
- syntax
- tokenization
- tokenizer
GPT-4 tokenizer for Node.js and the browser
command_r_plus tokenizer for Node.js and the browser
- NLP
- PanGuSegment
- PoS tagging
- analyzer
- async
- chinese
- chinese segmentation
- data
- dict
- dictionary
- file
- hanzi
- jieba
- load