Search results

470 packages found

Fast Uint8Array to UTF-8 code point iterator for streams and array buffers by @okikio & @jonathantneal

published 1.1.1 7 months ago

Lexer / tokenizer

published 0.6.0 2 years ago

A simple generic string tokenizer

published 1.0.3 5 months ago

Forked version of kuromoji with better compatibility for browsers

published 1.1.0 a year ago

Forked version of kuromoji with better compatibility for browsers

published 1.0.2 6 months ago

TS tokenizer for Mistral-based LLMs

published 1.2.2 5 months ago

Tokenizes strings of text, extracting arrays of words and, optionally, numbers, emojis, tags, usernames, and email addresses. For Node.js and the browser. When you need more than just [a-z] regular expressions.

published 9.1.2 10 months ago

Transforms a string into a list of tokens

published 1.1.6 6 months ago

Textmate token-based language service for Visual Studio Code.

published 3.0.1 7 months ago

llama2 tokenizer for NodeJS/Browser

published 1.1.2 11 days ago

Node-based tokenizers for open source models hosted on HuggingFace.

published 2.0.1 6 months ago

gpt4 tokenizer for NodeJS/Browser

published 1.1.2 11 days ago

command_r_plus tokenizer for NodeJS/Browser

published 1.1.2 11 days ago