Search results

6 packages found

Additional tokenizers for Orama

published version 3.1.6, 4 days ago · 0 dependents · licensed under Apache-2.0
1,583

TypeScript version of PGN Tokenizer, a Byte Pair Encoding (BPE) tokenizer for chess Portable Game Notation (PGN).

published version 0.1.6, 22 days ago · 0 dependents · licensed under MIT
92

OpenVINO™ Tokenizers adds text processing operations to the openvino-node package

published version 2025.1.0, 9 days ago · 1 dependent · licensed under Apache-2.0
63

C++ tokenizer module for fibjs.

published version 1.2.1, 9 months ago · 0 dependents · licensed under MIT
25

This repository holds the code for the TokenGeeX Rust crate and Python package. TokenGeeX is a tokenizer for [CodeGeeX](https://github.com/THUDM/Codegeex2) aimed at code and Chinese. It is based on [UnigramLM (Taku Kudo 2018)](https://arxiv.org/abs/1804.1

published version 0.6.2, a year ago · 0 dependents · licensed under ISC
10

Port of Hugging Face's tokenizers using Expo Modules for React Native apps

published version 0.1.0, 5 months ago · 0 dependents · licensed under MIT
11