A Node.js command-line tool to count LLM tokens in text files. It can process local files or read from stdin.
- Counts tokens using OpenAI's `tiktoken` library by default.
- Supports input from a file path or stdin.
- Basic framework for adding other tokenizers.
- Handles text files only.
- Clone this repository (or ensure you have the `tokencount.js` and `package.json` files).
- Navigate to the project directory in your terminal.
- Install dependencies:

  ```
  npm install
  ```

- Make the script executable:

  ```
  chmod +x tokencount.js
  ```

- Link the package to make the `tokencount` command available globally:

  ```
  npm link
  ```
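The `npm link` step relies on `package.json` declaring a `bin` entry that maps the `tokencount` command to the script. A minimal sketch of such a file, assuming the project depends on the `tiktoken` and `commander` packages (the actual `package.json` may differ):

```json
{
  "name": "tokencount",
  "version": "1.0.0",
  "type": "module",
  "bin": {
    "tokencount": "./tokencount.js"
  },
  "dependencies": {
    "commander": "^12.0.0",
    "tiktoken": "^1.0.0"
  }
}
```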
Count tokens in a file:

```
tokencount /path/to/your/file.txt
```

Count tokens from piped input:

```
cat /path/to/your/file.txt | tokencount
```
Specify a tokenizer (default is `openai-tiktoken`):

```
tokencount --tokenizer openai-tiktoken /path/to/your/file.txt
```
Currently, `openai-tiktoken` is the primary supported tokenizer. A placeholder for `gemini-text` exists but will use `openai-tiktoken` as a fallback with a warning, as on-device Gemini tokenization is not yet implemented.
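The fallback behavior described above can be sketched as follows; `resolveTokenizer` is a hypothetical helper for illustration, not code from `tokencount.js`:

```javascript
// Hypothetical helper illustrating the gemini-text fallback described above;
// not actual code from tokencount.js.
function resolveTokenizer(name) {
  if (name === "openai-tiktoken") return name;
  if (name === "gemini-text") {
    // On-device Gemini tokenization is not yet implemented, so warn and fall back.
    console.warn("Warning: gemini-text is not yet implemented; falling back to openai-tiktoken.");
    return "openai-tiktoken";
  }
  throw new Error(`Unknown tokenizer: ${name}`);
}

console.log(resolveTokenizer("gemini-text")); // warns, then prints "openai-tiktoken"
```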
Get help:

```
tokencount --help
```
- `openai-tiktoken`: Uses the `gpt2` encoding from OpenAI's `tiktoken` library.
- `gemini-text` (placeholder): Currently falls back to `openai-tiktoken`. On-device support for Gemini tokenization is a future consideration pending available libraries.
- Text Files Only: This tool is designed for text files. Attempting to process binary files will result in an error or incorrect counts.
- On-Device Tokenization for Gemini: True on-device tokenization for Gemini models is not yet implemented.
To contribute or modify:
- The project uses ES Module syntax (`import`/`export`).
- The main script is `tokencount.js`.
- Tokenizer logic is handled within the `.action(...)` callback in `tokencount.js`.
- To add a new tokenizer, you would typically:
  - Install any necessary Node.js package for that tokenizer.
  - Import the necessary functions from the package using `import`.
  - Add a new `else if` condition for your tokenizer's name in `tokencount.js`.
  - Implement the token counting logic within that block.
  - Update this README and the help messages.
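The steps above might come together as in this minimal sketch; `countWithTiktoken` and `countWithMyTokenizer` are hypothetical stand-ins (plain whitespace counts here), not functions from `tokencount.js`:

```javascript
// Hypothetical stand-ins for real tokenizer implementations; a whitespace
// split is used purely so the sketch is self-contained.
const countWithTiktoken = (text) => text.split(/\s+/).filter(Boolean).length;
const countWithMyTokenizer = (text) => text.split(/\s+/).filter(Boolean).length;

// Dispatch pattern mirroring the `else if` structure described above.
function countTokens(text, tokenizer) {
  if (tokenizer === "openai-tiktoken") {
    return countWithTiktoken(text);
  } else if (tokenizer === "my-tokenizer") {   // new branch for the added tokenizer
    return countWithMyTokenizer(text);
  }
  throw new Error(`Unknown tokenizer: ${tokenizer}`);
}

console.log(countTokens("one two three", "my-tokenizer")); // → 3
```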