tcnt

TokenCount CLI

A Node.js command-line tool to count LLM tokens in text files. It can process local files or read from stdin.

Features

  • Counts tokens using OpenAI's tiktoken library by default.
  • Supports input from file path or stdin.
  • Basic framework for adding other tokenizers.
  • Handles text files only.

Installation

  1. Clone this repository (or ensure you have the tokencount.js and package.json files).
  2. Navigate to the project directory in your terminal.
  3. Install dependencies:
    npm install
  4. Make the script executable:
    chmod +x tokencount.js
  5. Link the package to make the tokencount command available globally:
    npm link

Usage

Count tokens in a file:

tokencount /path/to/your/file.txt

Count tokens from piped input:

cat /path/to/your/file.txt | tokencount
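
The two input modes above can be sketched as follows. This is a minimal illustration, not the code in tokencount.js: the function name readInput and its parameters are hypothetical, and it assumes the tool reads stdin to completion whenever no file path is given.

```javascript
import { readFile } from "node:fs/promises";

// Hypothetical helper: returns the text to tokenize.
// If a file path is given, read that file; otherwise drain stdin.
async function readInput(filePath, stdin = process.stdin) {
  if (filePath) {
    return readFile(filePath, "utf8");
  }
  const chunks = [];
  for await (const chunk of stdin) {
    chunks.push(Buffer.from(chunk));
  }
  // Decode once at the end so a multi-byte character split
  // across two chunks is not corrupted.
  return Buffer.concat(chunks).toString("utf8");
}
```

Collecting raw chunks and decoding once is a deliberate choice here: decoding each chunk separately could break characters that straddle a chunk boundary.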

Specify a tokenizer (default is openai-tiktoken):

tokencount --tokenizer openai-tiktoken /path/to/your/file.txt

openai-tiktoken is currently the only fully implemented tokenizer. gemini-text is accepted as a placeholder, but it falls back to openai-tiktoken and prints a warning, because on-device Gemini tokenization is not yet implemented.

Get help:

tokencount --help

Supported Tokenizers

  • openai-tiktoken: Uses the gpt2 encoding from OpenAI's tiktoken library.
  • gemini-text (Placeholder): Currently falls back to openai-tiktoken. On-device support for Gemini tokenization is a future consideration pending available libraries.
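
The fallback behavior described above might look roughly like this. It is a sketch under assumptions: resolveTokenizer, the two sets, and the warning text are hypothetical names, and rejecting unknown names with an error is an assumption that this README does not confirm.

```javascript
const DEFAULT_TOKENIZER = "openai-tiktoken";
const SUPPORTED = new Set(["openai-tiktoken"]);
const PLACEHOLDERS = new Set(["gemini-text"]);

// Hypothetical helper: maps the requested tokenizer name to the one
// that will actually be used, warning when a placeholder is requested.
function resolveTokenizer(requested = DEFAULT_TOKENIZER, warn = console.warn) {
  if (SUPPORTED.has(requested)) {
    return requested;
  }
  if (PLACEHOLDERS.has(requested)) {
    warn(`Warning: ${requested} is not implemented; using ${DEFAULT_TOKENIZER} instead.`);
    return DEFAULT_TOKENIZER;
  }
  // Assumption: unknown names are rejected rather than silently accepted.
  throw new Error(`Unknown tokenizer: ${requested}`);
}
```

Because of the fallback, counts reported for gemini-text are openai-tiktoken counts and should not be read as Gemini token counts.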

Limitations

  • Text Files Only: This tool is designed for text files. Attempting to process binary files will result in an error or incorrect counts.
  • On-Device Tokenization for Gemini: True on-device tokenization for Gemini models is not yet implemented.
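
One way to guard against binary input is to look for NUL bytes near the start of the data, a common heuristic that is not necessarily what this tool does. The function name looksBinary and the 8000-byte sample size are assumptions made for illustration.

```javascript
// Hypothetical check: treat the input as binary if a NUL byte
// appears within the first `sampleSize` bytes.
function looksBinary(buffer, sampleSize = 8000) {
  const sample = buffer.subarray(0, sampleSize);
  return sample.includes(0);
}

// Example usage with in-memory data:
console.log(looksBinary(Buffer.from("plain text")));
console.log(looksBinary(Buffer.from([0x89, 0x50, 0x00, 0x0d])));
```

The heuristic is cheap but imperfect: UTF-16 text contains NUL bytes and would be flagged, while some binary formats may pass.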

Development

To contribute or modify:

  • The project uses ES Module syntax (import/export).
  • The main script is tokencount.js.
  • Tokenizer logic is handled within the .action(...) callback in tokencount.js.
  • To add a new tokenizer, you would typically:
    1. Install any necessary Node.js package for that tokenizer.
    2. Import necessary functions from the package using import.
    3. Add a new else if condition for your tokenizer's name in tokencount.js.
    4. Implement the token counting logic within that block.
    5. Update this README and the help messages.
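
As a rough illustration of steps 3 and 4, the sketch below adds a branch for a made-up tokenizer named whitespace. The function countTokens, the whitespace tokenizer, and the countWithTiktoken callback are all hypothetical; in tokencount.js the equivalent branches would sit inside the .action(...) callback and call the real tiktoken-based logic.

```javascript
// Hypothetical dispatch. `countWithTiktoken` stands in for the
// existing tiktoken-based counting so the sketch stays self-contained.
function countTokens(text, tokenizer, countWithTiktoken) {
  if (tokenizer === "openai-tiktoken") {
    return countWithTiktoken(text);
  } else if (tokenizer === "whitespace") {
    // New branch: count runs of non-whitespace characters.
    const trimmed = text.trim();
    return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  } else {
    throw new Error(`Unknown tokenizer: ${tokenizer}`);
  }
}
```

If the list of tokenizers grows, a lookup table keyed by tokenizer name may be easier to maintain than a long else if chain.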

Install

npm i tcnt

Version

1.0.0

License

ISC

Unpacked Size

6.23 kB

Total Files

3

Collaborators

  • kinlan