Version 1.0.2

String randomness score generator

A lightweight, zero-dependency package that generates a randomness score for a string. The score helps identify whether a string is gibberish or word-like. Some applications include:

  • Identify if a user is typing something or just banging the keyboard
  • Determine if a string is an API key, access token, etc.
  • Check if a string is something randomly generated by a computer


The tool returns a randomness score for a string. You can tune the threshold for your use case, but as a rule of thumb, a score above 4 suggests that the input string is random.
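As a minimal sketch of how such a threshold might be applied, the helper below compares a score against a cutoff. The function name `looksRandom` and the sample scores are hypothetical and not part of the package; only the default cutoff of 4 comes from the guidance above.

```javascript
// Hypothetical helper (not part of the package's API).
// Flags a score as "random" when it exceeds the chosen threshold.
function looksRandom(score, threshold = 4) {
  return score > threshold;
}

// Made-up scores, used only to illustrate the comparison.
console.log(looksRandom(5.2));    // true
console.log(looksRandom(2.1));    // false
console.log(looksRandom(4.5, 5)); // false: a stricter, use-case-specific cutoff
```

Raising the threshold reduces false positives on unusual but real words, while lowering it catches more borderline gibberish.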

NPM package

  • Install the npm package: npm i randomness-score-generator
  • Import and use it in your code like this:
const Model = require('./Model');

// Remember to load the model before using it

const score = Model.score("helloWorld");

How does it work?


  • At its core, the model is a bigram model: it estimates the probability of the next character given the current character. (A higher-order n-gram model would likely give better results, but that is a work in progress.)
  • We parse a comprehensive list of English words to build a 2D table that stores how often each character follows the current character.
  • While generating this table, we also add a special <.> character at the start and end of each word, which captures how often words start and end with each character. The table is then row-normalized so that each row sums to 1, giving the probability of a character following the current character. These probabilities are used in the score calculation.
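The table construction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual generator (which is a Python notebook): the names `BOUNDARY` and `buildBigramModel` are hypothetical, a plain "." stands in for the special <.> character, only the letters a-z are kept, and the three-word list is a toy stand-in for the real word list.

```javascript
// Hypothetical sketch; names and cleaning rules are assumptions, not the package's API.
const BOUNDARY = '.'; // stands in for the special <.> start/end character

function buildBigramModel(words) {
  // counts: current character -> (next character -> number of occurrences)
  const counts = new Map();
  for (const raw of words) {
    // Assumption: keep only the letters a-z after lowercasing.
    const word = raw.toLowerCase().replace(/[^a-z]/g, '');
    if (word.length === 0) continue;
    const padded = BOUNDARY + word + BOUNDARY;
    for (let i = 0; i < padded.length - 1; i++) {
      const current = padded[i];
      const next = padded[i + 1];
      if (!counts.has(current)) counts.set(current, new Map());
      const row = counts.get(current);
      row.set(next, (row.get(next) || 0) + 1);
    }
  }

  // Row-normalize: divide each count by its row total so every row sums to 1.
  const probs = new Map();
  for (const [current, row] of counts) {
    let total = 0;
    for (const count of row.values()) total += count;
    const probRow = new Map();
    for (const [next, count] of row) probRow.set(next, count / total);
    probs.set(current, probRow);
  }
  return probs;
}

// Toy word list, for illustration only.
const model = buildBigramModel(['cat', 'car', 'dog']);
console.log(model.get('c').get('a')); // 1   ("c" is always followed by "a")
console.log(model.get('a').get('t')); // 0.5 ("a" is followed by "t" or "r")
```

Nested maps keep the sketch short; a fixed-size 2D array indexed by character code would be closer to the "2D table" wording and is likely faster for a full dictionary.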

Score Calculation

  • We first convert the word to lowercase and remove any extra characters.
  • Then, since we have a bigram model, we break the word into pairs of adjacent characters (including the special start and end <.> character).
  • Next, we take the log of the probability of each pair. (These probabilities are very small, so their logs are easier to work with and compare.)
  • We add up these log values for all the pairs in the word.
  • As this sum is a negative number, we negate it to get a positive value.
  • We divide this value by the number of characters in the word to get the final score.
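The steps above can be sketched as a single function. This is a hedged illustration rather than the package's implementation: `scoreWord`, `toyProbs`, and `floorProb` are hypothetical names, the natural log is assumed (the base is not stated), only a-z is kept during cleaning, and the small floor probability for unseen pairs is an assumption added so that the log is always defined.

```javascript
// Hypothetical sketch; not the package's actual code.
const BOUNDARY = '.'; // stands in for the special <.> start/end character

// Toy row-normalized table (current character -> next character -> probability),
// as if built from the words "cat", "car" and "dog". Illustration only.
const toyProbs = {
  '.': { c: 2 / 3, d: 1 / 3 },
  c: { a: 1 },
  a: { t: 0.5, r: 0.5 },
  t: { '.': 1 },
  r: { '.': 1 },
  d: { o: 1 },
  o: { g: 1 },
  g: { '.': 1 },
};

function scoreWord(word, probs, floorProb = 1e-6) {
  // Step 1: lowercase and strip extra characters (assumption: keep a-z only).
  const cleaned = word.toLowerCase().replace(/[^a-z]/g, '');
  if (cleaned.length === 0) return 0; // assumption: nothing to score

  // Step 2: pad with the boundary character and walk adjacent pairs.
  const padded = BOUNDARY + cleaned + BOUNDARY;
  let logSum = 0;
  for (let i = 0; i < padded.length - 1; i++) {
    const row = probs[padded[i]];
    // Assumption: pairs never seen in training get a small floor probability.
    const p = (row && row[padded[i + 1]]) || floorProb;
    // Steps 3 and 4: take the log of each pair's probability and accumulate.
    logSum += Math.log(p);
  }

  // Steps 5 and 6: negate the sum, then divide by the number of characters.
  return -logSum / cleaned.length;
}

console.log(scoreWord('cat', toyProbs)); // low score: every pair was seen in training
console.log(scoreWord('tca', toyProbs)); // much higher score: most pairs were never seen
```

Scores from this toy table are not comparable with the threshold of 4 mentioned earlier, since that guidance applies to the model shipped with the package.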


Contributing

  • Create a fork and clone it.
  • To contribute to the model generation part, navigate to the modelGenerator/ folder. This contains a Python notebook used for generating the model. Feel free to suggest improvements to the model.
  • To contribute to the npm package, go into the modelGenerator/ directory, which contains the source code for the npm package as well as the latest model being used for calculation.

Bug Reporting

Report your issues at https://github.com/Pranav2612000/string_randomness_score_generator/issues

Gotchas & Improvements

  • The model is trained on English words and may not work for other languages.
  • To reduce training complexity, the model is case-insensitive.
  • The current model is not very accurate for very short strings.
  • The dataset the model is built on does not have first-class support for numbers and some special characters, so scores for strings containing these can be inaccurate.
  • The dataset does not include common keyboard patterns like "qwerty", so the results may not be correct for strings of this category.
  • The current model is a bigram model. Deep learning could be used to replace it with a higher-order n-gram model for better results.


Author

  • Pranav Joglekar


This project is licensed under the terms of the MIT open source license. Please refer to LICENSE.md for the full terms.


npm i randomness-score-generator
