ahoq

1.0.2 • Public • Published

Aho Corasick JS

Aho Corasick pattern search algorithm that provides linear time search searching of bunch of patterns.

Time complexity: O(m) for preparation and O(n+r) for searching where

  • n - length of the searching text
  • m - the summ of length of all searching patterns
  • r - search result count

Requirements

  • NodeJS: 13.2+
  • NPM: 6.13+

Install

npm install ahoq

Docs

Package provides couple of classes:

  • AhoQ - main class
  • AhoQSearch - class returned from AhoQ::search public method and provides ability for advanced usage of searching algorithm described below.
  • AhoQSearchResult - search result item type
  • AhoQOptions - search options typeˇ

AhoQ


Main class used for pattern searching

constructor(patterns: string[])

Accepts the list of patterns as a string array and creates a AhoQ instance.

add(patterns: string[]): void

Accepts the list of patterns, adds new pattern list to the existing one and rebuild internal searching structure.

remove(patterns: string[]): void

Accepts the list of patterns, removes passed patterns from the existing lit and rebuild internal searching structure.

search(text: string, options: AhoQOptions): AhoQSearch

Main method to search the patterns in the text. It requires text to be passed as the first parameter, and options - optional parameter where you can set the search options.

It returns AhoQSearch instance that could be used to get the search results.

Example:

const ahoq = new AhoQ(['a', 'b', 'c']);
const ahoqSearch = ahoq.search('asdzxcqweb');
const results = [...ahoqSearch];

find(text: string): AhoQSearchResult[]

Another method to search the patterns in text. This method is much simpler. It accepts the text that will be searched, and it returns the array of search results.

Example:

const ahoq =  new  AhoQ(['a',  'b',  'c']);
const results = ahoq.find('asdzxcqweb');

AhoQSearch


Class for retrieving search results Constructing the object manually is not allowed. You can get the instance of AhoQSearch class by calling Ahoq::search method.

This class manage internal states of Aho Corasick data structure, and you can lazy search the patterns and manage the states and position in text which gives you a lot of flexibility and potential performance optimisation (depends on the tasks you want to perform)

Class implements iterator methods, so you can use for of and spread operator.

Notice: You can find couple of examples of using advanced searching in tests

getPos(): number

returns position in text searching is performing now

setPos(pos: number): void

accepts the number, that will be set as the new position in the text.

reset(): void

Resets the internal searching state, e.g. if the word stating to match, it will reset the current match.

reload(): void

Reloads the searching, e.g. resets the state and sets text position to 0.

next(): {value: AhoQSearchResult[], done: boolean}

Proceed the search and returns the next search results. This method used as an iterator next method.

AhoQOptions


Data class that represents search options

Contains one property: report: AhoQ.REPORT

There is 2 options in AhoQ.REPORT enum:

  • MATCH
  • STEP

MATCH used by default.

In case of MATCH report property used, the AhoQSearch::next method will return search results in case of any match found, and in case of STEP report value used, AhoQSearch::next will return on every character in the searching text.

AhoQSearchResult


Search result data class returned from AhoQ::find and AhoQSearch::next methods.

Class contain 2 properties:

  • pattern: String
  • index: number

representing the pattern found in the text (notice, pattern returned not as primitive type, but as a String object) and index in text where the pattern starts.

Example

import {AhoQ} from 'ahoq'

const ahoq = new AhoQ(['ab', 'aa', 'bc', 'abc', 'aaa', 'aab']);
let result = ahoq.find('aabbcaabc');

/**
 * result:
 * [
 *  {index:0, pattern: 'aa'},
 *  {index:0, pattern: 'aab'},
 *  {index:1, pattern: 'ab'},
 *  {index:3, pattern: 'bc'},
 *  {index:5, pattern: 'aa'},
 *  {index:5, pattern: 'aab'},
 *  {index:6, pattern: 'ab'},
 *  {index:6, pattern: 'abc'},
 *  {index:7, pattern: 'bc'}
 * ]
 */

ahoq.remove(['ab', 'bc', 'aaa', 'aa']);
const search = ahoq.search('asdasabcsdaabbbasd');

for (const it of search) {
	// it - contains next results
	// you can change position in the text, or reset the state
	// or whatever
}

Time complexity

  • O(n + r) - for searching, where n corresponds to the length of searching text, and r represents the number of found
  • O(m) - for preparing (building internal structures), where m represents the sum of all the pattern lengths used in search.

Internal structure rebuilds on constructing the AhoQ instance, and every time AhoQ::add and AhoQ::remove used.

License

MIT

Have fun! :)

Package Sidebar

Install

npm i ahoq

Weekly Downloads

5

Version

1.0.2

License

MIT

Unpacked Size

19.4 kB

Total Files

9

Last publish

Collaborators

  • immortale