@pdfnav/smart-text-search
1.0.10 • Public • Published

Smart Text Search

Search a body of text for a query and receive a relevance score, tuned by configurable parameters.

Usage

import { search, SearchOptions } from "@pdfnav/smart-text-search";
const query = "quick fox";
const body = "The quick brown fox jumps over the lazy dog";

// Receive a score
const searchScore = search({ query, body });

// Provide custom search options tailored to your use case
const options: SearchOptions = {
  exactMatchPoints: 70,
  exactContainingMatchPoints: 50,
  singleWordMatchPoints: 1,
  singleWordMatchLengthMultiplier: 1.5,
  uniqueSingleWordMatchPoints: 2,
  uniqueSingleWordMatchLengthMultiplier: 1.5,
  consecutiveWordMatchPoints: 5,
  consecutiveWordMatchLengthMultiplier: 2,
  consecutiveWordSequenceMatchPoints: 1,
  consecutiveWordSequenceLengthMultiplier: 1.5,
  isCaseSensitive: false,
  shouldMatchPunctuation: true,
  shouldMatchWhitespaceAndPunctuation: true,
  shouldCollapseWhitespace: true,
};
const customisedSearchScore = search({ query, body, options });
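
One common use of the score is ranking several candidate texts against a single query. The sketch below assumes that `search` returns a number and that a higher number means a better match; the README says only that it returns a score. `rankByScore`, `ScoreFn` and `wordOverlapScore` are hypothetical names, and the stand-in scorer exists only so the snippet runs without the package installed. It is not the package's algorithm.

```typescript
// Hypothetical helper: rank candidate bodies with any scoring function.
// In real use, the package's `search` could be passed as `score`, assuming it
// returns a number where higher means a better match.
type ScoreFn = (args: { query: string; body: string }) => number;

interface Ranked {
  body: string;
  score: number;
}

function rankByScore(query: string, bodies: string[], score: ScoreFn): Ranked[] {
  return bodies
    .map((body) => ({ body, score: score({ query, body }) }))
    .sort((a, b) => b.score - a.score); // highest score first
}

// Stand-in scorer (NOT the package's algorithm): counts how many query words
// also appear as words in the body, ignoring case.
const wordOverlapScore: ScoreFn = ({ query, body }) => {
  const bodyWords = new Set(body.toLowerCase().split(/\s+/));
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => bodyWords.has(word)).length;
};

const ranked = rankByScore(
  "quick fox",
  ["The lazy dog sleeps", "The quick brown fox jumps over the lazy dog", "A quick reply"],
  wordOverlapScore,
);
```

Taking the scorer as a parameter keeps the ranking logic independent of any particular option set: the same helper works with a default `search` call or with one that closes over custom options.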

Search options

  • exactMatchPoints (number): points awarded when query exactly matches the whole of body
  • exactContainingMatchPoints (number): points awarded when body contains an exact match of query
  • singleWordMatchPoints (number): points awarded when a single word in query matches a word in body
  • singleWordMatchLengthMultiplier (number): multiplier that makes longer single-word matches more significant
  • uniqueSingleWordMatchPoints (number): as singleWordMatchPoints, but each unique word is counted only once
  • uniqueSingleWordMatchLengthMultiplier (number): as singleWordMatchLengthMultiplier, but each unique word is counted only once
  • consecutiveWordMatchPoints (number): points awarded when two or more consecutive words in query match
  • consecutiveWordMatchLengthMultiplier (number): multiplier that makes each additional consecutive matching word more significant
  • consecutiveWordSequenceMatchPoints (number): points awarded when two or more consecutive characters in query match
  • consecutiveWordSequenceLengthMultiplier (number): multiplier that makes longer matching sequences more significant
  • isCaseSensitive (boolean): makes the search case-sensitive
  • shouldMatchPunctuation (boolean): controls whether punctuation counts as a match; when punctuation is not matched, all punctuation is removed from the string
  • shouldMatchWhitespaceAndPunctuation (boolean): makes whitespace sequences and punctuation contribute to the score
  • shouldCollapseWhitespace (boolean): replaces consecutive whitespace with a single space
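
The boolean options can be pictured as a normalisation pass applied to the text before scoring. The sketch below illustrates the behaviour described in the list; it is not the package's implementation. `normalize` and `NormalizeOptions` are hypothetical names, the ASCII punctuation set is an assumption, and so is the polarity of `shouldMatchPunctuation` (here, punctuation is stripped when it is `false`).

```typescript
// Illustrative only: a guess at the preprocessing the boolean options describe.
interface NormalizeOptions {
  isCaseSensitive: boolean;
  shouldMatchPunctuation: boolean; // assumption: false means punctuation is stripped
  shouldCollapseWhitespace: boolean;
}

function normalize(text: string, opts: NormalizeOptions): string {
  let out = text;
  if (!opts.isCaseSensitive) {
    out = out.toLowerCase();
  }
  if (!opts.shouldMatchPunctuation) {
    // Assumed definition of punctuation: the ASCII punctuation characters.
    out = out.replace(/[!"#$%&'()*+,\-.\/:;<=>?@\[\\\]^_`{|}~]/g, "");
  }
  if (opts.shouldCollapseWhitespace) {
    // Any run of whitespace (spaces, tabs, newlines) becomes one space.
    out = out.replace(/\s+/g, " ");
  }
  return out;
}

const normalized = normalize("The  quick,\tbrown   Fox!", {
  isCaseSensitive: false,
  shouldMatchPunctuation: false,
  shouldCollapseWhitespace: true,
});
// normalized === "the quick brown fox"
```

Applying the same normalisation to both `query` and `body` is what would let "Fox!" and "fox" count as the same word under these settings.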

Install

npm i @pdfnav/smart-text-search

Homepage

pdfnav.com

License

ISC

Unpacked Size

20.9 kB

Total Files

5

Collaborators

  • dcs77