Nitrogen Pumpkin Mulch

    group-similar
    TypeScript icon, indicating that this package has built-in type declarations

    1.0.3 • Public • Published

    Group Similar

    Discord Twitter Follow

    Group similar items together.

    Runtime complexity is O(N^2 * (M + α(N))), where N is the number of elements in items, M is the runtime complexity of the similarityFunction, and α(N) is the inverse Ackermann function (amortized constant time for all practical purposes).

    Space complexity is O(N).

    Getting started

    npm i group-similar
    

    Examples

    Group similar strings
    import { groupSimilar } from "group-similar";
    import { distance } from "fastest-levenshtein";
    
    function levenshteinSimilarityFunction(a: string, b: string): number {
      return a.length === 0 && b.length === 0
        ? 1
        : 1 - distance(a, b) / Math.max(a.length, b.length);
    }
    
    groupSimilar({
      items: ["cat", "bat", "kitten", "dog", "sitting"],
      mapper: (i) => i,
      similarityFunction: levenshteinSimilarityFunction,
      similarityThreshold: 0.5,
    });
    
    // [ [ 'cat', 'bat' ], [ 'kitten', 'sitting' ], [ 'dog' ] ]
    Group similar numbers
    import { groupSimilar } from "group-similar";
    
    function evenOddSimilarityFunction(a: number, b: number): number {
      return Number(a % 2 === b % 2);
    }
    
    groupSimilar({
      items: [1, 5, 10, 0, 2, 123],
      mapper: (i) => i,
      similarityFunction: evenOddSimilarityFunction,
      similarityThreshold: 1,
    });
    
    // [ [ 1, 5, 123 ], [ 10, 0, 2 ] ]
    Group similar objects
    import { groupSimilar } from "group-similar";
    import { distance } from "fastest-levenshtein";
    
    function nestedMapper(object: { a: { b: { value: string } } }): string {
      return object.a.b.value;
    }
    
    function levenshteinSimilarityFunction(a: string, b: string): number {
      return a.length === 0 && b.length === 0
        ? 1
        : 1 - distance(a, b) / Math.max(a.length, b.length);
    }
    
    groupSimilar({
      items: [
        { a: { b: { value: "sitting" } } },
        { a: { b: { value: "dog" } } },
        { a: { b: { value: "kitten" } } },
        { a: { b: { value: "bat" } } },
        { a: { b: { value: "cat" } } },
      ],
      mapper: nestedMapper,
      similarityFunction: levenshteinSimilarityFunction,
      similarityThreshold: 0.5,
    });
    
    // [
    //   [{ a: { b: { value: "sitting" } } }, { a: { b: { value: "kitten" } } }],
    //   [{ a: { b: { value: "dog" } } }],
    //   [{ a: { b: { value: "bat" } } }, { a: { b: { value: "cat" } } }],
    // ]

    Syntax

    groupSimilar(options);
    groupSimilar({ items, mapper, similarityFunction, similarityThreshold });

    Parameters

    Parameter Type Required Default Description
    options Object Yes none Arguments to pass to the function

    Options

    Property Type Required Default Description
    items T[] Yes none Array of items to group
    mapper (t: T) => K Yes none Function to apply to each element in items prior to measuring similarity
    similarityFunction (a: K, b: K) => number Yes none Function to measure similarity between mapped items
    similarityThreshold number Yes none Threshold at which items whose similarity value is greater than or equal to it are grouped together

    Return value

    The return value is a new nested array of type T[][] containing elements of items grouped by similarity. If there are no elements in items, an empty array will be returned.

    Benchmark

    Benchmark test results where N is the number of items being grouped, higher ops/sec is better.

    Library N=16 N=32 N=64 N=128 N=256 N=512 N=1024 N=2048
    group-similar 86867 17538 6067 1594 444 171 75 27
    set-clustering 28506 6258 1831 455 121 30 6 1

    Benchmark configuration details can be found here.

    Install

    npm i group-similar

    DownloadsWeekly Downloads

    1

    Version

    1.0.3

    License

    MIT

    Unpacked Size

    10 kB

    Total Files

    5

    Last publish

    Collaborators

    • peterthehan