    snowball-stem
    1.1.1 • Public • Published

    snowball

    Snowball stemmers for Deno. These stemmers are based on the compiled JavaScript stemmers from version 2.2.0 of the Snowball project.

    Usage

    EnglishStemmer

    Returns the stem of the given word. The stemmer assumes that the input is already lowercase.

    import { assertStrictEquals } from "https://deno.land/std@0.126.0/testing/asserts.ts";
    import { EnglishStemmer } from "https://deno.land/x/snowball/english_stemmer.ts";
    
    const englishStemmer = new EnglishStemmer();
    
    const stem = englishStemmer.stem("enthusiastically");
    
    assertStrictEquals(stem, "enthusiast");
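
    Because the stemmer assumes lowercase input, mixed-case text should be normalized first. The following is a minimal sketch of that step, not part of the package: stemNormalized is a hypothetical helper, the Stemmer interface assumes only the stem() method shown above, and toyStemmer is a stand-in for EnglishStemmer so that the example runs without network access.

```typescript
// Minimal structural type; only the stem() method shown above is assumed.
interface Stemmer {
  stem(word: string): string;
}

// Hypothetical helper: trims and lowercases a word before stemming it.
function stemNormalized(stemmer: Stemmer, word: string): string {
  return stemmer.stem(word.trim().toLowerCase());
}

// Toy stand-in for EnglishStemmer (an assumption for illustration only):
// it strips a trailing lowercase "ly" and leaves everything else alone.
const toyStemmer: Stemmer = {
  stem: (word) => (word.endsWith("ly") ? word.slice(0, -2) : word),
};

console.log(stemNormalized(toyStemmer, "QUICKLY")); // "quick"
console.log(toyStemmer.stem("QUICKLY")); // "QUICKLY" (suffix rule missed the uppercase input)
```

    In real use, an EnglishStemmer instance can be passed in place of toyStemmer, since it provides the same stem() method.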

    Here is an example with multiple words.

    import { assertStrictEquals } from "https://deno.land/std@0.126.0/testing/asserts.ts";
    import { EnglishStemmer } from "https://deno.land/x/snowball/english_stemmer.ts";
    
    const sentence = "the quick brown fox jumped over the lazy dog";
    
    const englishStemmer = new EnglishStemmer();
    
    // match() returns null when nothing matches, so fall back to an empty list
    const stemmedSentence = (sentence.match(/\b\w\w+\b/gu) ?? []) // tokens of two or more word characters
      .map((token) => englishStemmer.stem(token))
      .join(" ");
    
    assertStrictEquals(
      stemmedSentence,
      "the quick brown fox jump over the lazi dog",
    );

    RussianStemmer

    Many languages are supported. Here is an example in Russian.

    import { assertStrictEquals } from "https://deno.land/std@0.126.0/testing/asserts.ts";
    import { RussianStemmer } from "https://deno.land/x/snowball/russian_stemmer.ts";
    
    const sentence = "обязательно выпейте свой овалтин";
    
    const russianStemmer = new RussianStemmer();
    
    const stemmedSentence = sentence
      .split(/\s+/u)
      .map((token) => russianStemmer.stem(token))
      .join(" ");
    
    assertStrictEquals(
      stemmedSentence,
      "обязательн вып сво овалтин",
    );

    LanguageStemmers

    mod.ts defines a LanguageStemmers object that contains every available language and its stemmer. Index it by language name to get the stemmer class.

    import { assertStrictEquals } from "https://deno.land/std@0.126.0/testing/asserts.ts";
    import { LanguageStemmers } from "https://deno.land/x/snowball/mod.ts";
    
    const spanishStemmer = new LanguageStemmers["Spanish"]();
    
    assertStrictEquals(
      spanishStemmer.stem("gracias"),
      "graci",
    );
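
    When the language name is only known at runtime (for example, from user input), the lookup can be guarded. The sketch below is an illustration under assumptions: createStemmer is a hypothetical helper, and toyRegistry with ToySpanishStemmer stands in for LanguageStemmers so that the example is self-contained. It assumes only what the example above shows, namely that the object maps a language name to a class with a stem() method.

```typescript
interface Stemmer {
  stem(word: string): string;
}

// A class that can be constructed without arguments, as in
// `new LanguageStemmers["Spanish"]()` above.
type StemmerConstructor = new () => Stemmer;

// Hypothetical helper: returns a stemmer for the language,
// or undefined if the registry has no entry under that name.
function createStemmer(
  registry: Record<string, StemmerConstructor>,
  language: string,
): Stemmer | undefined {
  const known = Object.prototype.hasOwnProperty.call(registry, language);
  return known ? new registry[language]() : undefined;
}

// Toy stand-in for a real stemmer class (illustration only).
class ToySpanishStemmer implements Stemmer {
  stem(word: string): string {
    return word.replace(/as$/u, "");
  }
}

const toyRegistry: Record<string, StemmerConstructor> = {
  Spanish: ToySpanishStemmer,
};

console.log(createStemmer(toyRegistry, "Spanish")?.stem("gracias")); // "graci"
console.log(createStemmer(toyRegistry, "Klingon")); // undefined
```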

    stopWords

    Stop words generally carry little or no information. Some languages come with common stop words taken from the Snowball website. These can also be imported individually from the file containing the stop words for that language. Tokenize and stem the stop words the same way you tokenize and stem your input text; otherwise the stemmed input tokens may not match them.

    import { assert } from "https://deno.land/std@0.126.0/testing/asserts.ts";
    import { EnglishStemmer } from "https://deno.land/x/snowball/english_stemmer.ts";
    
    const englishStemmer = new EnglishStemmer();
    
    assert(englishStemmer.stopWords.has("been"));
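
    The advice to stem the stop words in the same way as the input can be sketched as follows. This is an illustration under assumptions rather than the package's own code: stemWithoutStopWords and tokenize are hypothetical helpers, stopWords is assumed to be a Set of strings (the example above only shows that it has a has() method), and the toy stemmer stands in for EnglishStemmer. For brevity the stop words are stemmed but not re-tokenized.

```typescript
interface StemmerWithStopWords {
  stem(word: string): string;
  stopWords: Set<string>; // assumption: a Set of lowercase words
}

// Lowercase the text and keep tokens of two or more word characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/\b\w\w+\b/gu) ?? [];
}

// Hypothetical helper: stems the input, then drops every stem that
// equals a stemmed stop word.
function stemWithoutStopWords(
  stemmer: StemmerWithStopWords,
  text: string,
): string[] {
  const stemmedStopWords = new Set(
    Array.from(stemmer.stopWords, (word) => stemmer.stem(word)),
  );
  return tokenize(text)
    .map((token) => stemmer.stem(token))
    .filter((stem) => !stemmedStopWords.has(stem));
}

// Toy stand-in for a real stemmer: strips a trailing "ed" or "s".
const toy: StemmerWithStopWords = {
  stem: (word) => word.replace(/(ed|s)$/u, ""),
  stopWords: new Set(["the", "has", "been"]),
};

console.log(stemWithoutStopWords(toy, "The dog has been jumped"));
// [ "dog", "jump" ]
```

    With this toy stemmer, "has" stems to "ha", so comparing stems against the unstemmed stop word list would let it slip through. Stemming the list first avoids that mismatch.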

    Supported Languages

    Unless otherwise noted, each language has a single stemmer named LanguageStemmer, which is exported from both language_stemmer.ts and mod.ts. Replace Language (or language) with the name of the desired language, keeping the capitalization shown in each position.

    1. Arabic
    2. Armenian
    3. Basque
    4. Catalan
    5. Danish
    6. Dutch
      1. DutchStemmer
      2. KraaijPohlmannStemmer
    7. English
      1. EnglishStemmer - Porter 2 or snowball algorithm
      2. PorterStemmer - Porter 1 stemmer
      3. LovinsStemmer - The first published stemming algorithm
    8. Finnish
    9. French
    10. German
      1. GermanStemmer
      2. German2Stemmer
    11. Greek
    12. Hindi
    13. Hungarian
    14. Indonesian
    15. Irish
    16. Italian
    17. Lithuanian
    18. Nepali
    19. Norwegian
    20. Portuguese
    21. Romanian
    22. Russian
    23. Serbian
    24. Spanish
    25. Swedish
    26. Tamil
    27. Turkish
    28. Yiddish
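
    The default naming rule described above can be written out as a small function. This is only a sketch of the convention: defaultStemmerNames is a hypothetical helper, and it covers the single-stemmer case only, not the additional stemmers listed under Dutch, English, and German.

```typescript
// Hypothetical helper: derives the default class name and module file
// for a language from the list above, e.g. "Russian".
function defaultStemmerNames(
  language: string,
): { className: string; moduleFile: string } {
  return {
    className: `${language}Stemmer`, // capitalized, e.g. RussianStemmer
    moduleFile: `${language.toLowerCase()}_stemmer.ts`, // lowercase file name
  };
}

console.log(defaultStemmerNames("Russian"));
// { className: "RussianStemmer", moduleFile: "russian_stemmer.ts" }
```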

    Install

    npm i snowball-stem

    License

    BSD 3-Clause

    Unpacked Size

    1.91 MB

    Total Files

    107

    Collaborators

    • skookumchoocher