

    5.0.0 • Public • Published

    trigrams


    Trigrams for 400+ languages.

    Install

    This package is ESM only: Node 12+ is required, and it must be loaded with import rather than require.

    npm:

    npm install trigrams

    API

    This package exports the following identifiers: top, min. There is no default export.

    top()

    import {top} from 'trigrams'
    
    console.log((await top()).pam)

    Yields:

    {
      'isa': 6,
      'upa': 6,
      'i k': 6,
      // …
      'ang': 273,
      'ing': 282,
      'ng ': 572
    }

    Returns a promise resolving to an object that maps UDHR in Unicode codes (such as pam) to objects, each mapping that language's top 300 trigrams to their occurrence counts.
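As a minimal sketch of working with data in this shape, the snippet below picks the most frequent trigrams for one code. The helper name mostFrequent is hypothetical and not part of this package, and the sample object contains only the entries shown in the output above, so it runs without installing anything.

```javascript
// Sample shaped like the value `top()` resolves to, limited to the
// `pam` entries listed above (the real object holds up to 300 per code).
const sample = {
  pam: {isa: 6, upa: 6, 'i k': 6, ang: 273, ing: 282, 'ng ': 572}
}

// Hypothetical helper (not exported by `trigrams`): return the `count`
// most frequent trigrams for `code`, or an empty array for unknown codes.
function mostFrequent(data, code, count) {
  const counts = data[code]
  if (!counts) return []
  return Object.entries(counts)
    .sort((a, b) => b[1] - a[1]) // highest occurrence count first
    .slice(0, count)
    .map(([trigram]) => trigram)
}

console.log(mostFrequent(sample, 'pam', 2)) // [ 'ng ', 'ing' ]
```

With the real package, the same helper could presumably be given the result of await top() in place of sample.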

    min()

    import {min} from 'trigrams'
    
    console.log((await min()).nld)

    Yields:

    [
      ' ar',
      'eer',
      'tij',
      // …
      'de ',
      'an ',
      'en '
    ]

    Like top, but returns a promise resolving to an object that maps UDHR in Unicode codes to arrays containing the top 300 trigrams, sorted from least occurring to most occurring.
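The two shapes appear to be related: sorting a count object from top by ascending count gives an array in the shape min returns. The sketch below assumes that relationship, which the text above implies but does not state. The helper name toSortedTrigrams is hypothetical, and the sample counts are the pam entries shown earlier.

```javascript
// Counts shaped like one language entry from `top()` (sample values only).
const pamCounts = {isa: 6, upa: 6, 'i k': 6, ang: 273, ing: 282, 'ng ': 572}

// Hypothetical helper (not exported by `trigrams`): order trigrams from
// least occurring to most occurring, dropping the counts.
// Assumption: ties keep their original order (Array#sort is stable).
function toSortedTrigrams(counts) {
  return Object.entries(counts)
    .sort((a, b) => a[1] - b[1]) // lowest occurrence count first
    .map(([trigram]) => trigram)
}

const sorted = toSortedTrigrams(pamCounts)
console.log(sorted[sorted.length - 1]) // 'ng '
```

The array form is smaller and is enough when only the rank of a trigram matters, not its exact count.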

    Data

    The trigrams are based on the Unicode versions of the Universal Declaration of Human Rights (UDHR).

    The files are created from all paragraphs made available by wooorm/udhr and do not include headings and such.

    Cleaning

    Before creating trigrams:

    • Unicode characters from \u0021 to \u0040 (inclusive) are removed
    • Runs of one or more white space characters (\s+) are replaced with a single space
    • Uppercase letters ([A-Z]) are lowercased

    Additionally, the input is padded with two spaces on both sides.
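The cleaning steps above can be sketched as follows. This is an illustration of the description, not the code that generated the data files: the function names clean and countTrigrams are hypothetical, the steps are applied in the order listed, and trigrams are taken as windows of three UTF-16 code units.

```javascript
// Hypothetical helpers for illustration; neither is exported by `trigrams`.

// Apply the three cleaning steps in the order listed, then pad.
function clean(value) {
  const cleaned = value
    .replace(/[\u0021-\u0040]/g, '') // remove `!` through `@` (punctuation, digits)
    .replace(/\s+/g, ' ') // collapse runs of white space to one space
    .replace(/[A-Z]/g, (letter) => letter.toLowerCase()) // lowercase A-Z only
  return '  ' + cleaned + '  ' // pad with two spaces on both sides
}

// Count every window of three code units in the cleaned, padded text.
function countTrigrams(value) {
  const text = clean(value)
  const counts = {}
  for (let index = 0; index + 3 <= text.length; index++) {
    const trigram = text.slice(index, index + 3)
    counts[trigram] = (counts[trigram] || 0) + 1
  }
  return counts
}

console.log(JSON.stringify(clean('Hello,  World!'))) // "  hello world  "
console.log(countTrigrams('banana').ana) // 2
```

The padding means that letters at the start and end of the text also take part in trigrams such as '  b' and 'a  '.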

    Support

    Code Name OHCHR
    007 Sãotomense 1128
    008 Crioulo, Upper Guinea (008) No
    009 Mbundu (009) No
    010 Tetun Dili No
    011 Umbundu (011) No
    012 (Bizisa) bz1
    013 (Mijisa) bz2
    014 (Maiunan) ma1
    016 (Minjiang, spoken) mi1_spok
    017 (Minjiang, written) mi1_written
    020 Drung ty1
    026 (Yeonbyeon) ye1
    aar Afar aar
    abk Abkhaz abk
    ace Aceh atj
    acu Achuar-Shiwiar acu
    acu_1 Achuar-Shiwiar (1) jiv
    ada Dangme gac1
    ady Adyghe ady
    afr Afrikaans afk
    agr Aguaruna agr
    aii Assyrian Neo-Aramaic aii
    ajg Aja ajg
    aka_akuapem Twi (Akuapem) tws1
    aka_asante Twi (Asante) ass
    aka_fante Fante tws3
    als Albanian, Tosk aln
    alt Altai, Southern alt
    amc Amahuaca amc
    ame Yaneshaʼ ame
    amh Amharic amh
    ami Amis ami
    amr Amarakaeri amr
    arb Arabic, Standard arz
    arl Arabela arl
    arn Mapudungun aru
    ast Asturian aub
    auc Waorani 1127
    auv Occitan (Auvergnat) auv1
    ayr Aymara, Central aym
    azj_cyrl Azerbaijani, North (Cyrillic) azb1
    azj_latn Azerbaijani, North (Latin) azb
    bam Bamanankan bra
    ban Bali bzc
    bax Bamun bax
    bba Baatonum bba
    bci Baoulé bci
    bcl Bicolano, Central bkl
    bel Belarusan ruw
    bem Bemba bem
    ben Bengali bng
    bfa Bari bfa
    bho Bhojpuri bhj
    bin Edo edo
    bis Bislama bcy
    blt Tai Dam No
    blu Hmong Njua blu
    boa Bora boa
    bod Tibetan, Central tic
    bos_cyrl Bosnian (Cyrillic) src4
    bos_latn Bosnian (Latin) src1
    bre Breton brt
    btb Bulu btb
    buc Bushi buc
    bug Bugis bpr
    bul Bulgarian blg
    cab Garifuna cab
    cak Kaqchikel, Central cak1
    cat Catalan-Valencian-Balear cln
    cbi Chachi 1122
    cbr Cashibo-Cacataibo cbr
    cbs Cashinahua cbs
    cbt Chayahuita cbt
    cbu Candoshi-Shapra cbu
    ccx Zhuang, Yongbei ccx
    ceb Cebuano ceb
    ces Czech czc
    cha Chamorro cjd
    chj Chinantec, Ojitlán chj
    chk Chuukese tru1
    chr_cased Cherokee (cased) No
    chr_uppercase Cherokee (uppercase) No
    cic Chickasaw cic
    cjk Chokwe cjk
    cjk_AO Chokwe (Angola) cjk
    cjs Shor cjs
    ckb Kurdish, Central kdb1
    cnh Chin, Haka hak
    cni Asháninka cni
    cof Colorado cof
    cos Corsican coi
    cot Caquinte cot
    cpu Ashéninka, Pichis cpu
    crh Crimean Tatar crh
    crs Seselwa Creole French crs
    csa Chinantec, Chiltepec csa
    csw Cree, Swampy crm
    ctd Chin, Tedim tid
    cym Welsh wls
    dag Dagbani dag
    dan Danish dns
    ddn Dendi den
    deu_1901 German, Standard (1901) ger
    deu_1996 German, Standard (1996) No
    dga Dagaare, Southern dga
    dip Dinka, Northeastern dinka
    div Maldivian div
    dyo Jola-Fonyi dyo
    dyu Jula dyu
    dzo Dzongkha dzo
    ell_monotonic Greek (monotonic) grk
    ell_polytonic Greek (polytonic) No
    emk Maninkakan, Eastern mni
    eml Romagnolo eml
    eng English eng
    epo Esperanto 1115
    ese Ese Ejja ese
    est Estonian est
    eus Basque bsq
    eve Even eve
    evn Evenki evn
    ewe Éwé ewe
    fao Faroese fae
    fij Fijian fji
    fin Finnish fin
    fkv Finnish, Kven fkv
    flm Chin, Falam fal
    fon Fon foa
    fra French frn
    fri Frisian, Western fri
    fuf Pular fuf
    fur Friulian frl
    fuv Fulfulde, Nigerian fum
    fuv2 Fulfulde, Nigerian (2) fuv
    gaa Ga gac2
    gag Gagauz gag
    gax Oromo, Borana-Arsi-Guji gax
    gjn Gonja dum
    gkp Kpelle, Guinea pke
    gla Gaelic, Scottish gls
    gld Nanai gld
    gle Gaelic, Irish gli1
    glg Galician gln
    glv Manx No
    gsw1 Alemannisch (Elsassisch) gsw
    guc Wayuu guc
    gug Guaraní, Paraguayan gun
    guj Gujarati gjr
    guu Yanomamö guu
    gyr Guarayu gua
    hat_kreyol Haitian Creole French (Kreyol) hat
    hat_popular Haitian Creole French (Popular) hat1
    hau_NE Hausa (Niger) gej
    hau_NG Hausa (Nigeria) gej
    haw Hawaiian hwi
    hea Hmong, Northern Qiandong hea
    heb Hebrew hbr
    hil Hiligaynon hil
    hin Hindi hnd
    hlt Chin, Matu hlt
    hms Hmong, Southern Qiandong hms
    hna Mina hna
    hni Hani hni
    hns Hindustani, Sarnami hns
    hrv Croatian src2
    hsb Sorbian, Upper wee
    hsf Huastec (Sierra de Otontepec) hus
    hun Hungarian hng
    hus Huastec (Veracruz) 1118
    huu Huitoto, Murui huu
    hva Huastec (San Luís Potosí) hva
    hye Armenian arm
    ibb Ibibio ibb
    ibo Igbo igr
    ido Ido 1120
    ike Inuktitut, Eastern Canadian esb
    ilo Ilocano ilo
    ina Interlingua 1119
    ind Indonesian inz
    isl Icelandic ice
    ita Italian itn
    jav Javanese (Latin) jan
    jav_java Javanese (Javanese) No
    jiv Shuar 1125
    jpn Japanese jpn
    kal Inuktitut, Greenlandic esg
    kan Kannada kjv
    kat Georgian geo
    kaz Kazakh kaz
    kbd Kabardian kbd
    kbp Kabiyé kbp
    kde Makonde kde
    kdh Tem kdh
    kea Kabuverdianu kea
    kek Q'eqchi' 1116
    kha Khasi kha
    khk Mongolian, Halh (Cyrillic) khk
    khm Khmer, Central khm
    kin Rwanda rua1
    kir Kirghiz kdo
    kjh Khakas kjh
    kkh_lana Khün No
    kmb Mbundu mlo
    kmr Kurdish, Northern kur
    knc Kanuri, Central kph
    kng Koongo kon
    kng_AO Koongo (Angola) kng
    koi Komi-Permyak koi
    koo Konjo koo1
    kor Korean kkn
    kqn Kaonde kqn
    kqs Kissi, Northern kqs
    kri Krio kri
    krl Karelian krl
    ktu Kituba ktu
    kwi Awa-Cuaiquer kwi
    lad Ladino lad
    lao Lao nol
    lat Latin ltn
    lat_1 Latin (1) ltn1
    lav Latvian lat
    lia Limba, West-Central lia
    lij Ligurian lij
    lin Lingala lin
    lin_tones Lingala (tones) No
    lit Lithuanian lit
    lld Ladin lld
    lnc Occitan (Languedocien) prv1
    lns Lamnso' nso
    lob Lobi lob
    lot Otuho lot
    loz Lozi lbm1
    ltz Luxembourgeois lux
    lua Luba-Kasai lub
    lue Luvale lue
    lug Ganda lap1
    lun Lunda mlo1
    lus Mizo lus
    mad Madura mhj
    mag Magahi mqm
    mah Marshallese mzm
    mai Maithili No
    mal Malayalam mjs
    mal_chillus Malayalam mjs
    mam Mam, Northern mam
    mar Marathi mrt
    maz Mazahua Central maz
    mcd Sharanahua mcd
    mcf Matsés mcf
    men Mende mfy
    mfq Moba mfq
    mic Micmac mic
    min Minangkabau mpu
    miq Mískito miq
    mkd Macedonian mkj
    mlt Maltese mls
    mly_arab Malay (Arabic) No
    mly_latn Malay (Latin) mli
    mnw Mon No
    mos Mòoré mhm
    mri Maori mbf
    mto Mixe, Totontepec mto
    mxi Mozarabic moz
    mxv Mixtec, Metlatónoc mxv
    mya Burmese bms
    mzi Mazatec, Ixcatlán mao
    nav Navajo nav
    nba Nyemba nba
    nbl Ndebele nel
    ndo Ndonga 1114
    nds Saxon, Low ige
    nep Nepali nep
    nhn Nahuatl, Central nhn
    nio Nganasan nio
    niu Niue niu
    njo Naga, Ao njo
    nku Kulango, Bouna kou
    nld Dutch dut
    nno Norwegian, Nynorsk nrn
    nob Norwegian, Bokmål nrr
    not Nomatsiguenga not
    nso Sotho, Northern srt
    nya_chechewa Nyanja (Chechewa) nyj1
    nya_chinyanja Nyanja (Chinyanja) nyj
    nym Nyamwezi nyz
    nyn Nyankore nyn1
    nzi Nzema nze
    oaa Orok oaa
    oci_1 Occitan (Francoprovençal, Fribourg) Fr3
    oci_2 Occitan (Francoprovençal, Savoie) fr2
    oci_3 Occitan (Francoprovençal, Vaud) fr4
    oci_4 Occitan (Francoprovençal, Valais) frp
    ojb Ojibwa, Northwestern ojb
    oki Okiek oki
    orh Oroqen orh
    oss Osetin ose
    ote Otomi, Mezquital 1111
    pam Pampangan pmp
    pan Panjabi, Eastern pnj1
    pap Papiamentu pap
    pau Palauan plu
    pbb Páez pbb
    pbu Pashto, Northern pbu
    pcd Picard frn2
    pcm Pidgin, Nigerian pcm
    pes_1 Farsi, Western prs
    pes_2 Dari prs1
    pis Pijin pis
    piu Pintupi-Luritja piu
    plt Malagasy, Plateau mex
    pnb Panjabi, Western No
    pol Polish pql
    pon Pohnpeian pnf
    por_BR Portuguese (Brazil) No
    por_PT Portuguese (Portugal) por
    pov Crioulo, Upper Guinea gbc
    ppl Pipil ppl
    prv Occitan pro
    quc K'iche', Central 1117
    qud Quechua (Unified Quichua, old Hispanic orthography) qud1
    qug Quichua, Chimborazo Highland qug
    quy Quechua, Ayacucho quy
    quz Quechua, Cusco quz
    qva Quechua, Ambo-Pasco qeg
    qvc Quechua, Cajamarca qnt
    qvh Quechua, Huamalíes-Dos de Mayo Huánuco qej
    qvm Quechua, Margos-Yarowilca-Lauricocha qei
    qvn Quechua, North Junín qju
    qwh Quechua, Huaylas Ancash qan
    qxa Quechua, South Bolivian qec1
    qxn Quechua, Northern Conchucos Ancash qed
    qxu Quechua, Arequipa-La Unión qar
    rar Rarotongan rrt
    rmn Romani, Balkan rmn
    rmn_1 Romani, Balkan (1) rmn1
    rmy Aromanian rmy1
    roh Romansch No
    roh_puter Romansch (Puter) No
    roh_rumgr Romansch (Grischun) rhe
    roh_surmiran Romansch (Surmiran) No
    roh_sursilv Romansch (Sursilvan) No
    roh_sutsilv Romansch (Sutsilvan) No
    roh_vallader Romansch (Vallader) No
    ron_1953 Romanian (1953) rum
    ron_1993 Romanian (1993) No
    ron_2006 Romanian (2006) No
    run Rundi rud1
    rus Russian rus
    sag Sango saj
    sah Yakut sah
    san Sanskrit skt
    sco Scots sco
    sey Secoya 1123
    shk Shilluk shk
    shn Shan sjn
    shp Shipibo-Conibo shp
    sin Sinhala snh
    skr Seraiki skr
    slk Slovak slo
    slv Slovenian slv
    sme Saami, North lpi
    smo Samoan smy
    sna Shona shd
    snk Soninke snn
    snn Siona 1121
    som Somali som
    sot Sotho, Southern sso
    spa Spanish spn
    src Sardinian, Logudorese srd
    srp_cyrl Serbian (Cyrillic) src5
    srp_latn Serbian (Latin) src3
    srr Serer-Sine ses
    ssw Swati swz1
    suk Sukuma sua
    sun Sunda suo
    sus Susu sus
    swb Comorian, Maore swb
    swe Swedish swd
    swh Swahili swa
    tah Tahitian tht
    tam Tamil tcv
    tam_LK Tamil (Sri Lanka) No
    tat Tatar ttr
    tbz Ditammari tbz
    tca Ticuna tca
    tel Telugu tcw
    tem Themne tej
    tet Tetun ttm
    tgk Tajiki pet
    tgl Tagalog tgl
    tha Thai thj
    tha2 Thai (2) No
    tir Tigrigna tgn
    tiv Tiv tiv
    tly Talysh tly
    tob Toba tob
    toi Tonga toi
    toj Tojolabal toj
    ton Tongan tov
    top Totonac, Papantla top
    tpi Tok Pisin pdg
    tsn Tswana tsw
    tso_MZ Tsonga (Mozambique) tso
    tso_ZW Tsonga (Zimbabwe) tso1
    tsz Purepecha 1112
    tuk_cyrl Turkmen (Cyrillic) tck
    tuk_latn Turkmen (Latin) No
    tur Turkish trk
    tyv Tuva tyv
    tzc Tzotzil (Chamula) tzc
    tzh Tzeltal, Oxchuc tzc1
    tzm Tamazight, Central Atlas tzm
    uig_arab Uyghur (Arabic) uig
    uig_latn Uyghur (Latin) No
    ukr Ukrainian ukr
    umb Umbundu mnf
    ura Urarina ura
    urd Urdu urd
    urd_2 Urdu (2) urd
    uzn_cyrl Uzbek, Northern (Cyrillic) uzb1
    uzn_latn Uzbek, Northern (Latin) uzb
    vai Vai vai
    vec Venetian vec
    ven Venda tsh
    ven2 Venda ven
    vep Veps vep
    vie Vietnamese vie
    vmw Makhuwa vmw
    war Waray-Waray wry
    wln Walloon frn1
    wol Wolof wol
    wwa Waama ako
    xho Xhosa xos
    xsm Kasem kas
    yad Yagua yad
    yao Yao yao
    yap Yapese yps
    ydd Yiddish, Eastern ydd
    ykg Yukaghir, Northern ykg
    yor Yoruba yor
    yua Maya, Yucatán yua
    zam Zapotec, Miahuatlán zam
    zdj Comorian, Ngazidja zdj
    zgh Tamazight, Standard Moroccan ama
    zro Záparo 1124
    ztu Zapotec, Güilá ztu1
    zul Zulu zuu

    License

    MIT © Titus Wormer
