markovian-nlp

7.0.4 • Public • Published

Quick start

markovian-nlp is an isomorphic JavaScript package, so clients, servers, and bundlers can each start using it in several ways. Several of these methods require no installation.

RunKit

RunKit provides one of the easiest ways to get started.

CodePen

Declare imports in the JS section to get started:

import {
  ngramsDistribution,
  sentences,
} from 'https://unpkg.com/markovian-nlp@latest?module';
const sentence = sentences({ document: 'oh me, oh my' });
console.log(sentence);
// example output: ['oh me oh me oh my']

Browsers

Insert the following element within the <head> tag of an HTML document:

<script src="https://unpkg.com/markovian-nlp@latest"></script>

After the script is loaded, the markovian browser global is exposed:

const sentence = markovian.sentences({ document: 'oh me, oh my' });
console.log(sentence);
// example output: ['oh me oh me oh my']

Node.js

With npm installed, run the following terminal command:

npm i markovian-nlp

Once installed, declare method imports at the top of each JavaScript file in which they will be used.

ES2015

Recommended

import {
  ngramsDistribution,
  sentences,
} from 'markovian-nlp';

CommonJS

const {
  ngramsDistribution,
  sentences,
} = require('markovian-nlp');

Usage

Markov text generation

Generate text sentences from a Markov process.

Potential applications: Natural language generation

Generate sentences

Providing the optional seed makes the generated sentences deterministic.

In this example, document is text from this source:

sentences({
  count: 3,
  document: 'That there is constant succession and flux of ideas in our minds...',
  seed: 1,
});
 
// output: [
//   'i would promote introduce a constant succession and hindering the path...',
//   'he that train they seem to be glad to be done as may be avoided of our thoughts...',
//   'this wandering of attention and yet for ought i know this wandering thoughts i would promote...',
// ]

View n-grams distribution

View the n-grams distribution of text.

Potential applications: Markov models

ngramsDistribution('birds have featured in culture and art since prehistoric times');
 
// output: {
//   and: { _end: 0, _start: 0, art: 1 },
//   art: { _end: 0, _start: 0, since: 1 },
//   birds: { _end: 0, _start: 1, have: 1 },
//   culture: { _end: 0, _start: 0, and: 1 },
//   featured: { _end: 0, _start: 0, in: 1 },
//   have: { _end: 0, _start: 0, featured: 1 },
//   in: { _end: 0, _start: 0, culture: 1 },
//   prehistoric: { _end: 0, _start: 0, times: 1 },
//   since: { _end: 0, _start: 0, prehistoric: 1 },
//   times: { _end: 1, _start: 0 },
// }

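Each number is a count of occurrences: _start counts how often the word begins the text, _end counts how often it ends the text, and every other key counts how often that word follows.

| startgram | endgram | bigrams |
| --- | --- | --- |
| "birds" | "times" | all remaining keys ("have featured", "featured in", etc.) |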
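
To make the counting concrete, here is a minimal sketch of how a distribution of this shape could be computed for a single sentence. The helper name `sketchDistribution` is hypothetical and not part of markovian-nlp. It splits on whitespace only and ignores punctuation, so it should be read as an illustration of the output shape rather than the library's implementation.

```javascript
// Hypothetical helper (not a markovian-nlp export): counts how often each
// word starts the sentence, ends it, or is followed by another word.
const hasOwn = (object, key) => Object.prototype.hasOwnProperty.call(object, key);

function sketchDistribution(sentence) {
  // Assumption: words are separated by whitespace and case is ignored.
  const words = sentence.toLowerCase().split(/\s+/).filter(Boolean);
  const distribution = {};

  words.forEach((word, index) => {
    if (!hasOwn(distribution, word)) {
      distribution[word] = { _end: 0, _start: 0 };
    }
    const entry = distribution[word];

    if (index === 0) entry._start += 1; // startgram
    if (index === words.length - 1) entry._end += 1; // endgram

    const next = words[index + 1];
    if (next !== undefined) {
      // bigram: `word` followed by `next`
      entry[next] = (hasOwn(entry, next) ? entry[next] : 0) + 1;
    }
  });

  return distribution;
}

const sketched = sketchDistribution(
  'birds have featured in culture and art since prehistoric times'
);
console.log(sketched.birds); // { _end: 0, _start: 1, have: 1 }
console.log(sketched.times); // { _end: 1, _start: 0 }
```

For this input the sketch yields the same counts as the output shown above; text with punctuation or several sentences would need real tokenization.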

API

ngramsDistribution(document || ngramsDistribution)

ngramsDistribution(Array(document || ngramsDistribution[, ...]))

Input

| type | description |
| --- | --- |
| String | document (corpus or text) |
| Object | ngramsDistribution (equivalent to identity, i.e. this method's output) |
| Array[Strings...] | combine multiple documents |
| Array[Objects...] | combine multiple ngramsDistribution objects |
| Array[Strings, Objects...] | combine multiple documents and ngramsDistribution objects |

Return value

| type | description |
| --- | --- |
| Object | distributions of unigrams to startgrams, endgrams, and following bigrams |

// pseudocode signature representation (does not run)
ngramsDistribution(document) => ({
  ...unigrams: {
    ...{ ...bigram: bigramsDistribution },
    _end: endgramsDistribution,
    _start: startgramsDistribution,
  },
});
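
The array forms above combine several inputs into one distribution. Since each number is a sum of occurrences, one plausible way to picture the combination is to add counts key by key. The sketch below does that for distributions that are already in object form; `mergeDistributions` is a hypothetical name, and summing is an assumption about the behavior, not a documented guarantee.

```javascript
// Hypothetical helper (not a markovian-nlp export): merges distributions
// by summing the counts stored under matching unigram and bigram keys.
const ownKey = (object, key) => Object.prototype.hasOwnProperty.call(object, key);

function mergeDistributions(distributions) {
  const merged = {};

  for (const distribution of distributions) {
    for (const [unigram, counts] of Object.entries(distribution)) {
      if (!ownKey(merged, unigram)) {
        merged[unigram] = { _end: 0, _start: 0 };
      }
      const target = merged[unigram];

      for (const [key, count] of Object.entries(counts)) {
        target[key] = (ownKey(target, key) ? target[key] : 0) + count;
      }
    }
  }

  return merged;
}

const first = { oh: { _end: 0, _start: 1, me: 1 }, me: { _end: 1, _start: 0 } };
const second = { oh: { _end: 0, _start: 1, my: 1 }, my: { _end: 1, _start: 0 } };

console.log(mergeDistributions([first, second]).oh);
// { _end: 0, _start: 2, me: 1, my: 1 }
```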

sentences({ distribution || document[, count][, seed] })

Input

| user-defined parameter | type | optional | default value | implements | description |
| --- | --- | --- | --- | --- | --- |
| options.count | Number | true | 1 | | Number of sentences to output. |
| options.distribution | Object | required if options.document omitted | | | n-grams distribution used in place of text. |
| options.document | String | required if options.distribution omitted | | compromise(document) | Text used in place of n-grams distribution. |
| options.seed | Number | true | undefined | Chance(seed) | Leave undefined (default) for nondeterministic results, or specify seed for deterministic results. |

Return value

| type | description |
| --- | --- |
| Array[Strings...] | generated sentences |
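
To show why a seed makes the output repeatable, the following is a rough sketch of a seeded random walk over an n-grams distribution. Everything here is an assumption for illustration: `mulberry32` is a small generic PRNG standing in for the seeded generator the library uses, and `pickWeighted`, `sketchSentence`, and the `maxWords` cap are hypothetical names, not part of markovian-nlp.

```javascript
// Small seeded PRNG (mulberry32), a stand-in only; returns floats in [0, 1).
function mulberry32(seed) {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Picks a key with probability proportional to its positive count.
function pickWeighted(weights, random) {
  const entries = Object.entries(weights).filter(([, weight]) => weight > 0);
  const total = entries.reduce((sum, [, weight]) => sum + weight, 0);
  let threshold = random() * total;
  for (const [key, weight] of entries) {
    threshold -= weight;
    if (threshold < 0) return key;
  }
  // Fallback for floating-point edge cases or an empty set of weights.
  return entries.length ? entries[entries.length - 1][0] : undefined;
}

// Hypothetical walk: start at a startgram, follow bigram counts, and stop
// when `_end` is drawn or the word cap is reached.
function sketchSentence(distribution, seed, maxWords = 20) {
  const random = mulberry32(seed);
  const startWeights = {};
  for (const [word, counts] of Object.entries(distribution)) {
    startWeights[word] = counts._start;
  }

  const words = [];
  let current = pickWeighted(startWeights, random);
  while (current !== undefined && current !== '_end' && words.length < maxWords) {
    words.push(current);
    // `_start` is not a transition; `_end` competes with the following words.
    const { _start, ...transitions } = distribution[current] || {};
    current = pickWeighted(transitions, random);
  }
  return words.join(' ');
}

const exampleDistribution = {
  oh: { _end: 0, _start: 2, me: 1, my: 1 },
  me: { _end: 1, _start: 0, oh: 1 },
  my: { _end: 1, _start: 0 },
};

// The same seed always reproduces the same walk.
console.log(sketchSentence(exampleDistribution, 1) === sketchSentence(exampleDistribution, 1)); // true
```

Different seeds may produce different sentences from the same distribution, which is the behavior the seed option describes.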

Glossary

Learn more about computational linguistics and natural language processing (NLP) on Wikipedia.

The following terms are used in the API documentation:

| term | description |
| --- | --- |
| bigram | 2-gram sequence |
| deterministic | repeatable, non-random |
| endgram | final gram in a sequence |
| n-gram | contiguous gram (word) sequence |
| startgram | first gram in a sequence |
| unigram | 1-gram sequence |

License

MIT

Unpacked Size

1.88 MB

Total Files

7

Collaborators

  • stassi