This is the readme for markov-strings 3.x.x. The docs for the older 2.x.x are here.
Markov-strings
A simplistic Markov chain text generator. Give it an array of strings, and it will output a randomly generated string.
This module was created for the Twitter bot @BelgicaNews.
Prerequisites
Built and tested with NodeJS 12
Installing
npm install --save markov-strings
Usage
const Markov = require('markov-strings').default
// or
import Markov from 'markov-strings'

const data = [/* insert a few hundred or thousand sentences here */]

// Build the Markov generator
const markov = new Markov({ stateSize: 2 })

// Add data for the generator
markov.addData(data)

const options = {
  maxTries: 20, // Give up if I don't have a sentence after 20 tries (default is 10)

  // If you want to get seeded results, you can provide an external PRNG.
  prng: Math.random, // Default value if left empty

  // You'll often need to manually filter raw results to get something that fits your needs.
  filter: (result) => {
    return result.string.split(' ').length >= 5 && // At least 5 words
      result.string.endsWith('.')                  // End sentences with a dot.
  }
}

// Generate a sentence
const result = markov.generate(options)
console.log(result)
/*
{
  string: 'lorem ipsum dolor sit amet etc.',
  score: 42,
  tries: 5,
  refs: [ an array of objects ]
}
*/
Markov-strings is built in TypeScript, and exports several types to help you. Take a look at the source to see how it works.
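The filter from the usage example can be pulled out into a standalone function, which makes it easy to check against hand-written results before wiring it into the generator. This is only a minimal sketch: the name isUsableSentence and the minWords parameter are hypothetical, and the only thing assumed about a result is its string field.

```javascript
// Hypothetical helper (not part of markov-strings): accept a result only
// if it has at least `minWords` words and ends with a dot.
function isUsableSentence(result, minWords = 5) {
  const text = result.string.trim()
  const words = text.split(/\s+/)
  return words.length >= minWords && text.endsWith('.')
}

// Hand-written objects shaped like a generation result
console.log(isUsableSentence({ string: 'lorem ipsum dolor sit amet.' })) // true
console.log(isUsableSentence({ string: 'lorem ipsum.' }))                // false
```

It could then be used as filter: (result) => isUsableSentence(result) in the options passed to .generate().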
API
new Markov([options])
Create a generator instance.
options
stateSize: number
The stateSize is the number of words for each "link" of the generated sentence. A value of 1 will output gibberish sentences without much sense. A value of 2 is a sensible default for most cases. A value of 3 or more can create good sentences if you have a corpus that allows it.
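To get a feel for what a "link" looks like at different sizes, the sketch below splits a sentence into overlapping groups of stateSize words. It is an illustration only: toStates is a hypothetical function and is not the library's internal code.

```javascript
// Illustration only: split a sentence into overlapping groups of
// `stateSize` words. This is not how markov-strings is implemented.
function toStates(sentence, stateSize) {
  const words = sentence.split(' ')
  const states = []
  for (let i = 0; i + stateSize <= words.length; i++) {
    states.push(words.slice(i, i + stateSize).join(' '))
  }
  return states
}

console.log(toStates('the quick brown fox', 1)) // [ 'the', 'quick', 'brown', 'fox' ]
console.log(toStates('the quick brown fox', 2)) // [ 'the quick', 'quick brown', 'brown fox' ]
console.log(toStates('the quick brown fox', 3)) // [ 'the quick brown', 'quick brown fox' ]
```

Longer groups carry more context, but each one appears less often in the data, which is why a larger stateSize needs a larger corpus.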
.addData(data)
To function correctly, the Markov generator needs its internal data to be correctly structured. .addData(data) lets you add raw data, which is automatically formatted to fit the internal structure.
You can call .addData(data) as often as you need, with new data each time (!). Calling .addData() several times with the same data is not recommended, because it will skew the random generation of results.
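One way to avoid feeding the same sentences twice is to remember what has already been added. The following is a minimal sketch for plain string data; makeDeduper and onlyNew are hypothetical names, not part of markov-strings.

```javascript
// Hypothetical helper: returns a function that keeps only the sentences
// it has not seen in any previous call.
function makeDeduper() {
  const seen = new Set()
  return function onlyNew(sentences) {
    const fresh = []
    for (const s of sentences) {
      if (!seen.has(s)) {
        seen.add(s)
        fresh.push(s)
      }
    }
    return fresh
  }
}

const onlyNew = makeDeduper()
console.log(onlyNew(['lorem ipsum', 'dolor sit amet'])) // [ 'lorem ipsum', 'dolor sit amet' ]
console.log(onlyNew(['dolor sit amet', 'consectetur']))  // [ 'consectetur' ]
```

Each batch would then be added with markov.addData(onlyNew(batch)).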
data
string[] | Array<{ string: string }>
data is an array of strings (sentences), or an array of objects. If you wish to use objects, each one must have a string attribute. The bigger the array, the better and more varied the results.
Examples:
[ 'lorem ipsum', 'dolor sit amet' ]
or
[
  { string: 'lorem ipsum', attr: 'value' },
  { string: 'dolor sit amet', attr: 'other value' }
]
The additional data passed with objects will be returned in the refs array of the generated sentence.
.generate([options])
Returns an object of type MarkovResult, with the string, score, tries and refs fields shown in the usage example above.
The refs array will contain all objects that have been used to build the sentence. This may be useful to fetch metadata or compute statistics.
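As an example of computing statistics from refs, the sketch below counts how many source objects share each value of a given attribute. countRefsBy is a hypothetical helper, and the author attribute is an illustrative extra field of the kind you might attach to your own data. The library does not define it.

```javascript
// Hypothetical helper: count how many refs share each value of `key`.
function countRefsBy(result, key) {
  const counts = new Map()
  for (const ref of result.refs) {
    counts.set(ref[key], (counts.get(ref[key]) || 0) + 1)
  }
  return counts
}

// Hand-written object shaped like a generation result
const sample = {
  string: 'lorem ipsum dolor sit amet.',
  refs: [
    { string: 'lorem ipsum dolor', author: 'alice' },
    { string: 'dolor sit amet.', author: 'bob' },
    { string: 'ipsum dolor sit', author: 'alice' }
  ]
}

const byAuthor = countRefsBy(sample, 'author')
console.log(byAuthor.get('alice')) // 2
console.log(byAuthor.get('bob'))   // 1
```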
Since .generate() can potentially take several seconds or more, a non-blocking variant, .generateAsync(), is available if you need it.
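options
The generation options are the ones shown in the usage example: maxTries (the number of tries before giving up, 10 by default), prng (a function used in place of Math.random, which is the default) and filter (a function that receives a candidate result and returns true to keep it).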
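For seeded results, any function that behaves like Math.random can serve as the external PRNG. The sketch below uses mulberry32, a small well-known generator; it is not part of markov-strings. It assumes the prng option is called with no arguments and expects a number in [0, 1), as Math.random provides.

```javascript
// mulberry32: a small seedable PRNG returning numbers in [0, 1).
// Assumption: the `prng` option accepts any drop-in for Math.random.
function mulberry32(seed) {
  let a = seed >>> 0
  return function () {
    a = (a + 0x6D2B79F5) >>> 0
    let t = a
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Two generators built from the same seed produce the same sequence
const first = mulberry32(1234)
const second = mulberry32(1234)
console.log(first() === second()) // true
```

A seeded generator could then be passed as prng: mulberry32(1234) in the options given to .generate().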
.export() and .import(data)
You can export and import the corpus built by the generator. The exported data is a serializable object. If you serialize it (for example to JSON), it must be deserialized before being re-imported.
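A round trip through JSON could look like the sketch below. corpusToJson and corpusFromJson are hypothetical helpers that work with any object exposing export() and import(data). The fakeMarkov object is only a stand-in so the example runs on its own, and its exported shape is made up.

```javascript
// Hypothetical helpers: serialize the exported corpus to a JSON string,
// and parse it back before handing it to import().
function corpusToJson(markov) {
  return JSON.stringify(markov.export())
}

function corpusFromJson(markov, json) {
  markov.import(JSON.parse(json))
  return markov
}

// Stand-in for a real generator instance, for illustration only
const fakeMarkov = {
  state: { sentences: ['lorem ipsum'] },
  export() { return this.state },
  import(data) { this.state = data }
}

const json = corpusToJson(fakeMarkov)
fakeMarkov.state = null                        // simulate a fresh instance
corpusFromJson(fakeMarkov, json)
console.log(fakeMarkov.state.sentences[0])     // lorem ipsum
```

The JSON string can be stored wherever is convenient, for instance in a file, and loaded again at startup instead of rebuilding the corpus.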
Changelog
3.0.0
- Refactoring to facilitate iterative construction of the corpus (multiple .addData() instead of a one-time buildCorpus()), and export/import of corpus internal data.
2.1.0
- Add an optional prng parameter at generation to use a specific Pseudo Random Number Generator
2.0.4
- Dependencies update
2.0.0
- Refactoring with breaking changes
- The constructor and generator take two different options objects
- Most generator options are gone, except filter and maxTries
- Tests have been rewritten with jest, in TypeScript
1.5.0
- Code rewritten in TypeScript. You can now import MarkovGenerator from 'markov-strings'
1.4.0
- New filter() method, thanks @flpvsk
1.3.4 - 1.3.5
- Dependencies update
1.3.3
- Updated README. Version bump for npm
1.3.2
- Fixed an infinite loop bug
- Performance improvement
1.3.1
- Updated README example
- Removed a useless line
1.3.0
- New feature: the generator now accepts arrays of objects, and tells the user which objects were used to build a sentence
- Fixed all unit tests
- Added a changelog
Running the tests
npm test