Orolo
Yet another natural language dates parser for JavaScript with a focus on understanding dates and date ranges in a sentence. Orolo is an aggressive parser designed to work on sentenses that are known to contain dates.
Why
There are very good NLP date parses for JavaScript:
There are also strong general purpose NLP frameworks that some with pre-build named entity recognition for dates:
Unfortunately, I couldn't find one that would properly understand these sentences:
- Will be in New York on May 5th, 6th, and 7th
- Will be in London from june third till june sixth
- I was in SFO this past Tue and Wed
and that's how orolo was born.
Install
npm install --save orolo
Usage
const Orolo = ; const orolo = ;const dates = orolo;
If you know that the dates in the sentence are not relative to today:
const context = '2018-05-01';const dates = orolo;
How It Works
Orolo works in two stages. First, it parses the sentence looking for date tokens. Then it computes the tokens by arranging them into a computable sequence.
Locales
Right now orolo only supports English but it is designed to support other languages as well. You will need to implement the ILocale
interface. The best way to do it at the moment is to take a look at how EnglishLocale
is implemented. In short, you specify how to tokenize a sentence, you also specify how many language tokens (words) can a date token span, and then you implement token detectors for each token type that orolo supports.
Aggressive Parsing
Orolo takes everything that looks like a valid date token and then attempts to compute a date. The following two sentenses will compute down to May 5th
:
It happened on May 5
I bought 5 shirts in May
There is no logic currently to look at either dependency graph or proximity of tokens to each other.
Parsed Sequence
If you would like to take advantage of parsing logic but don't want/need orolo to compute the dates:
const parse EnglishLocale = ;const sequence = ; // sequence.tokens will return an Array of parsed tokens