ITALIAN-HUMAN-TO-DATE
Convert Italian-language strings to JavaScript dates.
- Handles a wide range of Italian date expressions
- Returns a range of dates if the original text contains a time period (e.g. 'ultimo mese', the last month)
- Can receive a readable stream as input and a writable stream for output
Installation
npm install italian-human-to-date
Basic Example
import DateExtractor from 'italian-human-to-date';
const DE = new DateExtractor();
const response = await DE.extract('oggi');
console.log(response);
// PRINT
// {
//   origin: 'oggi',
//   dates: [ 2022-04-19T22:00:00.000Z ],
//   ranges: [],
//   adjustedTokes: [ 'oggi' ],
//   residualTokens: [],
//   usedTokens: [ 'oggi' ]
// }
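The timestamp in `dates` is printed in UTC, which is why it shows the previous evening: 2022-04-19T22:00:00.000Z corresponds to midnight of 20 April 2022 in the default 'Europe/Rome' time zone. A minimal sketch of rendering such a value as a local calendar day follows. `toLocalDay` is a hypothetical helper, not part of the library, and it relies only on the standard `Intl.DateTimeFormat` API.

```javascript
// Hypothetical helper (not part of italian-human-to-date):
// render a Date as YYYY-MM-DD in a given IANA time zone.
function toLocalDay(date, timeZone = 'Europe/Rome') {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(date);
  // Pick the numeric fields out of the formatted parts.
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// Sample value copied from the printed output of the Basic Example.
const today = new Date('2022-04-19T22:00:00.000Z');
console.log(toLocalDay(today)); // 2022-04-20
console.log(toLocalDay(today, 'UTC')); // 2022-04-19
```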
Multiple Examples
import DateExtractor from 'italian-human-to-date';
const DE = new DateExtractor();
const test = [
  'oggi',
  'ieri',
  'domani',
  'evento di sabato 26 marzo 2021',
  'appuntamento 29 marzo',
  'domenica giorno 27 ieri',
  'nato il 20-11-1973',
  'mese scorso',
  'ultimo mese',
  'mese prossimo',
  'anno scorso',
  'anno prossimo',
  'ultima settimana',
  'settimana scorsa',
  'settimana prossima',
  'dopodomani',
  'dopo domani',
  'altroieri',
  'avantieri',
  'altro ieri',
  'tredici giorni fa',
  'tra un giorno',
  'sei mesi fa',
  'fra dieci giorni',
  '10 anni fa',
  'ultimi tre giorni',
  'prossimi 3 giorni',
  'ultimi due anni',
  'prossimi sei mesi',
];
DE.on('data', (data) => {
  console.log(data);
  console.log(' ');
});
test.forEach((data) => DE.extract(data));
// PRINT
// {
//   origin: 'evento di sabato 26 marzo 2021',
//   dates: [ 2021-03-25T23:00:00.000Z, 2022-04-15T22:00:00.000Z ],
//   ranges: [],
//   adjustedTokes: [ 'event', 'sab', '26', 'marz', '2021' ],
//   residualTokens: [ 'event' ],
//   usedTokens: [ 'sab', '26', 'marz', '2021' ]
// }
// ...
// {
//   origin: 'prossimi sei mesi',
//   dates: [],
//   ranges: [
//     { start: 2022-04-19T22:00:00.000Z, end: 2022-10-31T22:59:59.999Z }
//   ],
//   adjustedTokes: [ 'prossim', '6', 'mes' ],
//   residualTokens: [],
//   usedTokens: [ 'prossim', '6', 'mes' ]
// }
// ...
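The loop in this example starts every extraction without waiting for the previous one. If the results are needed as a single array in input order, one option is to await each call in turn. The sketch below assumes only what the Basic Example shows, namely that `extract` returns a promise resolving to one result object for a string input. `extractInOrder` is a hypothetical helper, and `extractor` stands for any object with such a method.

```javascript
// Hypothetical helper: run extractions one at a time, keeping input order.
// `extractor` is assumed to expose an async extract(text) method.
async function extractInOrder(extractor, texts) {
  const results = [];
  for (const text of texts) {
    // Wait for each extraction before starting the next one.
    results.push(await extractor.extract(text));
  }
  return results;
}
```

With the instance and array from this example it would be called as `const results = await extractInOrder(DE, test);`.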
Constructor
The Extractor works without any customization; if needed, the following parameters can be set.
new DateExtractor([{ params }]);
Params Object
Param | Description | Default |
---|---|---|
now | The reference date | Date.now() |
timeZone | Time Zone to use | 'Europe/Rome' |
sequence | The order in which the extractor sub-modules run | [ 'absolute', 'canonical', 'relativeStep', 'relativeRange', 'relativeAbsolute', 'relativePeriod', 'relativeWeekDay', 'relative' ] |
vocabulary | The vocabulary of Italian terms to use to extract dates | Default Vocabulary |
textProcessor | The text preprocessing sequence | { stemmer: 'PorterStemmerIt', stopwords: true, duplicate: false, diacritics: true, punctuation: true, lowercase: true, trim: true, tokenize: true } |
verbose | Print every processing step | false |
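As an illustration of the table above, a params object that pins the reference date might look like the sketch below, which can be useful for reproducible tests. It assumes that `now` accepts epoch milliseconds, the same type as the `Date.now()` default; whether a `Date` instance is also accepted is not stated here.

```javascript
// Sketch of a constructor params object (keys taken from the table above).
// Assumption: `now` is given as epoch milliseconds, like Date.now().
const params = {
  now: Date.parse('2022-04-20T09:00:00+02:00'), // fixed reference date
  timeZone: 'Europe/Rome', // the documented default, shown for clarity
  verbose: false, // set to true to print every processing step
};

// It would then be passed as: new DateExtractor(params)
console.log(new Date(params.now).toISOString()); // 2022-04-20T07:00:00.000Z
```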
Extract Method
The asynchronous extract method accepts a string, an array of strings, or a readable stream as input.
await DE.extract(data, [{ params }]);
It can be configured with the following parameters:
Param | Description | Default |
---|---|---|
now | The reference date [overrides the constructor reference date] | Date.now() |
timeZone | Time Zone to use [overrides the constructor time zone] | 'Europe/Rome' |
cb | A callback to invoke on every data extraction | |
outStream | A writable stream to write data to on every extraction | |
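The `cb` option can be used to collect results as they are produced. The sketch below assumes the callback receives the same result object that `extract` returns, which is not spelled out above. `makeCollector` is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: build a callback that stores every result it receives.
// Assumption: `cb` is called with one result object per extraction.
function makeCollector() {
  const results = [];
  const cb = (result) => {
    results.push(result);
  };
  return { cb, results };
}

const collector = makeCollector();
// It would be passed as:
//   await DE.extract(['oggi', 'ieri'], { cb: collector.cb });
// after which collector.results would hold the collected objects.
```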
Return Object
The extract method returns an object for each input string. The object has the following keys:
Key | Description | Example |
---|---|---|
origin | The original string | prossimi sei mesi |
dates | Array of extracted JS dates | [ 2021-03-25T23:00:00.000Z, 2022-04-15T22:00:00.000Z ] |
ranges | Array of extracted date-range objects | [ { start: 2022-04-19T22:00:00.000Z, end: 2022-10-31T22:59:59.999Z } ] |
adjustedTokes | Array of tokens after text processing | [ 'prossim', '6', 'mes' ] |
residualTokens | Array of tokens not used for date extraction | [ 'event' ] |
usedTokens | Array of tokens used for date extraction | [ 'prossim', '6', 'mes' ] |
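Since a result may carry single dates, ranges, or both, consumers may want a uniform shape. The sketch below folds both arrays into one list of `{ start, end }` intervals, treating a single date as a zero-length interval. `toIntervals` is a hypothetical helper, and that convention is an assumption made for illustration, not library behaviour.

```javascript
// Hypothetical helper: merge `dates` and `ranges` into one sorted interval list.
function toIntervals(result) {
  // A single date becomes an interval that starts and ends at the same instant.
  const fromDates = result.dates.map((d) => ({ start: d, end: d }));
  const fromRanges = result.ranges.map(({ start, end }) => ({ start, end }));
  return [...fromDates, ...fromRanges].sort(
    (a, b) => a.start.getTime() - b.start.getTime()
  );
}

// Sample result shaped like the 'prossimi sei mesi' example values.
const sample = {
  origin: 'prossimi sei mesi',
  dates: [],
  ranges: [
    {
      start: new Date('2022-04-19T22:00:00.000Z'),
      end: new Date('2022-10-31T22:59:59.999Z'),
    },
  ],
};
console.log(toIntervals(sample).length); // 1
```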
Events
The Extractor emits an event with label "data" at each data extraction:
DE.on('data', (data) => {
  console.log(data);
  console.log(' ');
});
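When only the next emitted result is of interest, the listener can be wrapped in a promise. The sketch below assumes nothing beyond the `on('data', ...)` method shown in the snippet above. `nextResult` is a hypothetical helper; note that the listener stays registered after the promise resolves, since no removal method is documented here.

```javascript
// Hypothetical helper: resolve with the next result emitted on 'data'.
// `emitter` is any object with an on(eventName, listener) method.
function nextResult(emitter) {
  return new Promise((resolve) => {
    // Later emissions call resolve again, which has no further effect.
    emitter.on('data', resolve);
  });
}
```

A possible usage would be `const pending = nextResult(DE); DE.extract('domani'); const result = await pending;`.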