This should be divided into 2 libraries:
- time parsing
npm install epic
TODO browser version.
Period objects. Periods can have start and/or end times, and/or durations. Time is fuzzy. There is also the
The API should look something like this:

    var epic = require('epic');

    assert.equal(
      epic.annotate('I went to school the day before yesterday...'),
      'I went to school <time>the day before yesterday</time>...'
    );

    epic.period({ hours: 12 }).toInt();
    epic.hours(12).toInt();
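To make the intended `annotate` behaviour concrete, here is a minimal sketch. It is not the epic implementation: the `PHRASES` list and the regex approach are assumptions chosen for illustration, and a real recognizer would need far broader coverage.

```javascript
// Hypothetical sketch, not the epic implementation.
// A tiny, hand-picked set of relative-day phrases (an assumption).
const PHRASES = [
  'the day before yesterday',
  'the day after tomorrow',
  'yesterday',
  'tomorrow',
  'today',
];

// One case-insensitive alternation that matches whole words only.
const TIME_PATTERN = new RegExp('\\b(' + PHRASES.join('|') + ')\\b', 'gi');

// Wrap every recognised phrase in <time>...</time>.
function annotate(text) {
  return text.replace(TIME_PATTERN, '<time>$1</time>');
}

console.log(annotate('I went to school the day before yesterday...'));
```

Text without any of the listed phrases is returned unchanged.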
Internals of the API
    var epic = require('epic'),
        Period = epic.Period,
        Interval = epic.Interval,
        Instant = epic.Instant;

    // If you add this period to the 1st February (ISO) then you will get the 1st March.
    // If you add the same period to the 1st March you will get the 1st April.
    // But the duration added (in milliseconds) in these two cases is very different.
    var period = new Period({ months: 1 });
    var instant = Instant.parse('Feb 1, 2012');
    instant.add(period);
    instant.toString(); // "Mar 1, 2012"

    // more complicated
    // this says the winter solstice is on Dec 21, 2012,
    // sometime between 6:00pm and 11:59pm (just made those numbers up)
    var winterSolstice = new Interval('Dec 21, 2012 at 6:00pm', new Period({ hours: 6 })); // better name is "Duration"

    // then you could go to next year's winter solstice:
    winterSolstice.add(Period.years(1));
    // and it should be Dec 21, 2013 from 6pm - 11:59pm
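The difference between a calendar period and a fixed duration can be checked with the built-in `Date` object. This is only a sketch: `addMonths` is a hypothetical helper, not part of epic, and it works in UTC so daylight saving time does not affect the numbers.

```javascript
// Hypothetical helper: add calendar months to a UTC date.
// No day-of-month clamping is done, which is fine for the 1st of a month.
function addMonths(date, months) {
  const result = new Date(date.getTime());
  result.setUTCMonth(result.getUTCMonth() + months);
  return result;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

const feb1 = new Date(Date.UTC(2012, 1, 1)); // months are zero-based
const mar1 = addMonths(feb1, 1);
const apr1 = addMonths(mar1, 1);

// The same "one month" period spans a different number of days each time.
const febDays = (mar1 - feb1) / MS_PER_DAY; // 29 (2012 is a leap year)
const marDays = (apr1 - mar1) / MS_PER_DAY; // 31

console.log(mar1.toISOString(), febDays, marDays);
```

This is why a period such as "1 month" cannot be stored as a fixed number of milliseconds: it only becomes a duration once it is anchored to a starting instant.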
Then some BC calculations:
    var originOfAgriculture = epic.parse('10,000 BCE');

    // which you could manually set up like this
    // (but would have to know the internals of how BCE is handled)
    var originOfAgriculture = new Instant();
    originOfAgriculture.subtract({ years: 2012 }).subtract({ years: 10000 });

    // there should be better ways to do this.

http://mail-archives.apache.org/mod_mbox/xml-xalan-dev/201204.mbox/%3C4F8B913C.email@example.com%3E
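One way BCE years might be handled internally is astronomical year numbering, which has a year 0 (1 BCE = 0, 2 BCE = -1, and so on). The sketch below assumes that convention; `parseEraYear` is a hypothetical helper, not an epic function.

```javascript
// Hypothetical helper: parse "10,000 BCE" or "2012 CE" into an astronomical year.
// Astronomical numbering has a year 0, so N BCE maps to 1 - N.
function parseEraYear(text) {
  const match = /^\s*(\d[\d,]*)\s*(BCE|BC|CE|AD)\s*$/i.exec(text);
  if (!match) return null;
  const year = parseInt(match[1].replace(/,/g, ''), 10);
  const era = match[2].toUpperCase();
  return era === 'BCE' || era === 'BC' ? 1 - year : year;
}

const agriculture = parseEraYear('10,000 BCE'); // -9999
const written = parseEraYear('2012 CE');        // 2012

// Years elapsed between the two: 12011, not 12012,
// because the BCE/CE convention has no year 0.
console.log(written - agriculture);
```

The off-by-one is the kind of internal detail the manual `subtract` version above would force callers to know about.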
    // 1 million years ago
    var mya = new Period({ years: -1000000 });
    var mya = epic.period({ years: -1000000 });

    // periods can have start and end dates, or they don't have to.
    var triassicPeriod = new Period({ years: -250 * 1000000 }, { years: -200 * 1000000 });
    triassicPeriod.startTime.add({ years: 10 });

    var triassicPeriod = new Period(-250, -200, 'mya');
    var triassicPeriod = new Period(-250, 'ma', 200, 'ma');
    // maybe "unit" specifies the default operator or something
    var triassicPeriod = new Period(-250, 'mya', -200, 'mya');
    triassicPeriod.startTime.add(100);

    var extinction = new Instant(65, 'mya');

    // how many years after the triassic period?
    var periodBetween = new Period(triassicPeriod.endTime, extinction);
    var millionYearsAfter = periodBetween.duration('ma'); // 200 - 65 = 135 ma
    var yearsAfter = periodBetween.duration('years'); // 135 ma = 135,000,000 years
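Geological time is outside what the built-in `Date` can represent (its range is about 273,790 years either side of 1970), so deep-time instants need their own representation. The sketch below stores them as plain "years before present" numbers. The `YEARS_PER_UNIT` table and both helper names are assumptions for illustration only.

```javascript
// Hypothetical unit table: how many years one unit stands for.
const YEARS_PER_UNIT = { years: 1, ka: 1e3, ma: 1e6, mya: 1e6 };

// An instant in deep time, stored as years before present.
function yearsAgo(value, unit) {
  return value * YEARS_PER_UNIT[unit];
}

// Elapsed time from an earlier instant to a later one, in the requested unit.
function durationBetween(earlierYearsAgo, laterYearsAgo, unit) {
  return (earlierYearsAgo - laterYearsAgo) / YEARS_PER_UNIT[unit];
}

const triassicEnd = yearsAgo(200, 'mya');
const extinction = yearsAgo(65, 'mya');

console.log(durationBetween(triassicEnd, extinction, 'ma'));    // 135
console.log(durationBetween(triassicEnd, extinction, 'years')); // 135000000

// A Date that far back is simply invalid:
const tooOld = new Date(-250e6 * 365.25 * MS_PER_DAY_APPROX());
function MS_PER_DAY_APPROX() { return 24 * 60 * 60 * 1000; }
console.log(Number.isNaN(tooOld.getTime())); // true
```

Keeping the unit separate from the magnitude also leaves room for the fuzziness mentioned earlier, since "65 mya" is far less precise than "65,000,000 years ago" suggests.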
This should print the date in a standard format that can be saved to a database. Some ideas are: (TODO)
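One candidate format, offered only as an assumption since the list of ideas is still open, is the ISO 8601 string produced by the built-in `Date.prototype.toISOString`. It sorts and round-trips well, and it uses a signed six-digit year for dates outside years 0 to 9999. `toStorageString` and `fromStorageString` are hypothetical names.

```javascript
// Hypothetical helpers: serialise a Date for storage and read it back.
function toStorageString(date) {
  return date.toISOString();
}

function fromStorageString(text) {
  return new Date(text);
}

const modern = new Date(Date.UTC(2012, 1, 1));
const ancient = new Date(Date.UTC(-9999, 0, 1)); // astronomical year -9999

console.log(toStorageString(modern));  // "2012-02-01T00:00:00.000Z"
console.log(toStorageString(ancient)); // "-009999-01-01T00:00:00.000Z"

// The stored string parses back to the same instant.
console.log(fromStorageString(toStorageString(ancient)).getTime() === ancient.getTime());
```

This only covers exact instants within the range of `Date`. Fuzzy times, open-ended periods, and geological dates would need a different encoding.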
Example sentences:

- John left by the time he left
- John left by the time he ate
- John ate by the time he left
- John left by the time he ate

Kinds of temporal reference:

- April is usually wet. (generic)
- I was born on a Tuesday. (indefinite)
- I had swimming classes on every Tuesday in 1999. (gapped interval)
- Winter 1999 was extremely severe. (vague)
- We got married about three years ago. (approximate)
- The first three days of every month are always the busiest. (set of intervals)
- The movie is two hours long. (unanchored interval)
- She left five days after he came back. (event-dependent)

Category: examples

- Adverbs: simultaneously, currently, lately, today, yesterday, tomorrow
- Frequency Adverbs: a lot, always, ever, frequently, hardly ever, never, normally, occasionally, often, frequently, rarely, sometimes, usually, hourly, daily, weekly or monthly
- Prepositional Phrase: on Monday, at 5 o'clock, for one hour, in January, over many years, during the weekend, after the meeting, before 8pm, between 11am and 1pm, by Monday morning, since 1978, until January 2006, from 1939 to 1945, within one hour, following the meeting
- Postpositional Phrase: five months ago, five months hence, five months on, the whole night through
- Other Adverbial Phrases: later than ever before, at least five years, all spring, on Tuesday at noon
- Adverbial Clause: when she saw the snake, as soon as I have any news
- Noun Phrase: coming weeks, a beautiful morning, cold winters

More example sentences:

- John visited his parents twice in two years.
- John learnt Japanese for half an hour every morning for a month.
- John washed cars from morning till night from June till August.
- John had arrived on Tuesday at noon.
- On Tuesday, John had arrived at noon.
- Last week, John had arrived 3 days ago.
- John had arrived at noon on Tuesday.
- John had arrived 3 days ago last week.

Category: lexical triggers / non-triggers

- Noun
  - Triggers: minute, afternoon, midnight, day, night, weekend, month, summer, season, quarter, year, decade, century, millennium, era, semester, future, past, time, period, point
  - Non-triggers: instant, jiffy, episode, occasion, tenure, timetable, reign, light year, megawatt hour, lifetime, history
- Adjective
  - Triggers: recent, former, current, future, past, daily, monthly, biannual, semiannual, daytime, daylong, onetime, ago, preseason, short-term, long-term
  - Non-triggers: early, ahead, next, subsequent, frequent, perpetual, later, contemporary, simultaneous, preceding, previous, existing, modern
- Adverb
  - Triggers: currently, lately, hourly, daily, monthly, ago (+ adverbial forms of adjective triggers)
  - Non-triggers: earlier, immediately, instantly, forthwith, meanwhile, heretofore, previously, next, beforehand, following, later, soon, sooner, shortly, eventually, occasionally, once, still, again, timely, whenever
- Time noun/adverb
  - Triggers: now, today, yesterday, tomorrow
- Number
  - Triggers: 3, three, third, sixties
- Proper name
  - Triggers: Monday, January, New Year's Eve, Washington's Birthday, Solstice
- Pronouns
  - Triggers: that, then, it (only pronouns that co-refer with a markable expression)
- Time patterns
  - Triggers: 8:00, 12/02/2000, 1994, 1960s

Example phrases:

- that cold day
- the next day
- late last night
- earlier that year
- next summer
- recent decades
- numerous Saturdays
- more than a month
- no less than 60 days
- just a year ago
- only one hour long
- its own future
- the country's future
- five years old
- a few weeks later
- hours earlier
- five days after he came back
- three decades ago
- the second-best quarter ever
- months of renewed hostility
- a historic day for the European enterprise
- nearly four decades of experience

Spoken examples:

- Perhaps in the next two weeks.
- How 'bout the afternoon of Monday the ninth?
- Okay, how 'bout Tuesday March the sixteenth sometime after twelve o'clock pm?

More expressions to handle:

- 9th Sep. 1987
- an hour and 30 minutes
- Tuesday and Thursday
- Wednesday or Friday
- 4 o'clock
- next month
- exactly one minute ago
- the 2nd Sunday in May
- Wednesday from 3pm to 5pm
- in the past 3 years
- every 2 minutes and 30 seconds
- from 3pm to 5pm
- the rest of the year
- sometime between 3pm and 5pm
- other than Wednesday
- less than 1 hour and 30 minutes
- sometime before Sept. 9, 1987
- sometime in 1987
- January to March and May 2007
- 8am, except Mondays 9 am
- the following month
- the second Tuesday after Easter
- two consecutive Sundays
- today
- three Mondays
- all Mondays in every May
- all Mondays in any May
- any Monday in every May
- the courses when the student has free time
- We met in July last year
- On Monday and Tuesday
- three months earlier
- from 3pm to 6pm
- They are leaving on vacation two weeks from next Tuesday.
- A major earthquake struck Los Angeles three years ago today.
- This year’s summer was unusually hot.
- Bacon tutored an English student some Thursdays in 1998.
- She spent the following twelve years in various health care positions around Minnesota.
- Ash Wednesday
- Easter Sunday
- Christmas Day
- Adnan Pachachi, a onetime foreign minister who returned to Iraq on May 6 after 33 years in exile...
- Yeah. I'm leaving on Monday and coming back on the thirtieth, so um

Types of temporal expressions:

- Fully specified temporal expressions, which provide all the information necessary in order to identify the point or period of time they are referring to; e.g., June 11, 1989, or the Summer of 2002.
- Underspecified temporal expressions, which require the use of some contextual information in order to interpret the point in time they are referring to; e.g., early in the morning, Monday, in recent days, few days ago, two weeks from next Tuesday, next September, the current month, last year, a decade ago.
- Durations, such as: three months and two years.
- DANTE is a rule-based system that consists of two main processing modules: a recognizer and an interpreter. The recognizer finds occurrences of temporal expressions in documents, determines their full extents in text, and analyses their local meaning to generate their LTIMEX values. Then, for each recognized temporal expression the interpreter determines its global semantic value.
- SUTIME is a rule-based temporal tagger built on regular expression patterns. Temporal expressions are bounded in their complexity, so many of them can be captured using finite automata. As shown by systems such as FASTUS (Hobbs et al., 1997), a cascade of finite automata can be very effective at extracting information from text. With SUTIME, we follow a similar staged strategy of (i) building up patterns over individual words to find numerical expressions; then (ii) using patterns over words and numerical expressions to find simple temporal expressions; and finally (iii) forming composite patterns over the discovered temporal expressions.
- From Conceptual Time to Linguistic Time
- Semantics of Time-Varying Information
- Reasoning across time and the syntacticization of semantics (has example sentences)
- TIMEN: An Open Temporal Expression Normalisation Resource
- Massively Increasing TIMEX3 Resources: A Transduction Approach
- TIDES 2005 Standard for the Annotation of Temporal Expressions
- ACE Time Normalization (TERN) 2004 English Evaluation Data V1.0
- Broad-Coverage Rule-Based Processing of Temporal Expressions (PhD Thesis from 2012, SOLID)
- A Pilot Study on Annotating Temporal Relations in Text
- A Time Calculus for Natural Language (TCNL)
- The Annotation of Temporal Information in Natural Language Sentences
- A Corpus-based Study of Temporal Signals
- The Language of Time (Book)
- Developing Language Processing Components with GATE - Version 7 (and JAPE)
- JAPE: Regular Expressions over Annotations
- FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text
- Named Entity Extraction from Speech
- SUTIME: A Library for Recognizing and Normalizing Time Expressions
- http://timeml.org/site/timebank/timebank.html (the tempeval2 download has great docs on how to parse events)
- Currently, the standard temporal annotation scheme is TimeML which includes a specification of the TIMEX3 standard.
- Explicit, absolute, or self-contained: These can be directly translated to a particular granularity date/time.
- Implicit, relative, or context-dependent: These need the document creation time (deictic) or a previously mentioned temporal reference/anchoring (anaphoric) to obtain an explicit date/time.
- Durative: Describing a bounded interval (or duration) that is not inherently anchored to a timeline.
- Set or frequency: Regularly recurring times, such as “every Christmas” or “each Tuesday”.
- Vague: generic mentions like “recently” or “today” in “today’s fashions”; see TIDES standard Section 4.6 (Ferro et al., 2005).
- timex / timexes
- temporal expressions
- Timex Normalisation Taxonomy
- "natural language" morphological corpus independent filetype:pdf
- DANTE (Detection and Normalisation of Temporal Expressions)
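
The three-stage strategy quoted above for SUTIME (numbers, then simple temporal expressions, then composites) can be sketched in a few lines. This is a toy illustration under stated assumptions, not SUTIME's actual rules: the number words, unit list, the single "ago" composite, and every function name here are hypothetical.

```javascript
// Hypothetical vocabulary for the sketch.
const NUMBER_WORDS = new Map([
  ['one', 1], ['two', 2], ['three', 3], ['four', 4], ['five', 5],
  ['six', 6], ['seven', 7], ['eight', 8], ['nine', 9], ['ten', 10],
]);
const UNIT = /^(second|minute|hour|day|week|month|year)s?$/;

// Stage (i): patterns over individual words to find numerical expressions.
function tagNumbers(tokens) {
  return tokens.map((text) => {
    const word = text.toLowerCase();
    if (/^\d+$/.test(word)) return { type: 'NUM', text, value: Number(word) };
    if (NUMBER_WORDS.has(word)) return { type: 'NUM', text, value: NUMBER_WORDS.get(word) };
    return { type: 'WORD', text };
  });
}

// Stage (ii): NUM followed by a unit word becomes a simple DURATION.
function findDurations(items) {
  const out = [];
  for (let i = 0; i < items.length; i++) {
    const cur = items[i];
    const next = items[i + 1];
    const unit = next && next.type === 'WORD' && UNIT.exec(next.text.toLowerCase());
    if (cur.type === 'NUM' && unit) {
      out.push({ type: 'DURATION', text: cur.text + ' ' + next.text, value: cur.value, unit: unit[1] });
      i++; // the unit word has been consumed
    } else {
      out.push(cur);
    }
  }
  return out;
}

// Stage (iii): DURATION followed by "ago" becomes a composite RELATIVE time.
function findRelativeTimes(items) {
  const out = [];
  for (let i = 0; i < items.length; i++) {
    const cur = items[i];
    const next = items[i + 1];
    if (cur.type === 'DURATION' && next && next.type === 'WORD' && next.text.toLowerCase() === 'ago') {
      out.push({ type: 'RELATIVE', text: cur.text + ' ' + next.text, value: -cur.value, unit: cur.unit });
      i++; // "ago" has been consumed
    } else {
      out.push(cur);
    }
  }
  return out;
}

// Run the cascade and keep only the temporal expressions.
function tagTimes(sentence) {
  const tokens = sentence.match(/[A-Za-z]+|\d+/g) || [];
  return findRelativeTimes(findDurations(tagNumbers(tokens)))
    .filter((item) => item.type === 'DURATION' || item.type === 'RELATIVE');
}

console.log(tagTimes('We got married three years ago'));
console.log(tagTimes('The movie is two hours long'));
```

Each stage only looks at the output of the previous one, so new rules can be added at one level without touching the others.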