OpenRefine reconciliation service for PeriodO data
The commands below must be typed into a terminal window.
Install the reconciler:
npm install -g periodo-reconciler
Download the PeriodO data (or some subset that you want to reconcile against) and put it somewhere.
Run the reconciliation server, giving it the path to where you put the PeriodO data:
You should see:
Reconciling against some/path/p0d.json
Reconciliation server running on http://localhost:8142
Now open OpenRefine.
Create Project > Choose Files and choose your CSV file.
Next > Create Project to finish creating the project.
Click on the down arrow in the header of the column you want to reconcile (the column with period names) and choose Reconcile > Start Reconciling.
You should see a modal dialog. Click Add Standard Service at the bottom.
Enter the URL http://localhost:8142 and click Add Service.
You should now see PeriodO reconciliation service selected under Services on the left. Click the tag icon next to Services to close this drawer.
Under the heading Also use relevant details from other columns, you can optionally choose columns that contain location names, start years, or end years to further constrain your reconciliation queries. For example, if you had place names in a column named Place, you would click the checkbox next to Place and enter the text Spatial coverage in the (autocompleting) text input next to the checkbox.
Click Start Reconciling.
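For readers curious about what OpenRefine sends when you click Start Reconciling, here is a minimal sketch of the batch query payload defined by the OpenRefine Reconciliation Service API. The column names, the sample values, and the property id `location` are illustrative assumptions; the ids this service actually accepts are not documented here, so check the service itself before relying on them.

```python
import json
from urllib.parse import urlencode


def build_queries(rows, term_column, property_columns):
    """Build a reconciliation `queries` batch from spreadsheet rows.

    `property_columns` maps a column name to a property id.
    The property ids used here are placeholders, not confirmed ids.
    """
    queries = {}
    for i, row in enumerate(rows):
        query = {"query": row[term_column]}
        # Only send properties for cells that actually hold a value.
        props = [
            {"pid": pid, "v": row[col]}
            for col, pid in property_columns.items()
            if row.get(col) not in (None, "")
        ]
        if props:
            query["properties"] = props
        queries[f"q{i}"] = query
    return queries


def encode_request(queries):
    """Form-encode the batch the way reconciliation clients submit it."""
    return urlencode({"queries": json.dumps(queries)})


# Hypothetical rows standing in for a CSV with Period and Place columns.
rows = [
    {"Period": "Late Cypriot III", "Place": "Cyprus"},
    {"Period": "Iron Age", "Place": ""},
]
batch = build_queries(rows, "Period", {"Place": "location"})
body = encode_request(batch)
```

The resulting `body` is what a client would post to the service URL you entered above; the sketch stops short of sending it so that it runs without a server.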
How reconciliation works
To reconcile a period term against PeriodO data, one needs
- the period term, e.g. Late Cypriot III,
- (optionally) a place name, e.g.
- (optionally) a single number denoting a Gregorian calendar year, e.g. -1200, or a pair of such numbers delimiting an interval, e.g.
Each of these pieces of data (if provided) is then reconciled against the appropriate fields of PeriodO period definitions. Each reconciliation (terminological, spatial, and temporal) produces a ranked list of matching periods. These ranked lists are then combined by taking their intersection, and combining the rankings using the Schulze method.
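The combination step can be sketched as follows. This is a minimal illustration of intersecting ranked lists and merging them with the Schulze method, not the reconciler's own code; the function name `combine_rankings`, the period ids, and the tie-breaking by first-list order are assumptions made for the example.

```python
from itertools import permutations


def combine_rankings(rankings):
    """Intersect ranked lists (best first) and merge them, Schulze-style."""
    if not rankings:
        return []
    # Keep only candidates that appear in every ranked list.
    common = set(rankings[0]).intersection(*rankings[1:])
    candidates = [c for c in rankings[0] if c in common]

    # d[a][b]: how many lists rank a above b.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ranking in rankings:
        position = {c: i for i, c in enumerate(ranking)}
        for a, b in permutations(candidates, 2):
            if position[a] < position[b]:
                d[a][b] += 1

    # p[a][b]: strength of the strongest path from a to b.
    p = {a: {b: 0 for b in candidates} for a in candidates}
    for a, b in permutations(candidates, 2):
        if d[a][b] > d[b][a]:
            p[a][b] = d[a][b]
    for k in candidates:
        for a in candidates:
            if a == k:
                continue
            for b in candidates:
                if b == k or b == a:
                    continue
                p[a][b] = max(p[a][b], min(p[a][k], p[k][b]))

    def wins(a):
        # Number of candidates that a beats on strongest-path strength.
        return sum(1 for b in candidates if b != a and p[a][b] > p[b][a])

    # Stable sort: ties keep the order of the first ranked list.
    return sorted(candidates, key=wins, reverse=True)


# Hypothetical period ids ranked by label, place, and time matches.
by_label = ["p1", "p2", "p3", "p4"]
by_place = ["p2", "p1", "p3"]
by_time = ["p2", "p3", "p1"]
combined = combine_rankings([by_label, by_place, by_time])
```

Here `p4` drops out because it appears in only one list, and `p2` comes first because two of the three lists prefer it to `p1` and all three prefer it to `p3`.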
Matching period terms against periods' labels
The period term is matched against the preferred and alternate labels of each period definition. If there are multiple tokens in the term (e.g. Late Cypriot III has three tokens), any of the tokens can match (i.e. it is a Boolean OR query). Period definitions with matches in the preferred label are ranked higher than ones with matches in the alternate labels.
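A rough sketch of this kind of OR matching is shown below. The record shape (`id`, `label`, `alt_labels`), the tokenizer, and the exact sort key are assumptions for illustration; the actual service may tokenize and score differently.

```python
import re


def tokens(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def match_labels(term, periods):
    """Return ids of periods sharing at least one token with the term.

    Matches in the preferred label always outrank matches found only
    in alternate labels.
    """
    query = tokens(term)
    scored = []
    for period in periods:
        preferred_hits = len(query & tokens(period["label"]))
        alt_hits = max(
            (len(query & tokens(alt)) for alt in period.get("alt_labels", [])),
            default=0,
        )
        if preferred_hits or alt_hits:  # Boolean OR: any token may match
            scored.append(((preferred_hits, alt_hits), period["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [period_id for _, period_id in scored]


# Hypothetical period definitions, not real PeriodO records.
periods = [
    {"id": "a", "label": "Late Cypriot III", "alt_labels": []},
    {"id": "b", "label": "Bronze Age", "alt_labels": ["Late Cypriot"]},
    {"id": "c", "label": "Iron Age", "alt_labels": ["Geometric"]},
    {"id": "d", "label": "Late Bronze Age", "alt_labels": []},
]
ranked = match_labels("Late Cypriot III", periods)
```

In this example `d` outranks `b` even though `b` matches two tokens, because `b` matches only in an alternate label.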
Matching place names against periods' spatial coverages
The place name (if provided) is matched against the spatial coverage description and linked spatial entity labels of each period definition. If there are multiple tokens in the place name, any of the tokens can match (i.e. it is a Boolean OR query).
Matching years or year ranges against periods' temporal extents
The year or year range (if provided) is matched against the temporal extent of each period definition. Single years match if they fall within the widest temporal extent for the period, plus one century on either side. Year ranges match to the extent that they overlap with the widest temporal extent for the period, plus one century on either side.
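The temporal rules above can be sketched as two small functions. The function names are hypothetical, and scoring a range by the fraction of it that overlaps the padded extent is an assumption; the text only says that ranges match "to the extent that they overlap".

```python
PADDING_YEARS = 100  # one century on either side of the period's extent


def year_matches(year, earliest, latest):
    """True if a single year falls within the padded temporal extent."""
    return earliest - PADDING_YEARS <= year <= latest + PADDING_YEARS


def range_overlap(start, stop, earliest, latest):
    """Fraction (0.0 to 1.0) of [start, stop] covered by the padded extent."""
    lo = max(start, earliest - PADDING_YEARS)
    hi = min(stop, latest + PADDING_YEARS)
    if hi < lo:
        return 0.0  # no overlap at all
    span = stop - start
    if span == 0:
        return 1.0  # a zero-length range behaves like a single year
    return (hi - lo) / span


# Hypothetical period whose widest extent runs from -1200 to -1050.
inside = year_matches(-1250, -1200, -1050)
outside = year_matches(-1400, -1200, -1050)
half = range_overlap(-1400, -1200, -1200, -1050)
```

With these illustrative numbers, -1250 matches because the padding widens the extent to -1300 through -950, while the range -1400 to -1200 overlaps that padded extent for half of its length.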