This package has been deprecated

Author message:

Use @natlibfi/marc-record-merge instead

marc-record-merge

3.0.1 • Public • Published

MARC record merge

A configurable JavaScript module for merging MARC records.

Installation

Clone the sources and, in the source directory, install the package from the command line using npm:

npm install

Testing

Run the following NPM script to lint, test and check coverage of the code:

 
npm run check
 

Usage

The module returns a factory function that takes a configuration object as the first (mandatory) argument. The second (optional) argument is an object specifying plugin functions. The factory returns a function that takes two MARC records (instances of marc-record-js) as arguments. The first one is the preferred record, which is used as the base record (all fields are taken from this record unless otherwise specified).

Node.js

var mergeRecords = require('marc-record-merge')(config),
record_merged = mergeRecords(record_preferred, record_other);

AMD

define(['marc-record-merge'], function(mergeFactory) {
 
  var mergeRecords = mergeFactory(config),
  record_merged = mergeRecords(record_preferred, record_other);
 
});

Browser globals

var mergeRecords = mergeMarcRecordsFactory(config),
record_merged = mergeRecords(record_preferred, record_other);

Configuration

The configuration object is a document conforming to the schema. The document looks like this:

{
  "fields": {
    "005": {
      "action": "controlfield"
    },
    "7..": {
      "action": "copy",
      "options": {
        "compareWithoutIndicators": true
      }
    }
  }
}
 

Each property of fields is a MARC field name or pattern (for example ..5 or 700). The value of the property is an object which must contain an action property. Action-specific options are defined in the options property.

The specified action is executed for each field in the other record that matches the field tag pattern.
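
As an illustration of how a tag pattern such as 7.. could select fields, the sketch below treats each key of fields as an anchored regular expression. This is an assumption about the matching semantics, not the module's confirmed implementation, and findFieldConfig is a hypothetical helper that is not part of the package.

```javascript
// Hypothetical helper: find the configuration entry whose pattern matches a tag.
// Assumption: patterns are interpreted as anchored regular expressions,
// so "7.." matches "700", "710", "776" and so on.
function findFieldConfig(fields, tag) {
  var patterns = Object.keys(fields);
  for (var i = 0; i < patterns.length; i++) {
    if (new RegExp('^' + patterns[i] + '$').test(tag)) {
      return fields[patterns[i]];
    }
  }
  return undefined; // no action configured for this tag
}

var config = {
  fields: {
    '005': { action: 'controlfield' },
    '7..': { action: 'copy', options: { compareWithoutIndicators: true } }
  }
};
```

Under these assumptions, findFieldConfig(config.fields, '700') yields the copy entry, while a tag such as 245 has no configured action.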

Predefined actions

controlfield: Copy missing control fields from the other record.

copy: Copy fields from the other record. The following options are supported:

  • mustBeIdentical: A boolean determining whether all subfields must be identical
  • compareWithoutIndicators: A boolean determining whether field indicators must be identical
  • compareWithout: An array of subfield codes. These subfields are filtered out from the comparison
  • combine: An array of subfield codes. These subfields will be combined into a single subfield
  • pick: Include subfields from the field that is not preserved.
    • subfields: An array of subfield codes (Mandatory)
    • missingOnly: A boolean determining whether only subfields missing from the target field should be picked
  • transformOnInequality: An object describing how to transform an unequal field. For this option to take effect, the preferred record must have at least one field with the same tag name as the other record. The following properties are supported:
    • tag: Tag name of the new field (Mandatory)
    • drop: An array of subfield codes. These subfields are not included in the new field.
    • add: An object with subfield codes as keys and subfield values as values.
    • map: An object with new subfield codes as keys and the old subfield codes as values.

The action copies only fields that have no match in the preferred record:

  1. Attempt to find a corresponding field from the preferred record
    1. Attempt to find an identical field (All options are applied to comparison)
      • Normalize fields (See Field normalization)
      • Tag names must be identical
      • Indicators must be identical
      • Values must be identical
        • Variable fields: Each subfield must have a matching subfield in the opposite record (Code and value identical).
        • Control fields: The values must be identical
    2. Attempt to find a similar field if an identical field was not found (All options are applied to comparison)
      • Normalize fields (See Field normalization)
      • Tag names must be identical
      • Indicators must be identical
      • Values comparison
        • Variable fields: Either field's subfields must be a subset of the opposite subfields (There is a match for all of the opposite field's subfields)
        • Control fields: The values must be identical
  2. Copy or do nothing
    1. No corresponding field was found
      1. If no fields in the preferred record with the same tag name were found, copy the field
      2. If fields with the same tag were found and option transformOnInequality is enabled, copy the other field using the specified transformations. Otherwise do nothing.
    2. Corresponding field was found. Check if the other field is deemed different and should be copied to the merged record (All options are applied to comparison)
      1. Normalize fields
      2. Check if the other field is a "proper" subset of the preferred field (Preferred field contains all of the other field's subfields and more)
        • If it is, copy the field
        • If it's not, keep the preferred field
        • In both cases, merge the subfields that were not included in comparison by removing identical subfields
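
The difference between the "identical" and "similar" comparisons described above can be sketched as follows. This is a simplified illustration, not the module's code: it assumes a variable field has the shape { tag, ind1, ind2, subfields: [{ code, value }] }, it assumes that enabling compareWithoutIndicators means indicators are ignored, and it leaves out normalization and the other options. All function names are hypothetical.

```javascript
// True when `candidates` contains a subfield with the same code and value.
function hasMatch(subfield, candidates) {
  return candidates.some(function (c) {
    return c.code === subfield.code && c.value === subfield.value;
  });
}

// True when every subfield of `a` has a match in `b`.
function isSubset(a, b) {
  return a.subfields.every(function (sf) {
    return hasMatch(sf, b.subfields);
  });
}

// Assumption: compareWithoutIndicators = true skips the indicator check.
function sameTagAndIndicators(a, b, options) {
  if (a.tag !== b.tag) return false;
  if (options && options.compareWithoutIndicators) return true;
  return a.ind1 === b.ind1 && a.ind2 === b.ind2;
}

// Identical: each subfield has a matching subfield in the opposite field.
function isIdentical(a, b, options) {
  return sameTagAndIndicators(a, b, options) && isSubset(a, b) && isSubset(b, a);
}

// Similar: either field's subfields are a subset of the opposite subfields.
function isSimilar(a, b, options) {
  return sameTagAndIndicators(a, b, options) && (isSubset(a, b) || isSubset(b, a));
}

var preferredField = {
  tag: '700', ind1: '1', ind2: ' ',
  subfields: [{ code: 'a', value: 'Doe, John' }, { code: 'd', value: '1950-' }]
};
var otherField = {
  tag: '700', ind1: '1', ind2: ' ',
  subfields: [{ code: 'a', value: 'Doe, John' }]
};
```

Here otherField is not identical to preferredField (the d subfield has no counterpart) but it is similar, because all of its subfields appear in preferredField.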

selectBetter: Selects the "better" of the two fields of each record. Cannot be used if the tag has multiple fields. The following options are supported:

  • requireFieldInBoth: A boolean determining whether the field must exist in both records to make changes
  • onlyIfMissing: A boolean determining whether the field will be selected from the other record only if it is missing from the preferred record
  • skipOnMultiple: A boolean determining whether to skip the action (and keep the preferred fields) if there are multiple fields with the same tag. Default behavior is to fail the processing
  • pick: Include subfields from the field that is not preserved.
    • subfields: An array of subfield codes (Mandatory)
    • missingOnly: A boolean determining whether only subfields missing from the target field should be picked
  • comparator: A subfield comparator function name

The better field is selected as follows:

  1. Normalize fields
  2. Check if both fields' subfields are considered equal (Using the comparator function)
  3. If equal, select the field that gets the most points (Fields get points for each subfield that has more characters than the corresponding subfield in the opposite field)
  4. If not equal, check if the other field is a "proper" subset of the preferred field (Preferred field contains all of the other field's subfields and more)
  5. If the other field is a subset of the preferred field, select the other field
  6. Otherwise select the preferred field
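
The selection steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the module's implementation: subfields are assumed to be { code, value } objects, fields are assumed to be already normalized, the same comparator is used for the subset check, and a tie in points keeps the preferred field. The function names are hypothetical.

```javascript
// Comparator in the spirit of the predefined "substring" comparator.
function substring(sf1, sf2) {
  return sf1.code === sf2.code &&
    (sf1.value.indexOf(sf2.value) >= 0 || sf2.value.indexOf(sf1.value) >= 0);
}

// First subfield of `field` that the comparator deems equal to `sf`, if any.
function findCounterpart(sf, field, comparator) {
  return field.subfields.filter(function (c) { return comparator(sf, c); })[0];
}

// True when every subfield of `a` has a counterpart in `b`.
function allMatched(a, b, comparator) {
  return a.subfields.every(function (sf) {
    return findCounterpart(sf, b, comparator) !== undefined;
  });
}

// One point for each subfield that is longer than its counterpart.
function points(a, b, comparator) {
  return a.subfields.reduce(function (sum, sf) {
    var counterpart = findCounterpart(sf, b, comparator);
    return sum + (counterpart && sf.value.length > counterpart.value.length ? 1 : 0);
  }, 0);
}

function selectBetter(preferred, other, comparator) {
  var equal = allMatched(preferred, other, comparator) &&
    allMatched(other, preferred, comparator);
  if (equal) {
    // Assumption: on a tie the preferred field is kept.
    return points(other, preferred, comparator) > points(preferred, other, comparator)
      ? other : preferred;
  }
  // "Proper" subset: preferred has all of other's subfields and more.
  var properSubset = allMatched(other, preferred, comparator) &&
    other.subfields.length < preferred.subfields.length;
  return properSubset ? other : preferred;
}
```

For example, with a preferred subfield a of "Title" and an other subfield a of "Title of the book", the fields are equal under the substring comparator and the other field wins on points because its value is longer.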

Field normalization

Field values (Variable field subfields or control field value) are normalized as follows:

  1. String is converted to lower case
  2. Punctuation (See the function removePunctuation in lib/main.js for the characters replaced) is replaced with whitespace. The resulting string is trimmed (whitespace removed from both ends) and consecutive whitespace is collapsed into a single space.
  3. Diacritics are replaced with their corresponding ASCII characters (See the variable DIACRITICS_REMOVAL_MAP in lib/main.js for mapping)
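
A minimal sketch of these three steps is shown below. The punctuation character class and the diacritics handling are assumptions: the actual character list lives in removePunctuation, and the module uses DIACRITICS_REMOVAL_MAP rather than the Unicode decomposition swapped in here. The name normalizeValue is hypothetical.

```javascript
// Assumption: this character class stands in for removePunctuation in lib/main.js.
var PUNCTUATION = /[.,:;!?()\[\]{}"'\/\\-]/g;

function normalizeValue(value) {
  return value
    .toLowerCase()                    // 1. lower case
    .replace(PUNCTUATION, ' ')        // 2. punctuation becomes whitespace
    .trim()                           //    trim both ends
    .replace(/\s+/g, ' ')             //    collapse runs of whitespace
    // 3. Stand-in for DIACRITICS_REMOVAL_MAP: decompose characters and
    //    drop the combining marks.
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '');
}
```

With this sketch, "  Sibelius, Jean.  " normalizes to "sibelius jean".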

Predefined comparators

  • substring: Require both subfields to have equal codes and that at least one value is a substring of the other subfield's value
  • equality: Require both subfields to have strictly equal codes and values

Plugins

The second argument to the factory function is an optional object specifying plugin functions. The object can have one or both of the following properties:

actions

An object with action names as keys and functions as values. The action is a function which alters the merged record. It has the following signature:

function (record_merge, field_other, options, return_details) {}

Parameters:

  • record_merge: The merged record which is to be modified (Based on the preferred record)
  • field_other: The field from the other record which is the subject of the action
  • options: Action specific options
  • return_details: If this is defined, return action specific details about processing

The function's return value is not used unless return_details is defined.
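
As a sketch of what a custom action might look like, the hypothetical copyIfTagMissing below appends the other field only when the merged record has no field with that tag. It assumes the record exposes its fields as an array in a fields property, which should be checked against marc-record-js; the details object it returns is likewise invented for illustration.

```javascript
// Hypothetical action plugin following the documented signature.
// Assumption: record_merge.fields is an array of field objects with a `tag`.
function copyIfTagMissing(record_merge, field_other, options, return_details) {
  var exists = record_merge.fields.some(function (field) {
    return field.tag === field_other.tag;
  });
  if (!exists) {
    // Copy the field so the other record is not shared by reference.
    record_merge.fields.push(JSON.parse(JSON.stringify(field_other)));
  }
  if (return_details) {
    return { copied: !exists }; // illustrative details object
  }
}

// Plugin object to pass as the second argument of the factory.
var plugins = { actions: { copyIfTagMissing: copyIfTagMissing } };
```

The plugins object would be passed as the second argument to the factory, and a field entry in the configuration would presumably refer to the action by its key name.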

comparators

An object with comparator names as keys and functions as values. The comparator is a function which returns a boolean denoting whether two subfields are deemed equal. It has the following signature:

function (subfield1, subfield2) {}
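
For illustration, a custom comparator that ignores letter case could look like the sketch below. The name caseInsensitive is hypothetical, and subfields are assumed to be { code, value } objects.

```javascript
// Hypothetical comparator plugin: equal codes, values equal ignoring case.
function caseInsensitive(subfield1, subfield2) {
  return subfield1.code === subfield2.code &&
    subfield1.value.toLowerCase() === subfield2.value.toLowerCase();
}

// Plugin object to pass as the second argument of the factory.
var plugins = { comparators: { caseInsensitive: caseInsensitive } };
```

A selectBetter configuration could then name it in its comparator option, for example "comparator": "caseInsensitive".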

Sorting

By default, new fields are added after the similar fields. This can be changed with the sort property:

{
  "fields": {
    "020": {
      "action": "copy"
    }
  },
  "sort": {
    "insert": "before",
    "indexes": {
      "CAT": 995
    }
  }
}

Properties

  • insert: Defines whether new fields are inserted before or after similar fields. Defaults to after.
  • indexes: An object with field tag patterns as keys and static sort indexes as values. By default, new fields are inserted by the tag's numeric index (If similar fields don't exist)
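
One way to picture how a sort index could be resolved for a new field is sketched below. It assumes that the keys of indexes are matched as anchored regular expressions and that the fallback is the tag parsed as a number; both are assumptions, and sortIndex is a hypothetical helper, not a function of the module.

```javascript
// Hypothetical helper: resolve the sort index for a tag.
// Assumption: `indexes` keys are anchored regular expression patterns.
function sortIndex(tag, indexes) {
  var patterns = Object.keys(indexes || {});
  for (var i = 0; i < patterns.length; i++) {
    if (new RegExp('^' + patterns[i] + '$').test(tag)) {
      return indexes[patterns[i]]; // static index from the configuration
    }
  }
  return parseInt(tag, 10); // default: the tag's numeric value
}

var sortConfig = { insert: 'before', indexes: { CAT: 995 } };
```

With the configuration above, a CAT field would sort as 995, while an 020 field would fall back to its numeric value 20.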

License and copyright

Copyright (c) 2015-2017 University Of Helsinki (The National Library Of Finland)

This project's source code is licensed under the terms of GNU Affero General Public License Version 3.
