ibm-igc-lineage

0.6.1 • Public • Published

This package provides the following functionality:

  • Node.js module for interacting with lineage (documentation below)
  • Generic lineage flow extension hook (see documentation under src/com/ibm/iis/gov/services/README.md)
  • Sample code for producing extended lineage (see documentation under samples/README.md)

Node.js module

ibm-igc-lineage

Re-usable functions for handling lineage flow documents (XML) and operational metadata (OMD XML)

Meta

  • license: Apache-2.0

FlowHandler

FlowHandler class -- for handling IGC Flow Documents (XML)

Examples

// parses an XML flow document held in 'xmlString' as a string
var igclineage = require('ibm-igc-lineage');
var fh = new igclineage.FlowHandler();
fh.parseXML(xmlString);

parseXML

Parses an XML flow document

Parameters

getAssetName

Gets the name of an asset

Parameters

  • asset Asset

Returns string

getAssetRID

Gets the RID of an asset

Parameters

  • asset Asset

Returns string

getAssetById

Gets an asset by its unique flow XML ID (not RID)

Parameters

Returns Asset

getAssetNameById

Gets the name of an asset based on its unique flow XML ID (not RID)

Parameters

Returns string

getProjectNode

Gets the Transformation Project details

Returns Asset

getJobNode

Gets the Job details

Returns Asset
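
As a minimal sketch of how the accessors above might be combined, the helper below summarises the project and job of a flow document. The name `describeJob` is hypothetical (it is not part of the module), and the sketch assumes `fh` is a FlowHandler on which parseXML has already been called.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): summarises the
// Transformation Project and Job of a flow document.
// Assumes `fh` is a FlowHandler whose parseXML() has already been called.
function describeJob(fh) {
  const project = fh.getProjectNode();
  const job = fh.getJobNode();
  return {
    projectName: fh.getAssetName(project),
    jobName: fh.getAssetName(job),
    jobRID: fh.getAssetRID(job)
  };
}
```

Passing the handler in as an argument keeps the helper independent of how the flow document was obtained.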

createFlowUnit

Creates a new flowUnit

Parameters

  • flowType string DESIGN or SYSTEM
  • xmlIdOfProcessor string the internal XML flow doc ID of the processing routine (ETL job, etc)
  • comment string? an optional comment to include on the flow

getEntryFlows

Gets the details for ENTRY flows (data store-to-DataStage)

Returns FlowList

getExitFlows

Gets the details for EXIT flows (DataStage-to-data store)

Returns FlowList

getSystemFlows

Gets the details for INSIDE flows (DataStage-to-DataStage)

Returns FlowList

getDesignFlows

Gets the details of DESIGN flows

Returns FlowList

getSubflows

Gets all of the subflows from a set of flows

Parameters

  • flows FlowList the set of flows for which to get subflows

Returns FlowList the subflows

getSubflowBySourceId

Gets a specific subflow based on its source

Parameters

  • flows FlowList the set of flows from which to get the subflow
  • sourceId string the sourceID of the subflow

Returns Flow the subflow

getSubflowsByTargetId

Gets a specific subflow based on its target

Parameters

  • flows FlowList the set of flows from which to get the subflow
  • targetId string the targetID of the subflow

Returns Flow the subflow
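
The flow accessors and getSubflows can be chained. The following is only a sketch: `collectSubflows` is a hypothetical helper name, and `fh` is assumed to be a FlowHandler that has already parsed a flow document. The structure of the returned FlowList objects is not described here, so the sketch passes them through untouched.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): gathers the subflows
// for each kind of flow exposed by a FlowHandler.
// Assumes `fh` has already parsed a flow document.
function collectSubflows(fh) {
  return {
    entry: fh.getSubflows(fh.getEntryFlows()),
    exit: fh.getSubflows(fh.getExitFlows()),
    system: fh.getSubflows(fh.getSystemFlows()),
    design: fh.getSubflows(fh.getDesignFlows())
  };
}
```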

getParentAssetId

Gets the ID of the parent (reference) of the provided asset

Parameters

  • asset Asset

Returns string

getRepositoryIdFromDSSourceId

Gets the ID of the source repository that is mapped to the provided DataStage target

Parameters

  • entryFlows FlowList the set of ENTRY flows
  • DSSourceId string the DataStage target (targetID) of the ENTRY flow

Returns string the mapped source repository (sourceID) of the ENTRY flow

getRepositoryIdFromDSTargetId

Gets the ID of the target repository that is mapped from the provided DataStage source

Parameters

  • exitFlows FlowList the set of EXIT flows
  • DSTargetId string the DataStage source (sourceID) of the EXIT flow

Returns string the mapped target repository (targetID) of the EXIT flow
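
A possible way to use the two repository lookups together is sketched below. `resolveRepositoryIds` is a hypothetical helper, and `fh` is assumed to be a FlowHandler that has already parsed a flow document; the two ID arguments are the DataStage-side IDs described in the parameter lists above.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): resolves the data store
// IDs on either side of a DataStage job from its ENTRY and EXIT flows.
function resolveRepositoryIds(fh, dsSourceId, dsTargetId) {
  return {
    // ENTRY flows map a source repository to a DataStage ID
    sourceRepositoryId: fh.getRepositoryIdFromDSSourceId(fh.getEntryFlows(), dsSourceId),
    // EXIT flows map a DataStage ID to a target repository
    targetRepositoryId: fh.getRepositoryIdFromDSTargetId(fh.getExitFlows(), dsTargetId)
  };
}
```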

getTableIdentity

Gets the identity string (externalID) for the provided database table

Parameters

  • tblName string the name of the database table
  • schemaId string the ID of the parent database schema

Returns string

getColumnIdentity

Gets the identity string (externalID) for the provided database column

Parameters

  • colName string the name of the database column
  • tableId string the ID of the parent database table

Returns string

getColumnIdentityFromTableIdentity

Gets the database column identity string (externalID) from an existing database table identity string

Parameters

  • colName string the name of the database column
  • tableIdentity string the identity string (externalID) of the parent database table

Returns string
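
The identity functions can be combined to build identities for a table and several of its columns in one pass. This is a sketch only: `identitiesForTable` is a hypothetical helper, `fh` is assumed to be a FlowHandler instance, and nothing is assumed about the format of the returned identity strings.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): builds the identity
// string of a table and of each named column within it.
function identitiesForTable(fh, tblName, schemaId, colNames) {
  const tableIdentity = fh.getTableIdentity(tblName, schemaId);
  const columns = {};
  colNames.forEach(function (colName) {
    // Derive each column identity from the table identity computed above
    columns[colName] = fh.getColumnIdentityFromTableIdentity(colName, tableIdentity);
  });
  return { table: tableIdentity, columns: columns };
}
```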

addAsset

Adds an asset to the flow XML

Parameters

  • className string the classname of the data type of the asset (e.g. ASCLModel.DatabaseField)
  • name string the name of the asset
  • rid string the RID of the asset, or a virtual identity (externalID)
  • xmlId string the unique ID of the asset within the XML flow document
  • matchByName string should be one of ['true', 'false']
  • virtualOnly string should be one of ['true', 'false']
  • parentType string? the classname of the asset's parent data type (e.g. ASCLModel.DatabaseTable)
  • parentId string? the unique ID of the asset's parent within the XML flow document
  • additionalAttrs Array<Object>? any extra attributes to set on the asset, each element of the array being { name: "NameOfAttr", value: "ValueOfAttr" }
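
Because matchByName and virtualOnly are passed as the strings 'true' or 'false' rather than as booleans, a thin wrapper can reduce mistakes. The sketch below is illustrative only: `addColumnAsset` and the shape of its `column` argument are hypothetical, and the class names are the examples given in the parameter list above.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): adds a database column
// asset, converting boolean flags to the 'true' / 'false' strings that
// addAsset expects. The `column` object shape is an assumption of this sketch.
function addColumnAsset(fh, column) {
  fh.addAsset(
    'ASCLModel.DatabaseField',           // className
    column.name,                         // name
    column.rid,                          // RID or virtual identity (externalID)
    column.xmlId,                        // unique ID within the flow document
    String(Boolean(column.matchByName)), // 'true' or 'false'
    String(Boolean(column.virtualOnly)), // 'true' or 'false'
    'ASCLModel.DatabaseTable',           // parentType
    column.parentXmlId,                  // parentId
    column.additionalAttrs               // optional: [{ name: ..., value: ... }]
  );
}
```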

addFlow

Adds a flow to the flow XML

Parameters

  • flowsSection FlowList the flows area into which to add the flow
  • existingFlow Flow? an existing flow to update or replace
  • sourceIDs string the sourceIDs to use in the flow mapping
  • targetIDs string the targetIDs to use in the flow mapping
  • comment string the comment to add to the flow mapping
  • bReplace boolean true if any existing flow should be replaced, false if the mappings should be appended

getCustomisedXML

Retrieves the flow XML, including any modifications that have been made (added assets, flows)

Returns string the full XML of the flow document
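
A typical modification cycle might add a flow and then retrieve the updated XML. The sketch below shows one such cycle; `addMappingAndSerialise` and the shape of its `mapping` argument are hypothetical, and passing `null` when there is no existing flow is an assumption rather than documented behaviour.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): adds one flow mapping
// to the given flows section and returns the resulting flow XML.
// Assumption: `null` is acceptable for the optional existingFlow argument.
function addMappingAndSerialise(fh, flowsSection, mapping) {
  fh.addFlow(
    flowsSection,
    mapping.existingFlow || null, // optional existing flow to update or replace
    mapping.sourceIDs,
    mapping.targetIDs,
    mapping.comment || '',
    Boolean(mapping.replace)      // true replaces, false appends mappings
  );
  return fh.getCustomisedXML();
}
```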

OMDHandler

OMDHandler class -- for handling IGC run-time, operational metadata documents (OMD XML)

Examples

// parses an operational metadata XML document held in 'xmlString' as a string
var igclineage = require('ibm-igc-lineage');
var omd = new igclineage.OMDHandler();
omd.parseOMD(xmlString);

parseOMD

Parses an operational metadata (OMD) XML document

Parameters

getRunMessage

Gets the information message resulting from the execution of the job that produced this operational metadata

Returns string

getRunStatus

Gets the status code from the execution of the job that produced this operational metadata

Returns string
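
The two run accessors can be read together once a document has been parsed. In this sketch `runSummary` is a hypothetical helper and `omd` is assumed to be an OMDHandler on which parseOMD has already been called; no particular status code values are assumed.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): collects the outcome
// of the job run described by an operational metadata document.
function runSummary(omd) {
  return {
    status: omd.getRunStatus(),
    message: omd.getRunMessage()
  };
}
```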

getDesign

Gets the details for the operational metadata job's design

Returns SoftwareResourceLocator

getExecutable

Gets the details for the operational metadata job's executable

Returns SoftwareResourceLocator

getReadEvent

Gets the details for OMD Read Event (data movement)

Returns Event

getWriteEvent

Gets the details for OMD Write Event (data movement)

Returns Event

getRowCount

Gets the number of records processed by the event

Parameters

Returns int

getDataResourceForEvent

Gets the data resource (table-level details) processed by the event

Parameters

Returns DataResourceLocator

getDataCollectionForEvent

Gets the data collection (column-level details) processed by the event

Parameters

Returns DataCollection

getDataResourceHost

Gets the hostname of the data resource

Parameters

  • dataResource DataResourceLocator

Returns string

getDataResourceStore

Gets the data store name of the data resource

Parameters

  • dataResource DataResourceLocator

Returns string

getDataResourceSchema

Gets the schema of the data resource

Parameters

  • dataResource DataResourceLocator

Returns string

getDataResourceTable

Gets the table name of the data resource

Parameters

  • dataResource DataResourceLocator

Returns string

getDataResourceIdentity

Gets the full identity string (::-delimited) of the data resource

Parameters

  • dataResource DataResourceLocator

Returns string

getDataCollectionColumns

Gets an array of all column names within the data collection

Parameters

  • dataCollection DataCollection

Returns Array<string>
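
Given a data resource and a data collection already retrieved for an event, the accessors above can be gathered into a single description. This is a sketch under assumptions: `describeEventData` is a hypothetical helper, `omd` is an OMDHandler instance, and the caller supplies the `dataResource` and `dataCollection` objects.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): flattens the table-level
// and column-level details of an event into one plain object.
function describeEventData(omd, dataResource, dataCollection) {
  return {
    host: omd.getDataResourceHost(dataResource),
    store: omd.getDataResourceStore(dataResource),
    schema: omd.getDataResourceSchema(dataResource),
    table: omd.getDataResourceTable(dataResource),
    identity: omd.getDataResourceIdentity(dataResource),
    columns: omd.getDataCollectionColumns(dataCollection)
  };
}
```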

replaceHostname

Replaces every occurrence of the hostname in the operational metadata, so that it can be loaded into a target environment

Parameters

  • targetHostname string the engine tier hostname of the target environment (where the operational metadata is to be loaded)

getUniqueRuntimeIdentity

Returns a unique identity object for the runtime information received: a set of parameters that, taken together, could be used to uniquely identify an object in IGC's lineage

Returns any Object

getCustomisedOMD

Retrieves the operational metadata XML, including any modifications that have been made (i.e. replaced hostnames)

Returns string the full XML of the operational metadata
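
Moving operational metadata between environments might therefore look like the sketch below: parse, replace the hostname, then retrieve the modified XML. `retargetOMD` is a hypothetical helper name, and `omd` is assumed to be a freshly constructed OMDHandler.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): rewrites the hostname in
// an operational metadata document and returns the modified XML.
function retargetOMD(omd, xmlString, targetHostname) {
  omd.parseOMD(xmlString);             // load the original document
  omd.replaceHostname(targetHostname); // point it at the target engine tier
  return omd.getCustomisedOMD();       // full XML including the replacement
}
```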

LineageWorkbook

LineageWorkbook class -- for manually capturing information about data lineage

loadFromFile

Loads workbook from the provided XLSX file

Parameters

addValidationsToLineageSheet

Adds entry-assistance (drop-down list) validations to the lineage sheet. Note: this should only be done after populating the workbook with existing assets.

populateWithExistingAssets

Populates the lineage workbook with existing assets from an environment

Parameters

  • igcrest ibm-igc-rest the instantiation of an ibm-igc-rest object, with connection already configured
  • callback completeCallback callback that returns once population is completed
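
Since validations should only be added once population has finished, the two calls can be sequenced through the callback. The following is a sketch: `populateThenValidate` is a hypothetical helper, `wb` is assumed to be a LineageWorkbook instance, and `igcrest` an already configured ibm-igc-rest object.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): populates the workbook
// and only then adds the drop-down validations.
// `done` receives an error message, or null if everything succeeded.
function populateThenValidate(wb, igcrest, done) {
  wb.populateWithExistingAssets(igcrest, function (errorMessage) {
    if (errorMessage) {
      done(errorMessage); // skip validations if population failed
      return;
    }
    wb.addValidationsToLineageSheet();
    done(null);
  });
}
```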

writeTemplate

Writes the template out to the specified file

Parameters

generateFlowXML

Generates a flow XML document that contains the lineage definitions of this workbook

Returns string XML flow document representation of the lineage definitions in the workbook

uploadFlowXMLToIGC

Uploads the lineage flow XML for the workbook to IGC

Parameters

  • igcrest ibm-igc-rest the instantiation of an ibm-igc-rest object, with connection already configured
  • callback completeCallback

writeFlowXML

Writes the lineage flow XML for the workbook out to the specified file

Parameters

completeCallback

This callback is invoked as the result of work completing, providing a status.

Type: Function

Parameters

  • errorMessage string any error message, or null if no errors
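
For callers who prefer Promises, a completeCallback-style method can be wrapped as sketched below. `withCompleteCallback` is a hypothetical helper and not part of the module; it relies only on the convention above that the callback receives an error message, or null if there were no errors.

```javascript
// Hypothetical helper (not part of ibm-igc-lineage): adapts a function that
// takes a completeCallback into a Promise.
// `invoke` is called with the callback to hand to the underlying method.
function withCompleteCallback(invoke) {
  return new Promise(function (resolve, reject) {
    invoke(function (errorMessage) {
      if (errorMessage) {
        reject(new Error(errorMessage));
      } else {
        resolve();
      }
    });
  });
}

// Possible usage, assuming `wb` is a LineageWorkbook and `igcrest` is a
// configured ibm-igc-rest object:
//   withCompleteCallback(function (cb) { wb.uploadFlowXMLToIGC(igcrest, cb); })
//     .then(function () { console.log('uploaded'); });
```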

AssetTypeFactory

AssetTypeFactory class -- for encapsulating information about asset types

Install

npm i ibm-igc-lineage

Collaborators

  • cgrote