ibm-ia-rest

Re-usable functions for interacting with Information Analyzer's REST API

Examples

// Runs column analysis for every data source in the "Automated Profiling" project
// whose analysis results are older than the moment the script is run (new Date())
var iarest = require('ibm-ia-rest');
var commons = require('ibm-iis-commons');
var restConnect = new commons.RestConnection("isadmin", "isadmin", "hostname", "9445");
iarest.setConnection(restConnect);
iarest.getStaleAnalysisResults("Automated Profiling", new Date(), function(errStale, projectRID, aStaleSources) {
  iarest.runColumnAnalysisForDataSources(projectRID, aStaleSources, function(errExec, tamsAnalyzed) {
    // Note that the API returns async; if you want to busy-wait you need to poll events on Kafka
  });
});

Meta

license: Apache-2.0

setConnection

Set the connection for the REST API

Parameters

restConnect RestConnection RestConnection object, from ibm-iis-commons

makeRequest

Make a request against IA's REST API

Parameters

method string type of request, one of ['GET', 'PUT', 'POST', 'DELETE']
path string the path to the end-point (e.g. /ibm/iis/ia/api/...)
input string? any input for the request, i.e. for PUT, POST
inputType string? the type of input, if any provided ['text/xml', 'application/json']
callback requestCallback callback that handles the response
Throws any will throw an error if connectivity details are incomplete or there is a fatal error during the request

getAllItemsToIgnore

Retrieves a list of all items that should be ignored, i.e. where they are labelled with "Information Analyzer Ignore List"

Parameters

callback itemsToIgnoreCallback

addIADBToIgnoreList

Adds the IADB schema to a list of objects for Information Analyzer to ignore (to prevent them being added to projects or being analysed); this is accomplished by creating a label 'Information Analyzer Ignore List'

Parameters

callback requestCallback callback that handles the response

createOrUpdateAnalysisProject

Create or update an analysis project to include ALL objects known to IGC that were updated after the date received. A project must exist before any tasks can be executed.

Parameters

name string name of the project
description string description of the project
updatedAfter Date? include in the project any objects in IGC last updated after this date
callback requestCallback callback that handles the response

getProjectList

Get a list of Information Analyzer projects

Parameters

callback listCallback callback that handles the response

getProjectDataSourceList

Get a list of all of the data sources in the specified Information Analyzer project

Parameters

projectName string
callback dataSourceListCallback callback that handles the response (will be entries with HOST||DB.SCHEMA.TABLE and HOST||PATH:FILE)
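
The entries handed to the callback are plain strings, so it can help to split them into their parts before using them. The following is a minimal sketch of such a parser. parseDataSourceEntry is a hypothetical helper, not part of ibm-ia-rest, and it assumes that "||" separates the host from the rest and that only file entries contain a ":" after the host.

```javascript
// Hypothetical helper (not part of ibm-ia-rest).
// Assumes entries look like "HOST||DB.SCHEMA.TABLE" or "HOST||PATH:FILE",
// and that a ":" after the host only ever appears in file entries.
function parseDataSourceEntry(entry) {
  var sep = entry.indexOf('||');
  if (sep < 0) {
    throw new Error('Unrecognised data source entry: ' + entry);
  }
  var host = entry.slice(0, sep);
  var rest = entry.slice(sep + 2);

  // File entries: everything before the last ":" is the path
  var colon = rest.lastIndexOf(':');
  if (colon >= 0) {
    return {
      type: 'file',
      host: host,
      path: rest.slice(0, colon),
      file: rest.slice(colon + 1)
    };
  }

  // Table entries: DB.SCHEMA.TABLE (any further dots are kept in the table name)
  var parts = rest.split('.');
  if (parts.length < 3) {
    throw new Error('Unrecognised data source entry: ' + entry);
  }
  return {
    type: 'table',
    host: host,
    database: parts[0],
    schema: parts[1],
    table: parts.slice(2).join('.')
  };
}
```

The host, database, path and file names used to exercise this helper are made up; real entries may contain characters that this sketch does not anticipate.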

runColumnAnalysisForDataSources

Run a full column analysis against the list of data sources specified (based on TAM RIDs)

Parameters

projectRID string the RID of the project in which to execute the analysis
aDataSources Array<Object> an array of data sources, as returned by getProjectDataSourceList
callback columnAnalysisCallback callback that handles the response

publishResultsForDataSources

Publish analysis results for the list of data sources specified

Parameters

projectRID string RID of the IA project
aTAMs Array<string> an array of TAM RIDs whose analysis should be published
callback requestCallback callback that handles the response

getStaleAnalysisResults

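Retrieve previously published analysis results and determine which data sources in the project are stale, i.e. have not been analyzed since the specified time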

Parameters

projectName string name of the IA project
timeToConsiderStale Date the time before which any analysis results should be considered stale
callback staleAnalysisCallback callback that handles the response
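
The two-step flow from the Examples section (find stale sources, then analyze them) can be wrapped with error handling along these lines. This is only a sketch: analyzeStaleSources is a hypothetical wrapper, the client is passed in as a parameter so that any object exposing the two documented functions will do, and skipping the analysis when nothing is stale is an assumption about what a caller would want.

```javascript
// Hypothetical wrapper (not part of ibm-ia-rest).
// "client" is expected to expose getStaleAnalysisResults and
// runColumnAnalysisForDataSources with the signatures documented here.
function analyzeStaleSources(client, projectName, cutoff, done) {
  client.getStaleAnalysisResults(projectName, cutoff, function (errStale, projectRID, aStaleSources) {
    if (errStale) {
      return done(errStale);
    }
    // Assumption: with nothing stale there is nothing to run
    if (!aStaleSources || aStaleSources.length === 0) {
      return done(null, {}, null);
    }
    client.runColumnAnalysisForDataSources(projectRID, aStaleSources, function (errExec, tamsToSources, schedule) {
      if (errExec) {
        return done(errExec);
      }
      // The analysis itself continues asynchronously on the server
      done(null, tamsToSources, schedule);
    });
  });
}
```

With the real module, the call would presumably look like analyzeStaleSources(iarest, "Automated Profiling", new Date(), callback) once iarest.setConnection has been called.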

reindexThinClient

Issues a request to reindex Solr so that any results appear appropriately in the IA Thin Client

Parameters

batchSize int The batch size to retrieve information from the database. Increasing this size may improve performance but there is a possibility of reindex failure. The default is 25. The maximum value is 1000.
solrBatchSize int The batch size to use for Solr indexing. Increasing this size may improve performance. The default is 100. The maximum value is 1000.
upgrade boolean Specifies whether to upgrade the index schema from a previous version, and is a one time requirement when upgrading from one version of the thin client to another. The schema upgrade can be used to upgrade from any previous version of the thin client. The value true will upgrade the index schema. The value false is the default, and will not upgrade the index schema.
force boolean Specifies whether to force reindexing if indexing is already in progress. The value true will force a reindex even if indexing is in progress. The value false is the default, and prevents a reindex if indexing is already in progress. This option should be used if a previous reindex request was aborted for any reason, for example if the InfoSphere Information Server services tier went offline.
callback reindexCallback status of the reindex ["REINDEX_SUCCESSFUL"]
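
The defaults and limits described above can be applied before calling reindexThinClient. The sketch below shows one way to do that. normalizeReindexOptions is a hypothetical helper, and the lower bound of 1 is an assumption, since only the maximum of 1000 is documented.

```javascript
// Hypothetical helper (not part of ibm-ia-rest).
// Applies the documented defaults (25, 100, false, false) and the documented
// maximum of 1000. The minimum of 1 is an assumption.
function normalizeReindexOptions(options) {
  var opts = options || {};

  function boundedInt(value, fallback, name) {
    if (value === undefined || value === null) {
      return fallback;
    }
    if (!Number.isInteger(value) || value < 1 || value > 1000) {
      throw new RangeError(name + ' must be an integer between 1 and 1000');
    }
    return value;
  }

  return {
    batchSize: boundedInt(opts.batchSize, 25, 'batchSize'),
    solrBatchSize: boundedInt(opts.solrBatchSize, 100, 'solrBatchSize'),
    upgrade: opts.upgrade === true,
    force: opts.force === true
  };
}

// Intended use, following the documented parameter order:
//   var o = normalizeReindexOptions({ force: true });
//   iarest.reindexThinClient(o.batchSize, o.solrBatchSize, o.upgrade, o.force, callback);
```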

getRuleExecutionFailedRecordsFromLastRun

Retrieves a listing of any records that failed a particular Data Rule or Data Rule Set (its latest execution)

Parameters

projectName string The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
ruleOrSetName string The name of the Data Rule or Data Rule Set
numRows int The maximum number of rows to retrieve (if unspecified will default to 100)
callback recordsCallback the records that failed

getRuleExecutionResults

Retrieves the statistics of the executions of a particular Data Rule or Data Rule Set

Parameters

projectName string The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
ruleOrSetName string The name of the Data Rule or Data Rule Set
bLatestOnly boolean If true, returns only the statistics from the latest execution (otherwise full history)
callback statsCallback the statistics of the historical execution(s)

listCallback

This callback is invoked as the result of an IA REST API call, providing the response of that request.

Type: Function

Parameters

errorMessage string any error message, or null if no errors
aResponse Array<string> the response of the request, in the form of an array

requestCallback

This callback is invoked as the result of an IA REST API call, providing the response of that request.

Type: Function

Parameters

errorMessage string any error message, or null if no errors
responseXML string the XML of the response

itemsToIgnoreCallback

This callback is invoked as the result of retrieving a list of items that Information Analyzer should ignore

Type: Function

Parameters

errorMessage string any error message, or null if no errors
typeToIdentities Object dictionary keyed by object type, with each value being an array of objects of that type to ignore (as identity strings, /-delimited)
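
Given the dictionary described above, a caller may want to test whether a particular object falls under an ignored item. The following sketch does so by comparing /-delimited identity strings. isUnderIgnoredItem is a hypothetical helper, and treating everything beneath an ignored item (for example, tables beneath an ignored schema) as ignored is an assumption, not documented behaviour.

```javascript
// Hypothetical helper (not part of ibm-ia-rest).
// Assumption: an object is ignored if its identity equals an ignored identity
// or sits beneath one in the /-delimited hierarchy.
function isUnderIgnoredItem(typeToIdentities, identity) {
  return Object.keys(typeToIdentities).some(function (type) {
    var identities = typeToIdentities[type] || [];
    return identities.some(function (ignored) {
      return identity === ignored || identity.indexOf(ignored + '/') === 0;
    });
  });
}
```

The type names and identities used when trying this out are illustrative; the actual keys depend on what the API returns.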

statsCallback

This callback is invoked as the result of an IA REST API call to retrieve historical statistics on Data Rule executions

Type: Function

Parameters

errorMessage string any error message, or null if no errors
stats Array<Object> an array of stats, each stat being a JSON object with ???

recordsCallback

This callback is invoked as the result of an IA REST API call to retrieve records that failed Data Rules

Type: Function

Parameters

errorMessage string any error message, or null if no errors
records Array<Object> an array of records, each record being a JSON object keyed by column name and with the value of the column for that row
columnMap Object key-value pairs mapping column names to their context (e.g. full identity in the case of database columns like RecordPK)
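
Because records are keyed by short column names while columnMap carries their context, the two can be combined when the full context is wanted in the output. The sketch below does this under the assumption that the values in columnMap are strings. relabelRecords is a hypothetical helper, not part of ibm-ia-rest.

```javascript
// Hypothetical helper (not part of ibm-ia-rest).
// Re-keys each record by the context found in columnMap, leaving any column
// without a mapping under its original name.
// Assumption: columnMap values are strings.
function relabelRecords(records, columnMap) {
  return records.map(function (record) {
    var relabelled = {};
    Object.keys(record).forEach(function (column) {
      var hasMapping = Object.prototype.hasOwnProperty.call(columnMap, column);
      var label = hasMapping ? columnMap[column] : column;
      relabelled[label] = record[column];
    });
    return relabelled;
  });
}
```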

reindexCallback

This callback is invoked as the result of an IA REST API call to re-index Solr for IATC

Type: Function

Parameters

errorMessage string any error message, or null if no errors
status string the status of the reindex operation ["REINDEX_SUCCESSFUL"]

statusCallback

This callback is invoked as the result of an IA REST API call, providing the response of that request.

Type: Function

Parameters

errorMessage string any error message, or null if no errors
status Object the response of the request, in the form of an object keyed by execution ID, with subkeys for executionTime, progress and status ["running", "successful", "failed", "cancelled"]
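
A status object of this shape can be reduced to a quick summary, for example to decide whether every execution has finished. The following is a minimal sketch based only on the shape described above. summarizeExecutions is a hypothetical helper, and the execution IDs used to try it are invented.

```javascript
// Hypothetical helper (not part of ibm-ia-rest).
// Groups execution IDs by their reported status and flags whether anything
// is still running.
function summarizeExecutions(status) {
  var byStatus = { running: [], successful: [], failed: [], cancelled: [] };
  Object.keys(status).forEach(function (executionId) {
    var state = status[executionId].status;
    // Tolerate a status value that is not in the documented list
    if (!Object.prototype.hasOwnProperty.call(byStatus, state)) {
      byStatus[state] = [];
    }
    byStatus[state].push(executionId);
  });
  return {
    byStatus: byStatus,
    allFinished: byStatus.running.length === 0
  };
}
```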

columnAnalysisCallback

This callback is invoked as the result of an IA REST API call to execute column analysis.

Type: Function

Parameters

errorMessage string any error message, or null if no errors
tamsToSources Object a dictionary of TAM RIDs to data sources
schedule Object an object containing 'scheduleRids', which is an array of scheduler execution IDs
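
The two results passed to this callback are what later steps need: the TAM RIDs (which publishResultsForDataSources takes as aTAMs) and the scheduler execution IDs. The sketch below pulls both out, assuming that the keys of tamsToSources are the TAM RIDs, as the description suggests. extractAnalysisHandles is a hypothetical helper, not part of ibm-ia-rest.

```javascript
// Hypothetical helper (not part of ibm-ia-rest).
// Assumption: the keys of tamsToSources are TAM RIDs.
function extractAnalysisHandles(tamsToSources, schedule) {
  var hasSchedule = Boolean(schedule) && Array.isArray(schedule.scheduleRids);
  return {
    tamRIDs: Object.keys(tamsToSources || {}),
    scheduleRIDs: hasSchedule ? schedule.scheduleRids.slice() : []
  };
}
```

Since the analysis runs asynchronously, the TAM RIDs are probably best held until the executions have completed before publishing their results.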

staleAnalysisCallback

This callback is invoked as the result of an IA REST API call to determine which data sources have not been refreshed within a provided time period.

Type: Function

Parameters

errorMessage string any error message, or null if no errors
projectRID string the RID of the Information Analyzer project
aDataSources Array<string> an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files, only for those that are stale

dataSourceListCallback

This callback is invoked as the result of an IA REST API call to retrieve a list of data sources within a project.

Type: Function

Parameters

errorMessage string any error message, or null if no errors
projectRID string the RID of the Information Analyzer project
aDataSources Array<string> an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files

Project

Project class -- for handling Information Analyzer projects

getProjectDoc

Retrieve the Project document

setDescription

Set the description of the project

Parameters

desc

addTable

Add the specified table to the project

Parameters

datasource string the database name
schema string
table string
aColumns Array<string> array of column names

addFile

Add the specified file to the project

Parameters

datasource string the host name?
folder string the full path to the file
file string the name of the file
aFields Array<string> array of field names within the file

ColumnAnalysis

ColumnAnalysis class -- for handling Information Analyzer column analysis tasks

constructor

Parameters

project Project the project in which to create the column analysis task
analyzeColumnProperties boolean whether or not to analyze column properties
captureResultsType string specifies the type of frequency distribution results that are written to the analysis database ["CAPTURE_NONE", "CAPTURE_ALL", "CAPTURE_N"]
minCaptureSize int the minimum number of results that are written to the analysis database, including both typical and atypical values
maxCaptureSize int the maximum number of results that are written to the analysis database
analyzeDataClasses boolean whether or not to analyze data classes

setSampleOptions

Use to (optionally) set any sampling options for the column analysis

Parameters

type string the sampling type ["random", "sequential", "every_nth"]
size number if less than 1.0, the percentage of values to use in the sample; otherwise the maximum number of records in the sample. If you use the "random" type of data sample, specify the sample size that is the same number as the number of records that will be in the result, based on the value that you specify in the Percent field. Otherwise, the results might be skewed.
seed string if type is "random", this value is used to initialize the random generators (two samplings that use the same seed value will contain the same records)
step int if type is "every_nth", this value indicates the step to apply (one row will be kept out of every nth value rows)
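
The dual meaning of size (a fraction below 1.0, a record count otherwise) is easy to get wrong, so here is a small sketch of how it might be interpreted for a table of known size. expectedSampleRows is a hypothetical helper, and both the rounding of fractional results and the capping at the total row count are assumptions.

```javascript
// Hypothetical helper (not part of ibm-ia-rest).
// Interprets "size" as documented: below 1.0 it is a fraction of the rows,
// otherwise it is a maximum number of records.
// Assumptions: fractional results are rounded, and the sample can never
// exceed the total number of rows.
function expectedSampleRows(size, totalRows) {
  if (!(size > 0)) {
    throw new RangeError('size must be greater than 0');
  }
  if (size < 1.0) {
    return Math.round(size * totalRows);
  }
  return Math.min(Math.floor(size), totalRows);
}
```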

setEngineOptions

Use to (optionally) set any engine options to use when running the column analysis

Parameters

retainOSH boolean whether to retain the generated DataStage job or not
retainData boolean whether to retain generated data sets (ignored when data rules are running)
config string specifies an alternative configuration file to use with the DataStage engine during this run
gridEnabled string whether or not the grid view will be enabled
requestedNodes string the name of requested nodes
minNodes string the minimum number of nodes you want in the analysis
partitionsPerNode string the number of partitions for each node in the analysis

setJobOptions

Use to (optionally) set any job options to use when running the column analysis

Parameters

debugEnabled boolean whether to generate a debug table containing the evaluation results of all functions and tests contained in the expression (only used for running data rules)
numDebuggedRecords int how many rows should be debugged, if debugEnabled is "true"
arraySize int the size of the array (?)
autoCommit boolean
isolationLevel int
updateExistingTables boolean whether to update existing tables in IADB or create new ones (only used for column analysis)

addColumn

Use to add a column to the column analysis task -- both table and column can be '*' to specify all tables or all columns

Parameters

datasource string
schema string
table string
column string
hostname string?

addFileField

Use to add a file field to the column analysis task -- column can be '*' to specify all fields within the file

Parameters

connection string e.g. "HDFS"
path string directory path, not including the filename
filename string
column string name of the field within the file
hostname string?

PublishResults

PublishResults class -- for handling Information Analyzer results publishing tasks

constructor

Parameters

project Project the project from which to publish analysis results

addTable

Use to add a table whose results should be published -- the table can be '*' to specify all tables

Parameters

datasource string
schema string
table string
hostname string?

addFile

Use to add a file whose results should be published -- file can be '*' to specify all files

Parameters

connection string e.g. "HDFS"
path string directory path, not including the filename
filename string
hostname string?
