    ibm-ia-rest

    0.3.0 • Public • Published

    Re-usable functions for interacting with Information Analyzer's REST API

    Examples

    // Runs column analysis for every data source in the "Automated Profiling" project
    // whose analysis results predate the moment the script runs (new Date()).
    var iarest = require('ibm-ia-rest');
    var commons = require('ibm-iis-commons');
    var restConnect = new commons.RestConnection("isadmin", "isadmin", "hostname", "9445");
    iarest.setConnection(restConnect);
    iarest.getStaleAnalysisResults("Automated Profiling", new Date(), function(errStale, projectRID, aStaleSources) {
      if (errStale) { return console.error(errStale); }
      iarest.runColumnAnalysisForDataSources(projectRID, aStaleSources, function(errExec, tamsAnalyzed) {
        if (errExec) { return console.error(errExec); }
        // The call returns as soon as the analysis is submitted, not when it finishes;
        // to wait for completion you need to poll events on Kafka.
      });
    });

    Meta

    • license: Apache-2.0

    setConnection

    Set the connection for the REST API

    Parameters

    • restConnect RestConnection RestConnection object, from ibm-iis-commons

    makeRequest

    Make a request against IA's REST API

    Parameters

    • method string type of request, one of ['GET', 'PUT', 'POST', 'DELETE']

    • path string the path to the end-point (e.g. /ibm/iis/ia/api/...)

    • input string? any input for the request, i.e. for PUT, POST

    • inputType string? the type of input, if any is provided, one of ['text/xml', 'application/json']

    • callback requestCallback callback that handles the response

    • Throws any will throw an error if connectivity details are incomplete or there is a fatal error during the request
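
    A minimal sketch of issuing a GET through makeRequest, assuming the arguments are passed in the order documented above and that null is acceptable for the optional input and inputType. The helper name getXml is hypothetical (not part of ibm-ia-rest), the module is passed in as iarest, and the end-point path is left to the caller.

```javascript
// Hypothetical helper: wraps makeRequest so that both documented failure modes
// (an error message in the callback, or a thrown error) reach one callback.
function getXml(iarest, path, callback) {
  let finished = false;
  function finish(err, xml) {
    if (finished) { return; } // guard against reporting twice
    finished = true;
    callback(err, xml);
  }
  try {
    // Assumption: a GET needs no input, so input and inputType are null.
    iarest.makeRequest('GET', path, null, null, function (errorMessage, responseXML) {
      finish(errorMessage ? new Error(errorMessage) : null, responseXML);
    });
  } catch (e) {
    // makeRequest is documented to throw if connectivity details are
    // incomplete or the request fails fatally.
    finish(e instanceof Error ? e : new Error(String(e)));
  }
}
```

    With this shape the caller only has to check one error argument, whichever way the failure surfaced.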

    getAllItemsToIgnore

    Retrieves a list of all items that should be ignored, i.e. where they are labelled with "Information Analyzer Ignore List"

    Parameters

    addIADBToIgnoreList

    Adds the IADB schema to a list of objects for Information Analyzer to ignore (to prevent them being added to projects or being analysed); this is accomplished by creating a label 'Information Analyzer Ignore List'

    Parameters

    createOrUpdateAnalysisProject

    Create or update an analysis project, to include ALL objects known to IGC that were updated after the date received -- necessary before any tasks can be executed

    Parameters

    • name string name of the project
    • description string description of the project
    • updatedAfter Date? include in the project any objects in IGC last updated after this date
    • callback requestCallback callback that handles the response

    getProjectList

    Get a list of Information Analyzer projects

    Parameters

    getProjectDataSourceList

    Get a list of all of the data sources in the specified Information Analyzer project

    Parameters

    • projectName string
    • callback dataSourceListCallback callback that handles the response (will be entries with HOST||DB.SCHEMA.TABLE and HOST||PATH:FILE)

    runColumnAnalysisForDataSources

    Run a full column analysis against the list of data sources specified (based on TAM RIDs)

    Parameters

    • projectRID string the RID of the project in which to execute the analysis
    • aDataSources Array<Object> an array of data sources, as returned by getProjectDataSourceList
    • callback columnAnalysisCallback callback that handles the response

    publishResultsForDataSources

    Publish analysis results for the list of data sources specified

    Parameters

    • projectRID string RID of the IA project
    • aTAMs Array<string> an array of TAM RIDs whose analysis should be published
    • callback requestCallback callback that handles the response

    getStaleAnalysisResults

    Retrieve the data sources in the project whose analysis results are stale, i.e. have not been refreshed since the specified time

    Parameters

    • projectName string name of the IA project
    • timeToConsiderStale Date the time before which any analysis results should be considered stale
    • callback staleAnalysisCallback callback that handles the response
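
    The cut-off does not have to be "now". As a sketch under stated assumptions, the following re-analyzes only sources whose results are older than a given number of days. The names daysBefore and reanalyzeOlderThan are hypothetical, the module is passed in as iarest, and the empty result passed back when nothing is stale is an assumption chosen to mirror columnAnalysisCallback.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns a Date that lies `days` days before `now`.
function daysBefore(now, days) {
  return new Date(now.getTime() - days * MS_PER_DAY);
}

// Hypothetical helper: finds stale sources and submits them for column analysis.
function reanalyzeOlderThan(iarest, projectName, days, callback) {
  const cutoff = daysBefore(new Date(), days);
  iarest.getStaleAnalysisResults(projectName, cutoff, function (errStale, projectRID, aStaleSources) {
    if (errStale) { return callback(errStale); }
    if (!aStaleSources || aStaleSources.length === 0) {
      // Nothing is stale, so no analysis is submitted (assumed empty result shape).
      return callback(null, {}, { scheduleRids: [] });
    }
    iarest.runColumnAnalysisForDataSources(projectRID, aStaleSources, callback);
  });
}
```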

    reindexThinClient

    Issues a request to reindex Solr so that any results appear appropriately in the IA Thin Client

    Parameters

    • batchSize int The batch size to retrieve information from the database. Increasing this size may improve performance but there is a possibility of reindex failure. The default is 25. The maximum value is 1000.
    • solrBatchSize int The batch size to use for Solr indexing. Increasing this size may improve performance. The default is 100. The maximum value is 1000.
    • upgrade boolean Specifies whether to upgrade the index schema from a previous version, and is a one time requirement when upgrading from one version of the thin client to another. The schema upgrade can be used to upgrade from any previous version of the thin client. The value true will upgrade the index schema. The value false is the default, and will not upgrade the index schema.
    • force boolean Specifies whether to force reindexing if indexing is already in progress. The value true will force a reindex even if indexing is in progress. The value false is the default, and prevents a reindex if indexing is already in progress. This option should be used if a previous reindex request was aborted for any reason, for example if the InfoSphere Information Server services tier went offline.
    • callback reindexCallback status of the reindex ["REINDEX_SUCCESSFUL"]
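
    The defaults and limits listed above can be checked before a request is sent. The sketch below is illustrative only: reindexOptions and reindex are hypothetical names, the module is passed in as iarest, and the lower bound of 1 is an assumption (only the maximum of 1000 is documented).

```javascript
// Documented defaults for reindexThinClient.
const REINDEX_DEFAULTS = { batchSize: 25, solrBatchSize: 100, upgrade: false, force: false };
const MAX_BATCH_SIZE = 1000; // documented maximum for both batch sizes

// Merges overrides onto the defaults and rejects out-of-range batch sizes.
function reindexOptions(overrides) {
  const opts = Object.assign({}, REINDEX_DEFAULTS, overrides);
  ['batchSize', 'solrBatchSize'].forEach(function (key) {
    // Assumption: values below 1 are not meaningful.
    if (!Number.isInteger(opts[key]) || opts[key] < 1 || opts[key] > MAX_BATCH_SIZE) {
      throw new RangeError(key + ' must be an integer between 1 and ' + MAX_BATCH_SIZE);
    }
  });
  return opts;
}

// Hypothetical wrapper: validates, then calls reindexThinClient with the
// arguments in the documented order.
function reindex(iarest, overrides, callback) {
  const o = reindexOptions(overrides);
  iarest.reindexThinClient(o.batchSize, o.solrBatchSize, o.upgrade, o.force, callback);
}
```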

    getRuleExecutionFailedRecordsFromLastRun

    Retrieves a listing of any records that failed a particular Data Rule or Data Rule Set (its latest execution)

    Parameters

    • projectName string The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
    • ruleOrSetName string The name of the Data Rule or Data Rule Set
    • numRows int The maximum number of rows to retrieve (if unspecified will default to 100)
    • callback recordsCallback the records that failed

    getRuleExecutionResults

    Retrieves the statistics of the executions of a particular Data Rule or Data Rule Set

    Parameters

    • projectName string The name of the Information Analyzer project in which the Data Rule or Data Rule Set exists
    • ruleOrSetName string The name of the Data Rule or Data Rule Set
    • bLatestOnly boolean If true, returns only the statistics from the latest execution (otherwise full history)
    • callback statsCallback the statistics of the historical execution(s)

    listCallback

    This callback is invoked as the result of an IA REST API call, providing the response of that request.

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • aResponse Array<string> the response of the request, in the form of an array

    requestCallback

    This callback is invoked as the result of an IA REST API call, providing the response of that request.

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • responseXML string the XML of the response

    itemsToIgnoreCallback

    This callback is invoked as the result of retrieving a list of items that Information Analyzer should ignore

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • typeToIdentities Object dictionary keyed by object type, with each value being an array of objects of that type to ignore (as identity strings, /-delimited)
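
    As an illustration of working with this dictionary, the sketch below checks whether a given identity is on the ignore list. The function name isIgnored is hypothetical, and the object type and identity strings shown in the usage comment are made-up placeholders rather than values taken from the API.

```javascript
// Returns true if `identity` is listed under `type` in the dictionary
// handed to itemsToIgnoreCallback.
function isIgnored(typeToIdentities, type, identity) {
  if (!Object.prototype.hasOwnProperty.call(typeToIdentities, type)) {
    return false; // no items of this type are ignored
  }
  const identities = typeToIdentities[type];
  return Array.isArray(identities) && identities.indexOf(identity) !== -1;
}

// Usage (placeholder type and identities):
// isIgnored({ some_type: ['HOST/DB/SCHEMA'] }, 'some_type', 'HOST/DB/SCHEMA');
```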

    statsCallback

    This callback is invoked as the result of an IA REST API call to retrieve historical statistics on Data Rule executions

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • stats Array<Object> an array of stats, each stat being a JSON object with ???

    recordsCallback

    This callback is invoked as the result of an IA REST API call to retrieve records that failed Data Rules

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • records Array<Object> an array of records, each record being a JSON object keyed by column name and with the value of the column for that row
    • columnMap Object key-value pairs mapping column names to their context (e.g. full identity in the case of database columns like RecordPK)
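
    A small sketch of combining the two results, assuming each columnMap value is a string: every record is re-keyed by the mapped context where one exists, and keeps its original column name otherwise. The function name qualifyRecords is hypothetical.

```javascript
// Re-keys each failed record using columnMap, leaving unmapped columns as-is.
function qualifyRecords(records, columnMap) {
  return records.map(function (record) {
    const qualified = {};
    Object.keys(record).forEach(function (column) {
      const mapped = Object.prototype.hasOwnProperty.call(columnMap, column);
      qualified[mapped ? columnMap[column] : column] = record[column];
    });
    return qualified;
  });
}
```

    The input records are not modified; a new array of new objects is returned.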

    reindexCallback

    This callback is invoked as the result of an IA REST API call to re-index Solr for IATC

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • status string the status of the reindex operation ["REINDEX_SUCCESSFUL"]

    statusCallback

    This callback is invoked as the result of an IA REST API call, providing the response of that request.

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • status Object the response of the request, in the form of an object keyed by execution ID, with subkeys for executionTime, progress and status ["running", "successful", "failed", "cancelled"]
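
    The status object can be grouped by outcome to decide whether anything is still running. This is a sketch based only on the shape described above; summarizeStatus and allFinished are hypothetical names, and the execution IDs in the usage comment are placeholders.

```javascript
// Groups execution IDs by their documented status value.
function summarizeStatus(status) {
  const summary = { running: [], successful: [], failed: [], cancelled: [] };
  Object.keys(status).forEach(function (executionId) {
    const state = status[executionId].status;
    if (Object.prototype.hasOwnProperty.call(summary, state)) {
      summary[state].push(executionId);
    }
  });
  return summary;
}

// True when no execution is reported as running.
function allFinished(status) {
  return summarizeStatus(status).running.length === 0;
}

// Usage (placeholder IDs):
// allFinished({ exec1: { status: 'successful' }, exec2: { status: 'running' } });
```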

    columnAnalysisCallback

    This callback is invoked as the result of an IA REST API call to execute column analysis.

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • tamsToSources Object a dictionary of TAM RIDs to data sources
    • schedule Object an object containing 'scheduleRids', which is an array of scheduler execution IDs
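
    A sketch of pulling the useful identifiers out of these two arguments. The function name extractSubmission is hypothetical; it relies only on the shapes described above (a dictionary keyed by TAM RID, and an object with a scheduleRids array).

```javascript
// Collects the TAM RIDs and scheduler execution IDs from the arguments
// passed to columnAnalysisCallback.
function extractSubmission(tamsToSources, schedule) {
  const hasRids = schedule && Array.isArray(schedule.scheduleRids);
  return {
    tamRids: Object.keys(tamsToSources || {}),
    executionIds: hasRids ? schedule.scheduleRids.slice() : []
  };
}
```

    The tamRids array has the Array<string> shape that publishResultsForDataSources takes as aTAMs; since the analysis is only submitted at this point, publishing presumably has to wait until it has completed.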

    staleAnalysisCallback

    This callback is invoked as the result of an IA REST API call to determine which data sources have not been refreshed within a provided time period.

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • projectRID string the RID of the Information Analyzer project
    • aDataSources Array<string> an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files, only for those that are stale

    dataSourceListCallback

    This callback is invoked as the result of an IA REST API call to retrieve a list of data sources within a project.

    Type: Function

    Parameters

    • errorMessage string any error message, or null if no errors
    • projectRID string the RID of the Information Analyzer project
    • aDataSources Array<string> an array of entries with HOST||DB.SCHEMA.TABLE for database tables and HOST||PATH:FILE for data files
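
    The two entry formats can be split into their parts with a few string operations. The sketch below is a guess at a reasonable parser, not the library's own logic: parseDataSource is a hypothetical name, any ":" after the host is taken to mark a file entry, and database, schema and table names are assumed not to contain "." themselves.

```javascript
// Splits "HOST||DB.SCHEMA.TABLE" or "HOST||PATH:FILE" into named parts.
function parseDataSource(entry) {
  const sep = entry.indexOf('||');
  if (sep === -1) {
    throw new Error('Unrecognized data source entry: ' + entry);
  }
  const host = entry.slice(0, sep);
  const rest = entry.slice(sep + 2);
  const colon = rest.lastIndexOf(':');
  if (colon !== -1) {
    // Assumption: only file entries contain ":".
    return { kind: 'file', host: host, path: rest.slice(0, colon), file: rest.slice(colon + 1) };
  }
  const parts = rest.split('.');
  if (parts.length !== 3) {
    throw new Error('Expected DB.SCHEMA.TABLE but got: ' + rest);
  }
  return { kind: 'table', host: host, database: parts[0], schema: parts[1], table: parts[2] };
}
```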

    Project

    Project class -- for handling Information Analyzer projects

    getProjectDoc

    Retrieve the Project document

    setDescription

    Set the description of the project

    Parameters

    • desc

    addTable

    Add the specified table to the project

    Parameters

    addFile

    Add the specified file to the project

    Parameters

    • datasource string the host name?
    • folder string the full path to the file
    • file string the name of the file
    • aFields Array<string> array of field names within the file

    ColumnAnalysis

    ColumnAnalysis class -- for handling Information Analyzer column analysis tasks

    constructor

    Parameters

    • project Project the project in which to create the column analysis task
    • analyzeColumnProperties boolean whether or not to analyze column properties
    • captureResultsType string specifies the type of frequency distribution results that are written to the analysis database ["CAPTURE_NONE", "CAPTURE_ALL", "CAPTURE_N"]
    • minCaptureSize int the minimum number of results that are written to the analysis database, including both typical and atypical values
    • maxCaptureSize int the maximum number of results that are written to the analysis database
    • analyzeDataClasses boolean whether or not to analyze data classes

    setSampleOptions

    Use to (optionally) set any sampling options for the column analysis

    Parameters

    • type string the sampling type ["random", "sequential", "every_nth"]
    • size number if less than 1.0, the percentage of values to use in the sample; otherwise the maximum number of records in the sample. For the "random" sampling type, set the sample size to the number of records that the value in the Percent field would yield; otherwise the results might be skewed.
    • seed string if type is "random", this value is used to initialize the random generators (two samplings that use the same seed value will contain the same records)
    • step int if type is "every_nth", the step to apply (one row is kept out of every step rows)
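
    Because seed and step only apply to particular sampling types, it can help to assemble the arguments in one place. The following sketch is an assumption-laden illustration: sampleArgs is a hypothetical helper, requiring a positive integer step for "every_nth" is an assumption, and columnAnalysis in the usage comment stands for an existing ColumnAnalysis instance.

```javascript
const SAMPLE_TYPES = ['random', 'sequential', 'every_nth'];

// Builds the [type, size, seed, step] argument list for setSampleOptions,
// dropping seed and step where the documented type does not use them.
function sampleArgs(type, size, seed, step) {
  if (SAMPLE_TYPES.indexOf(type) === -1) {
    throw new Error('type must be one of ' + SAMPLE_TYPES.join(', '));
  }
  if (typeof size !== 'number' || !(size > 0)) {
    throw new RangeError('size must be a positive number');
  }
  if (type === 'every_nth' && !(Number.isInteger(step) && step > 0)) {
    // Assumption: "every_nth" is only meaningful with a positive integer step.
    throw new RangeError('step must be a positive integer for every_nth');
  }
  return [
    type,
    size,
    type === 'random' ? seed : undefined,
    type === 'every_nth' ? step : undefined
  ];
}

// Usage:
// columnAnalysis.setSampleOptions.apply(columnAnalysis, sampleArgs('random', 0.1, 'seed'));
```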

    setEngineOptions

    Use to (optionally) set any engine options to use when running the column analysis

    Parameters

    • retainOSH boolean whether to retain the generated DataStage job or not
    • retainData boolean whether to retain generated data sets (ignored when data rules are running)
    • config string specifies an alternative configuration file to use with the DataStage engine during this run
    • gridEnabled string whether or not the grid view will be enabled
    • requestedNodes string the name of requested nodes
    • minNodes string the minimum number of nodes you want in the analysis
    • partitionsPerNode string the number of partitions for each node in the analysis

    setJobOptions

    Use to (optionally) set any job options to use when running the column analysis

    Parameters

    • debugEnabled boolean whether to generate a debug table containing the evaluation results of all functions and tests contained in the expression (only used for running data rules)
    • numDebuggedRecords int how many rows should be debugged, if debugEnabled is true
    • arraySize int the size of the array (?)
    • autoCommit boolean
    • isolationLevel int
    • updateExistingTables boolean whether to update existing tables in IADB or create new ones (only used for column analysis)

    addColumn

    Use to add a column to the column analysis task -- both table and column can be '*' to specify all tables or all columns

    Parameters

    addFileField

    Use to add a file field to the column analysis task -- column can be '*' to specify all fields within the file

    Parameters

    • connection string e.g. "HDFS"
    • path string directory path, not including the filename
    • filename string
    • column string name of the field within the file
    • hostname string?

    PublishResults

    PublishResults class -- for handling Information Analyzer results publishing tasks

    constructor

    Parameters

    • project Project the project from which to publish analysis results

    addTable

    Use to add a table whose results should be published -- the table can be '*' to specify all tables

    Parameters

    addFile

    Use to add a file whose results should be published -- file can be '*' to specify all files

    Parameters

    Install

    npm i ibm-ia-rest

    Collaborators

    • cgrote