@datawheel/cube-audit
TypeScript icon, indicating that this package has built-in type declarations

1.0.2 • Public • Published

@datawheel/cube-audit

A quick script to audit metainformation annotations on cubes from tesseract-olap servers.

Description

This script will analyze the cubes in a given tesseract endpoint, and produce a markdown checklist of suggested annotations and dimension types to better improve the performance and user experience for vizbuilder, tesseract-explorer and it's plugins, and other products from Datawheel.

Usage

The easiest usage is simply running the cube-audit script followed by a base tesseract endpoint:

npx @datawheel/cube-audit https://api.oec.world/tesseract/

No installation needed. npx will take care of downloading the script and running it.

The CLI interface supports some additional options. Run npx @datawheel/cube-audit --help for more details.

You can also add it as a dependency on another project and use the auditServer exported function:

const {auditServer} = require("@datawheel/cube-audit");

auditServer("https://api.oec.world/tesseract/").then(result => { ... });

The auditServer function accepts an url string or an AxiosRequestConfig object. Optionally, it supports a second parameter for a callback function that is run for each cube with the resulting issues; setting this parameter will also run each cube one after another instead of in parallel.
The result of the audit is an object you can interpret easily to do other actions.

Error: "JavaScript heap out of memory"

You might find a cube that has one or many levels with thousands or millions of members. In these cases is highly probable you will run out of memory, because V8's memory usage is by default limited to around 1.7GB. To increase this limit, set the environment variable NODE_OPTIONS="--max_old_space_size=10000" (where 10000 would mean 10GB, change accordingly), and run the script again.

Annotations

Cube

Annotation Name Description
source_name Organization that produces/published the data (ex. "Census Bureau" or "BACI").
source_link Web address for the organization (will turn source_name display into an anchor link).
source_description Description of the source organization (typically a few short sentences).
dataset_name Title for the specific dataset/table (ex. "ACS 1-Year Estimate" or "HS6 REV. 1992 (1995 - 2018)").
dataset_link Web address for the dataset (will turn dataset_name display into an anchor link).
dataset_description Description of the dataset (typically a few short sentences).

Measures

Annotation Name Description
aggregation_method The method by which a measure should be aggregated (if the root aggregation type is unknown).

Valid values include: COUNT, SUM, AVERAGE, MEDIAN, RCA.
format_template A template string that specifies how the numeric values of a measure should be displayed to users. Can be any valid d3plus-format string specifier, which extends the base specifiers defined by d3-format.

Defaults to ".3~a", which abbreviates large numbers and adds the appropriate suffix (ie. 1234567890 becomes 1.23B).
description The text description of a measure, typically 1-3 short sentences.

Locale Support

The annotations mentioned here are the default values. However, if you need texts for different languages, you can add _<locale> as a suffix. For example, source_name for English and Spanish should be source_name_en and source_name_es. Applications will then try to use the annotation with suffix, and if it's not available, will default to the annotation without suffix.

Package Sidebar

Install

npm i @datawheel/cube-audit

Weekly Downloads

2

Version

1.0.2

License

MIT

Unpacked Size

21.2 kB

Total Files

7

Last publish

Collaborators

  • palamago
  • scespinoza
  • davelandry
  • frabarz
  • jhmullen-dw