frictionless.js

frictionless.js is a lightweight, standardized "stream-plus-metadata" interface for accessing files and datasets, especially tabular ones (CSV, Excel).

frictionless.js follows the "Frictionless Data Lib Pattern".

Open it fast: simple open method for data on disk, online and inline
Data plus: data plus metadata (size, path, etc.) in a standardized way
Stream it: raw streams and object streams
Tabular: open CSV, Excel or arrays and get a row stream
Frictionless: compatible with Frictionless Data standards

A line of code is worth a thousand words ...

const {open} = require('frictionless.js')

var file = open('path/to/ons-mye-population-totals.xls')

file.descriptor
  {
    path: '/path/to/ons-mye-population-totals.xls',
    pathType: 'local',
    name: 'ons-mye-population-totals',
    format: 'xls',
    mediatype: 'application/vnd.ms-excel',
    encoding: 'windows-1252'
  }

file.size
  67584

file.rows() => stream object for rows
  // keyed by header row by default ...
  { 'col1': 1, 'col2': 2, ... }
  { 'col1': 10, 'col2': 20, ... }

Table of Contents

Motivation
Features
Installation
Browser
Usage
API
Developers

Motivation

frictionless.js is motivated by the following use cases:

Data "plus": when you work with data you always find yourself needing the data itself plus a little bit more -- things like where the data came from on disk (or is going to), or how large it is. This library gives you that information in a standardized way.
Convenient open: the same simple open method whether you are accessing data on disk, from a URL or inline data from a string, buffer or array.
Streams (or strings): a standardized iterator / object stream interface to the data, wherever you've loaded it from: online, on disk or inline.
Building block for data pipelines: provides a standardized building block for more complex data processing. For example, suppose you want to load a CSV file and then write it to JSON. That's simple enough. But then suppose you also want to delete the first 3 rows and the 2nd column. Now you have a more complex processing pipeline. This library provides a simple set of "data plus metadata" objects that you can pass along your pipeline.
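
The pipeline described above (skip the first 3 rows, delete the 2nd column, write JSON) could be sketched roughly as follows. This is only an illustration under stated assumptions: `skipRows` and `dropColumn` are hypothetical helpers, not frictionless.js APIs, and the rows are a made-up in-memory array standing in for parsed CSV rows.

```javascript
// Hypothetical pipeline steps (not frictionless.js APIs). Each step takes
// any iterable of row arrays and lazily yields transformed rows.
function* skipRows(rows, count) {
  let index = 0
  for (const row of rows) {
    if (index >= count) yield row
    index += 1
  }
}

function* dropColumn(rows, columnIndex) {
  for (const row of rows) {
    // Keep every cell except the one at columnIndex (0-based).
    yield row.filter((_cell, i) => i !== columnIndex)
  }
}

// Made-up sample data: three junk rows, then a header and two records.
const csvRows = [
  ['a', 'b', 'c'],
  ['d', 'e', 'f'],
  ['g', 'h', 'i'],
  ['id', 'tmp', 'value'],
  [1, 'x', 10],
  [2, 'y', 20]
]

// Delete the first 3 rows, then the 2nd column (index 1), then serialize.
const cleaned = [...dropColumn(skipRows(csvRows, 3), 1)]
const json = JSON.stringify(cleaned)
console.log(json)
```

Because each step only depends on "an iterable of rows", steps can be reordered or reused without knowing where the data originally came from.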

Features

Easy: a single common API for local, online and inline data
Micro: The whole project is ~400 lines of code
Simple: Oriented for single purpose
Explicit: No hidden behaviours, no extra magic
Frictionless: uses and supports (but does not require) Frictionless Data specs such as Data Package so you can leverage Frictionless tooling
Minimal glue: Use on its own or as a building block for more complex data tooling (thanks to its common minimal metadata)

Installation

npm install frictionless.js

Browser

If you want to use it in the browser, you first need to build the bundle.

Run the following command to generate the bundle for the necessary JS targets:

yarn build

This will create two bundles in the dist folder: the node sub-folder contains the build for the Node environment, while the browser sub-folder contains the build for the browser. In a simple HTML file you can use it like this:

<head>
  <script src="./dist/browser/bundle.js"></script>
  <script>
    // Global data lib is available here...
    
    const file = data.open('path/to/file')
    ...
  </script>
</head>
<body></body>

Usage

With a simple file:

const data = require('frictionless.js')

// path can be local or remote
const file = data.open(path)

// descriptor with metadata e.g. name, path, format, (guessed) mimetype etc
console.log(file.descriptor)

// returns promise with raw stream
const stream = await file.stream()

// let's get an object stream of the rows
// (assuming it is tabular i.e. csv, xls etc)
const rows = await file.rows()

// entire file as a buffer
const buffer = await file.buffer

// for large files you can read the buffer in chunks
await file.bufferInChunks((chunk, progress)=>{
  console.log(progress, chunk)
})
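
Once you have an object stream, one way to consume it is with `for await`, since Node Readable streams are async iterable. The sketch below is self-contained and uses a stand-in stream built with `Readable.from`; `collectRows` is a hypothetical helper, and treating the result of `file.rows()` as a Node Readable is an assumption.

```javascript
const { Readable } = require('stream')

// Hypothetical helper (not a frictionless.js API): collect every item
// from an object stream into an array.
async function collectRows(stream) {
  const rows = []
  for await (const row of stream) {
    rows.push(row)
  }
  return rows
}

// Stand-in stream built from an in-memory array. In real use this would be
// the stream resolved by `await file.rows()` (assumed to be a Node Readable).
const demoStream = Readable.from([[1, 2], [10, 20]])
const collected = collectRows(demoStream)
collected.then(rows => console.log(rows))
```

Collecting into an array is only sensible for small inputs; for large files, process each row inside the loop instead of storing it.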

With a Dataset:

const { Dataset } = require('frictionless.js')

// for now, the directory must contain a datapackage.json
const path = '/path/to/directory/'

Dataset.load(path).then(async dataset => {
  // get a data file in this dataset
  const file = dataset.resources[0]

  // stream() returns a promise, so await it to get the raw stream
  const stream = await file.stream()
})

API

See API documentation here

Developers

Requirements:

NodeJS >= v8.10.0
NPM >= v5.2.0

Test

We have two types of tests: Karma-based tests for the browser, and Mocha with Chai for Node. All Node tests are in the datajs/test folder. Since Mocha is sensitive to test naming, the Karma tests are kept in a separate folder, /browser-test.

To run the browser tests, first build the library so that the bundle exists in the dist/browser folder: yarn build:browser. Then run yarn test:browser, which runs the Karma tests.
To test in Node: yarn test:node
To run all tests, including Node and browser: yarn test
To watch the Node tests: yarn test:node:watch

Setup

Git clone the repo
Install dependencies: yarn
To make the browser and Node tests work, first run the build: yarn build
Run tests: yarn test
Do some dev work
Once done, make sure the tests pass, then build the distribution version of the library: yarn build

yarn build compiles the code with webpack and babel for the Node and browser targets. To watch the build, run: yarn build:watch
Now proceed to the "Deployment" stage.

Deployment

Update version number in package.json.
Git commit: git commit -m "some message, eg, version".
Release: git tag -a v0.12.0 -m "some message".
Push: git push origin master --tags
Publish to NPM: npm publish
