pure-extract

Pure Extract

Pulls data out of JSON feeds.

Configuration

Require the module, then call extractData(data, schema). It returns a flat array containing only the data you asked for in the schema.

Creating a Schema

Grab an existing feed item from your favourite API, remove all of the leaf nodes that you don't need, and replace each remaining value with the name you want that value to have in the output. For example, take this Facebook feed item (with a bunch of the attributes already removed for clarity):

  {
    created_time: "2013-10-18T06:52:37+0000",
    from: {
      category: "Travel/leisure",
      category_list: Array[2],
      id: "123456789123456",
      name: "Fake Data, Inc."
    },
    id: "123456789123456_629571654055491",
    privacy: {},
    story: "\"What a beautiful photo\" on their own photo.",
    type: "status",
    updated_time: "2013-10-18T06:52:37+0000"
  }

And turn it into this schema:

  {
    created_time: "time",
    from: {
      id: "from_id",
      name: "from_name"
    },
    id: "id",
    story: "content",
    type: "type"
  }

Then calling extractData(arrayOfItemsFromFacebookApi, schema) will give you something like:

  {
    time: "2013-10-18T06:52:37+0000",
    from_id: "123456789123456",
    from_name: "Fake Data, Inc.",
    id: "123456789123456_629571654055491",
    content: "\"What a beautiful photo\" on their own photo.",
    type: "status"
  }

Which you can take to the bank! Or some fancy templating engine. Or whatever your heart desires!
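
If you are curious how a schema like the one above can drive the extraction, here is a minimal sketch. It is not Pure Extract's actual source: the names sketchExtractItem and sketchExtractAll are made up for illustration, and it assumes the result is an array with one flat object per feed item.

```javascript
// Hypothetical sketch, NOT the module's real implementation.
// Walks the schema and the item side by side. Every leaf in the schema
// is the output name for the value found at the same path in the item.
function sketchExtractItem(item, schema, out = {}) {
  for (const key of Object.keys(schema)) {
    const target = schema[key];
    const value = item == null ? undefined : item[key];
    if (target !== null && typeof target === 'object') {
      // Nested branch in the schema: descend into the same branch of the item.
      sketchExtractItem(value, target, out);
    } else {
      // Leaf: copy the value into the flat result under its new name.
      out[target] = value;
    }
  }
  return out;
}

// Assumption: one flat object per input item.
function sketchExtractAll(items, schema) {
  return items.map((item) => sketchExtractItem(item, schema));
}

const schema = {
  created_time: 'time',
  from: { id: 'from_id', name: 'from_name' },
  id: 'id',
  story: 'content',
  type: 'type'
};

const feed = [{
  created_time: '2013-10-18T06:52:37+0000',
  from: {
    category: 'Travel/leisure',
    id: '123456789123456',
    name: 'Fake Data, Inc.'
  },
  id: '123456789123456_629571654055491',
  privacy: {},
  story: '"What a beautiful photo" on their own photo.',
  type: 'status',
  updated_time: '2013-10-18T06:52:37+0000'
}];

const rows = sketchExtractAll(feed, schema);
console.log(rows);
```

Anything that is not named in the schema (category, privacy, updated_time) never makes it into the result, which is the whole point of trimming the schema down first.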

Helper Functions

But that's not all! If, instead of using the value 'date' in the schema, you use the value 'date|formatDate', then Pure Extract will look for a function called 'formatDate' and run the value through it before putting it in the result. Currently, a couple of functions are written right into the module.
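
To make the 'name|helper' idea concrete, here is a rough sketch of how a schema value could be split and applied. Again, this is an assumption-laden illustration rather than the module's code: resolveLeaf, the helpers registry, and this toy formatDate (which just keeps the YYYY-MM-DD part of the string) are all hypothetical.

```javascript
// Hypothetical helper registry. The real module's built-in helpers may
// behave differently; this formatDate is a toy that keeps "YYYY-MM-DD".
const helpers = {
  formatDate: (value) => String(value).slice(0, 10)
};

// Splits a schema value such as 'date|formatDate' into an output name
// and an optional helper name, then runs the value through the helper.
function resolveLeaf(spec, value, registry) {
  const [name, helperName] = spec.split('|');
  if (helperName === undefined) {
    // No pipe in the schema value: pass the value through untouched.
    return { name, value };
  }
  if (!Object.prototype.hasOwnProperty.call(registry, helperName) ||
      typeof registry[helperName] !== 'function') {
    throw new Error('Unknown helper: ' + helperName);
  }
  return { name, value: registry[helperName](value) };
}

const plain = resolveLeaf('date', '2013-10-18T06:52:37+0000', helpers);
const piped = resolveLeaf('date|formatDate', '2013-10-18T06:52:37+0000', helpers);
console.log(plain, piped);
```

In this sketch an unknown helper name throws, so a typo in the schema fails loudly instead of silently returning the raw value. Whether the real module does the same is not something this README states.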