This package has been deprecated

Author message:

WARNING: This project has been renamed to @laconia/batch. Install using @laconia/batch to get the latest version.

laconia-batch

0.4.0 • Public • Published

laconia-batch

CircleCI Coverage Status Apache License

🛡️ Laconia Batch — Reads large number of records without time limit.

Reads large number of records without Lambda time limit.

AWS Lambda maximum execution duration per request is 300 seconds, hence it is impossible to utilise a Lambda to execute a long running task. laconia-batch handles your batch processing needs by providing a beautifully designed API which abstracts the time limitaton problem.

FAQ

Check out FAQ

Usage

Install laconia-batch using yarn:

yarn add laconia-batch

Or via npm:

npm install --save laconia-batch

These are the currently supported input sources:

  • DynamoDB
  • S3

Example of batch processing by scanning a dynamodb table:

const laconiaBatch = require("laconia-batch");

module.exports.handler = laconiaBatch(
  _ =>
    laconiaBatch.dynamoDb({
      operation: "SCAN",
      dynamoDbParams: { TableName: "Music" }
    }),
  { itemsPerSecond: 2 }
).on("item", ({ event }, item) => processItem(event, context));

Rate limiting is supported out of the box by setting the batchOptions.itemsPerSecond option.

How it works

laconia-batch works around the Lambda's time limitation by using recursion. It will automatically recurse when Lambda timeout is about to happen, then resumes from where it left off in the new invocation.

Imagine if you are about to process the array [1, 2, 3, 4, 5] and each requests can only handle two items, the following will happen:

  • request 1: Process 1
  • request 1: Process 2
  • request 1: Not enough time, recursing with current cursor
  • request 2: Process 3
  • request 2: Process 4
  • request 2: Not enough time, recursing with current cursor
  • request 3: Process 5

API

laconiaBatch(readerFn, batchOptions)

  • readerFn(laconiaContext)
    • This Function is called when your Lambda is invoked
    • The function must return a reader object i.e. dynamoDb(), s3()
    • Will be called with laconiaContext object, which can be destructured to {event, context}
  • batchOptions
    • itemsPerSecond
      • Optional
      • Rate limit will not be applied if value is not set
      • Can be set to decimal, i.e. 0.5 will equate to 1 item per 2 second.
    • timeNeededToRecurseInMillis
      • Optional
      • The value set here will be used to check if the current execution is to be stopped
      • If you have a very slow item processing, the batch processor might not have enough time to recurse and your Lambda execution might be timing out. You can increase this value to increase the chance of the the recursion to happen

Example:

// Use all default batch options (No rate limiting)
laconiaBatch(_ => dynamoDb());

// Customise batch options
laconiaBatch(_ => dynamoDb(), {
  itemsPerSecond: 2,
  timeNeededToRecurseInMillis: 10000
});

Events

There are events that you can listen to when laconia-batch is working.

  • item: laconiaContext, item
    • Fired on every item read.
    • item is an object found during the read
    • laconiaContext can be destructured to {event, context}
  • start: laconiaContext
    • Fired when the batch process is started for the very first time
    • laconiaContext can be destructured to {event, context}
  • stop: laconiaContext, cursor
    • Fired when the current execution is timing out and about to be recursed
    • cursor contains the information of how the last item is being read
    • laconiaContext can be destructured to {event, context}
  • end: laconiaContext
    • Fired when the batch processor can no longer find any more records
    • laconiaContext can be destructured to {event, context}

Example:

laconiaBatch({ ... })
.on('start', (laconiaContext) => ... )
.on('item', (laconiaContext, item) => ... )
.on('stop', (laconiaContext, cursor) => ... )
.on('end', (laconiaContext) => ... )

dynamoDb(readerOptions)

Creates a reader for Dynamo DB table.

  • operation
    • Mandatory
    • Valid values are: 'SCAN' and 'QUERY'
  • dynamoDbParams
    • Mandatory
    • This parameter is used when documentClent's operations are called
    • ExclusiveStartKey param can't be used as it will be overridden in the processing time!
  • documentClient = new AWS.DynamoDB.DocumentClient()
    • Optional
    • Set this option if there's a need to cutomise the AWS.DynamoDB.DocumentClient instantation
    • Used for DynamoDB operation

Example:

// Scans the entire Music table
dynamoDb({
  operation: "SCAN",
  dynamoDbParams: { TableName: "Music" }
});

// Queries Music table with a more complicated DynamoDB parameters
dynamoDb({
  operation: "QUERY",
  dynamoDbParams: {
    TableName: "Music",
    Limit: 1,
    ExpressionAttributeValues: {
      ":a": "Bar"
    },
    FilterExpression: "Artist = :a"
  }
});

s3(readerOptions)

Creates a reader for an array stored in s3.

  • path
    • Mandatory
    • The path to the array to be processed
    • Set to '.' if the object stored in s3 is the array
    • Set to a path if an object is stored in s3 and the array is a property of the object
      • lodash.get is used to retrieve the array
  • s3Params
    • Mandatory
    • This parameter is used when s3.getObject is called to retrieve the array stored in s3
  • s3 = new AWS.S3()
    • Optional
    • Set this option if there's a need to cutomise the AWS.S3 instantation
    • Used for S3 operation

Example:

// Reads an array from array.json in MyBucket
s3({
  path: ".",
  s3Params: {
    Bucket: "MyBucket",
    Key: "array.json"
  }
});

// Reads the array retrieved at database.music[0]["category"].list from object.json in MyBucket
s3({
  path: 'database.music[0]["category"].list',
  s3Params: {
    Bucket: "MyBucket",
    Key: "object.json"
  }
});

Package Sidebar

Install

npm i laconia-batch

Weekly Downloads

0

Version

0.4.0

License

Apache-2.0

Unpacked Size

14 kB

Total Files

10

Last publish

Collaborators

  • ceilfors