    Serverless Glue

    This is a plugin for the Serverless Framework that lets you deploy AWS Glue jobs.


    1. Run npm install --save-dev serverless-glue
    2. Add serverless-glue to the plugins section of serverless.yml:
        plugins:
          - serverless-glue

    How it works

    Before Serverless runs the deployment, the plugin creates CloudFormation resources from your configuration and adds them to the Serverless template.

    As a result, every Glue job deployed with this plugin is part of your stack.
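
    As an illustration of the kind of resource that ends up in the stack, here is a rough sketch of an AWS::Glue::Job definition for a Spark job named super-glue-job. It is not the plugin's actual output: the logical ID SuperGlueJob and the script file name script.py are hypothetical, and the exact set of properties the plugin emits may differ.

    ```yaml
    Resources:
      SuperGlueJob:                    # hypothetical logical ID
        Type: AWS::Glue::Job
        Properties:
          Name: super-glue-job
          Role: arn:aws:iam::000000000:role/someRole
          GlueVersion: "2.0"
          Command:
            Name: glueetl              # Glue command name for Spark jobs
            PythonVersion: "3"
            # hypothetical object key: bucketDeploy + default s3Prefix + script name
            ScriptLocation: s3://someBucket/glueJobs/script.py
          ExecutionProperty:
            MaxConcurrentRuns: 3
          WorkerType: Standard
          NumberOfWorkers: 1
    ```

    Because the job is an ordinary CloudFormation resource, it is created, updated, and removed together with the rest of the stack.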

    How to configure your Glue jobs

    Configure your Glue jobs in the custom section like this:

        custom:
          Glue:
            bucketDeploy: someBucket # Required
            s3Prefix: some/s3/key/location/ # Optional, default = 'glueJobs/'
            tempDirBucket: someBucket # Optional, default = '{serverless.serviceName}-{provider.stage}-gluejobstemp'
            tempDirS3Prefix: some/s3/key/location/ # Optional, default = ''. The job name is appended to the prefix
            jobs:
              - job:
                  name: super-glue-job # Required
                  script: src/glueJobs/ # Required. The script is uploaded to the s3Prefix location under the name that follows the final '/'
                  tempDir: true # Optional, true | false
                  type: spark # Required, spark | pythonshell
                  glueVersion: python3-2.0 # Required, python3-1.0 | python3-2.0 | python2-1.0 | python2-0.9 | scala2-1.0 | scala2-0.9 | scala2-2.0
                  role: arn:aws:iam::000000000:role/someRole # Required
                  MaxConcurrentRuns: 3 # Optional
                  WorkerType: Standard # Optional, Standard | G.1X | G.2X
                  NumberOfWorkers: 1 # Optional
            triggers:
              - trigger:
                  name: some-trigger-name # Required
                  schedule: 30 12 * * ? * # Optional, CRON expression. Without a schedule, the trigger is created with the On-Demand type.
                  jobs: # Required. One or more jobs to trigger
                    - job:
                        name: super-glue-job # Required
                        args: # Optional
                          --arg1: value1
                          --arg2: value2
                        timeout: 30 # Optional
                    - job:
                        name: another-glue-job

    You can define as many jobs as you need:

        custom:
          Glue:
            bucketDeploy: someBucket
            jobs:
              - job: # ...
              - job: # ...

    And as many triggers as you need:

        custom:
          Glue:
            triggers:
              - trigger: # ...
              - trigger: # ...

    Glue configuration parameters

    | Parameter | Type | Description | Required |
    |---|---|---|---|
    | bucketDeploy | String | S3 bucket name | true |
    | s3Prefix | String | S3 prefix name | false |
    | tempDirBucket | String | S3 bucket name for the Glue temporary directory. If omitted, the bucket name is generated with the pattern {serverless.serviceName}-{provider.stage}-gluejobstemp | false |
    | tempDirS3Prefix | String | S3 prefix name for the Glue temporary directory | false |
    | jobs | Array | Array of Glue jobs to deploy | true |
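
    To make the defaults concrete, the following sketch leaves out every optional top-level setting. The service name my-service, the stage dev, and the script path are hypothetical, and the custom -> Glue nesting is assumed from the "custom section" wording above. The comments show what each default would resolve to under the patterns listed in the table.

    ```yaml
    service: my-service                 # hypothetical service name
    provider:
      name: aws
      stage: dev                        # hypothetical stage
    custom:
      Glue:                             # assumed nesting: custom -> Glue
        bucketDeploy: someBucket        # required
        # s3Prefix omitted        -> default 'glueJobs/'
        # tempDirBucket omitted   -> my-service-dev-gluejobstemp
        # tempDirS3Prefix omitted -> default '' (the job name is appended)
        jobs:
          - job:
              name: super-glue-job
              script: src/glueJobs/job.py   # hypothetical script path
              tempDir: true
              type: spark
              glueVersion: python3-2.0
              role: arn:aws:iam::000000000:role/someRole
    ```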

    Job configuration parameters

    | Parameter | Type | Description | Required |
    |---|---|---|---|
    | name | String | Name of the job | true |
    | script | String | Script path in the project | true |
    | tempDir | Boolean | Whether the job requires a temporary folder. If true, the plugin creates a bucket for temporary files | false |
    | type | String | Type of the job. Allowed values: spark, pythonshell | true |
    | glueVersion | String | Language and Glue version to use, in the form [language][version]-[glue version]. Allowed values: python3-1.0, python3-2.0, python2-1.0, python2-0.9, scala2-1.0, scala2-0.9, scala2-2.0 | true |
    | role | String | ARN of the role used to execute the job | true |
    | MaxConcurrentRuns | Double | Maximum number of concurrent runs of the job | false |
    | WorkerType | String | The type of predefined worker that is allocated when a job runs. Accepts a value of Standard, G.1X, or G.2X | false |
    | NumberOfWorkers | Integer | Number of workers | false |
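
    As a complement to the Spark example, here is a minimal sketch of a Python shell job entry that sets only the required parameters plus tempDir. The job name and script path are hypothetical, and the pairing of pythonshell with python3-1.0 is an assumption based on the allowed values listed above.

    ```yaml
    - job:
        name: small-python-job              # hypothetical job name
        script: src/glueJobs/small_job.py   # hypothetical script path
        type: pythonshell
        glueVersion: python3-1.0            # assumed to be a valid pairing for pythonshell
        role: arn:aws:iam::000000000:role/someRole
        tempDir: false                      # optional; this job needs no temporary folder
        # MaxConcurrentRuns, WorkerType and NumberOfWorkers are optional and omitted here
    ```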

    Trigger configuration parameters

    | Parameter | Type | Description | Required |
    |---|---|---|---|
    | name | String | Name of the trigger | true |
    | schedule | String | CRON expression | false |
    | jobs | Array | Array of jobs to trigger | true |

    Only On-Demand and Scheduled triggers are supported.
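
    The two supported trigger types differ only in whether schedule is present. The sketch below shows one of each; the trigger names are hypothetical, and the list is assumed to live under a triggers key next to jobs. The expression uses the six-field AWS cron layout (minutes, hours, day of month, month, day of week, year).

    ```yaml
    triggers:                          # assumed key name for the trigger list
      - trigger:
          name: nightly-trigger        # hypothetical; Scheduled because schedule is set
          schedule: 0 2 * * ? *        # every day at 02:00 UTC
          jobs:
            - job:
                name: super-glue-job
      - trigger:
          name: manual-trigger         # hypothetical; On-Demand because schedule is omitted
          jobs:
            - job:
                name: super-glue-job
    ```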

    Trigger job configuration parameters

    | Parameter | Type | Description | Required |
    |---|---|---|---|
    | name | String | The name of the Glue job to trigger | true |
    | timeout | Integer | Job execution timeout | false |
    | args | Map | Job arguments | false |
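
    For orientation, these per-job settings correspond naturally to the Actions of a CloudFormation AWS::Glue::Trigger. The fragment below is a sketch of such a resource for the some-trigger-name example, not the plugin's verified output: the logical ID is hypothetical, and wrapping the schedule in cron(...) is an assumption about how the plugin passes it on.

    ```yaml
    Resources:
      SomeTrigger:                     # hypothetical logical ID
        Type: AWS::Glue::Trigger
        Properties:
          Name: some-trigger-name
          Type: SCHEDULED              # would be ON_DEMAND when no schedule is given
          Schedule: cron(30 12 * * ? *)
          Actions:
            - JobName: super-glue-job
              Arguments:               # from args
                --arg1: value1
                --arg2: value2
              Timeout: 30              # from timeout; CloudFormation expresses this in minutes
            - JobName: another-glue-job
    ```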

    And now?

    Just run serverless deploy.


