    @jackdbd/eleventy-plugin-text-to-speech
    1.1.0 • Public • Published

    Eleventy plugin that synthesizes any text you want, on any page of your Eleventy site, using the Google Cloud Text-to-Speech API. You can either self-host the audio assets this plugin generates, or host them on Cloud Storage.

    ⚠️ The Cloud Text-to-Speech API limits the input of each synthesis request to 5000 characters.
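
Texts longer than the limit have to be split before they can be synthesized. The sketch below shows one possible way to do that, assuming that splitting at sentence boundaries is acceptable for your content. `splitIntoChunks` and `MAX_CHARS` are hypothetical names and are not part of this plugin. The helper measures length with `String.length` (UTF-16 code units), which only approximates how the API counts input.

```javascript
// Hypothetical helper, NOT part of the plugin: split a long text into chunks
// that each stay within a per-request character limit.
const MAX_CHARS = 5000

function splitIntoChunks(text, maxChars = MAX_CHARS) {
  // Split after sentence-ending punctuation that is followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/)
  const chunks = []
  let current = ''

  for (const sentence of sentences) {
    // A single sentence longer than the limit is cut at fixed offsets.
    if (sentence.length > maxChars) {
      if (current) {
        chunks.push(current)
        current = ''
      }
      for (let i = 0; i < sentence.length; i += maxChars) {
        chunks.push(sentence.slice(i, i + maxChars))
      }
      continue
    }

    // Append the sentence to the current chunk if it still fits.
    const candidate = current ? `${current} ${sentence}` : sentence
    if (candidate.length > maxChars) {
      chunks.push(current)
      current = sentence
    } else {
      current = candidate
    }
  }

  if (current) chunks.push(current)
  return chunks
}
```

For example, `splitIntoChunks('One. Two. Three.', 10)` returns `['One. Two.', 'Three.']`.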

    Installation

    npm install --save-dev @jackdbd/eleventy-plugin-text-to-speech

    Preliminary Operations

    Enable the Text-to-Speech API

    Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:

    gcloud services enable texttospeech.googleapis.com

    Set up authentication via a service account

    This plugin uses the official Node.js client library for the Text-to-Speech API. In order to authenticate to any Google Cloud API you will need some kind of credentials. At the moment this plugin supports only authentication via a service account JSON key.

    First, create a service account that can use the Text-to-Speech API. You can also reuse an existing service account if you want. The service account itself is enough: you don't need to configure any IAM permissions.

    gcloud iam service-accounts create sa-text-to-speech-user \
      --display-name "Text-to-Speech user SA"

    Second, download the JSON key of this service account and store it somewhere safe. Do not track this file in git.

    Optional: Create a Cloud Storage bucket (only if you want to host audio files on Cloud Storage)

    Create a Cloud Storage bucket in your desired location. Enable uniform bucket-level access and use the nearline storage class.

    gsutil mb \
      -p $GCP_PROJECT_ID \
      -l $CLOUD_STORAGE_LOCATION \
      -c nearline \
      -b on \
      gs://bkt-eleventy-plugin-text-to-speech-audio-files

    If you want, you can check that uniform bucket-level access is enabled using this command:

    gsutil uniformbucketlevelaccess get \
      gs://bkt-eleventy-plugin-text-to-speech-audio-files

    Make the bucket's objects publicly readable (otherwise people will not be able to listen to or download the audio files):

    gsutil iam ch allUsers:objectViewer \
      gs://bkt-eleventy-plugin-text-to-speech-audio-files
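
Once objects are publicly readable, each one can be fetched over HTTPS at `https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME`. The sketch below builds such a URL. `publicObjectUrl` is a hypothetical helper and the object name is made up for illustration; the names this plugin actually gives to uploaded audio files are not assumed here.

```javascript
// Hypothetical helper, NOT part of the plugin: build the public URL of an
// object stored in a Cloud Storage bucket whose objects are publicly readable.
function publicObjectUrl(bucketName, objectName) {
  // Percent-encode each path segment, but keep the "/" separators as they are.
  const encodedName = objectName.split('/').map(encodeURIComponent).join('/')
  return new URL(`https://storage.googleapis.com/${bucketName}/${encodedName}`)
}

// Illustrative object name (an assumption, not a name the plugin generates).
const url = publicObjectUrl(
  'bkt-eleventy-plugin-text-to-speech-audio-files',
  'posts/hello world.mp3'
)
console.log(url.href)
```

Opening the resulting URL in a browser is a quick way to confirm that the `allUsers:objectViewer` binding took effect.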

    Usage

    Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the environment variable CF_PAGES_URL.

    Self-hosting the generated audio assets

    If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:

    const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')
    
    module.exports = function (eleventyConfig) {
      // some eleventy configuration...
    
      eleventyConfig.addPlugin(tts, {
        audioHost: process.env.CF_PAGES_URL
          ? new URL(`${process.env.CF_PAGES_URL}/assets/audio`)
          : new URL('http://localhost:8090/assets/audio')
      })
    
      // some more eleventy configuration...
    }

    Hosting the generated audio assets on Cloud Storage

    If you want to host the audio assets on a Cloud Storage bucket and configure the rules that select which texts to synthesize, you could register the plugin using something like this:

    const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')
    
    module.exports = function (eleventyConfig) {
      // some eleventy configuration...
    
      eleventyConfig.addPlugin(tts, {
        audioHost: {
          bucketName: 'some-bucket-containing-publicly-readable-files'
        },
        rules: [
          // synthesize the text contained in all <h1> tags, in all posts
          {
            regex: new RegExp('posts\\/.*\\.html$'),
            cssSelectors: ['h1']
          },
          // synthesize the text contained in all <p> tags that start with "Once upon a time", in all HTML pages, except the 404.html page
          {
            regex: new RegExp('^((?!404).)*\\.html$'),
            xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
          }
        ],
        voice: 'en-GB-Wavenet-C'
      })
    
      // some more eleventy configuration...
    }
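
To get a feel for which pages each rule selects, you can try the two regular expressions from the configuration above against a few paths. This is only a sketch: the paths are invented, and it assumes each `regex` is tested against a page's output path.

```javascript
// The two regular expressions used in the rules above.
const postsRegex = new RegExp('posts\\/.*\\.html$')
const not404Regex = new RegExp('^((?!404).)*\\.html$')

// Hypothetical output paths, for illustration only.
const paths = [
  '_site/posts/hello/index.html',
  '_site/about/index.html',
  '_site/404.html',
  '_site/posts/1404/index.html'
]

// For each path, record which of the two expressions matches it.
const matches = paths.map((p) => ({
  path: p,
  posts: postsRegex.test(p),
  not404: not404Regex.test(p)
}))

console.table(matches)
```

Note that the second expression rejects every path that contains `404` anywhere, not just `404.html`, so `_site/posts/1404/index.html` matches the first rule but not the second.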

    Multiple hosts

    If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:

    • self-host some audio assets, and host others on a Cloud Storage bucket
    • host all audio assets on Cloud Storage, but split them across two different buckets

    Have a look at the Eleventy configuration of the demo-site in this monorepo.
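
A minimal sketch of the first scenario might look like the following. It only uses options documented in this README. Giving each registration its own `collectionName` and `transformName` is an assumption made here to keep the two registrations from colliding, and the values are made up; check the demo-site configuration for the setup the author actually uses.

```javascript
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // Registration 1: self-host the audio generated from the <h1> of every post.
  eleventyConfig.addPlugin(tts, {
    audioHost: new URL('http://localhost:8090/assets/audio'),
    rules: [{ regex: new RegExp('posts\\/.*\\.html$'), cssSelectors: ['h1'] }],
    // Assumption: distinct names, so the registrations don't overwrite each other.
    collectionName: 'audio-items-self-hosted',
    transformName: 'inject-audio-tags-self-hosted'
  })

  // Registration 2: host the audio of matching paragraphs on Cloud Storage.
  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    // Assumption: distinct names, as above.
    collectionName: 'audio-items-cloud-storage',
    transformName: 'inject-audio-tags-cloud-storage'
  })
}
```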

    Configuration

    Required parameters

    | Parameter | Explanation |
    | --- | --- |
    | audioHost | Each audio host should have a matching writer responsible for writing/uploading the assets to the host. |

    Options

    | Option | Default | Explanation |
    | --- | --- | --- |
    | audioEncodings | ['OGG_OPUS', 'MP3'] | List of audio encodings to use when generating audio assets from text matches. |
    | audioInnerHTML | see in src/dom.ts | Function to use to generate the innerHTML of the <audio> tag to inject in the page for each text match. |
    | cacheExpiration | 365d | Expiration for the 11ty AssetCache. See here. |
    | collectionName | audio-items | Name of the 11ty collection created by this plugin. |
    | keyFilename | process.env.GOOGLE_APPLICATION_CREDENTIALS | Credentials for the Cloud Text-to-Speech API (and for the Cloud Storage API if you don't set keyFilename in audioHost). |
    | rules | see in src/constants.ts | Rules that determine which texts to convert into speech. |
    | transformName | inject-audio-tags-into-html | Name of the 11ty transform created by this plugin. |
    | voice | en-US-Standard-J | Voice to use when generating audio assets from text matches. The Text-to-Speech API supports these voices, and might have different pricing for different voices. |

    ⚠️ Don't forget to set either keyFilename or the GOOGLE_APPLICATION_CREDENTIALS environment variable on your build server.
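
If you would rather have the build fail early with a clear message when credentials are missing, you could add a small check to your Eleventy configuration before registering the plugin. The sketch below is one way to do it; `resolveKeyFilename` is a hypothetical helper, not something this plugin exports, and it only mirrors the documented default for `keyFilename`.

```javascript
// Hypothetical pre-flight check, NOT part of the plugin: return the path to the
// service account JSON key, or throw if it is configured nowhere.
function resolveKeyFilename(options = {}, env = process.env) {
  // Same fallback as the documented default of the keyFilename option.
  const keyFilename = options.keyFilename || env.GOOGLE_APPLICATION_CREDENTIALS
  if (!keyFilename) {
    throw new Error(
      'Set either the keyFilename option or the GOOGLE_APPLICATION_CREDENTIALS environment variable.'
    )
  }
  return keyFilename
}
```

You could then pass the result as `keyFilename` in the plugin options, so that a misconfigured build server fails before any text is sent to the API.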

    Debug

    This plugin uses the debug library for logging. You can control what's logged using the DEBUG environment variable. For example, if you set your environment variables in a .envrc file, you could do:

    # print all logging statements
    export DEBUG=eleventy-plugin-text-to-speech/*
    
    # print just the logging statements from the dom module and the writers module
    export DEBUG=eleventy-plugin-text-to-speech/dom,eleventy-plugin-text-to-speech/writers
    
    # print all logging statements, except the ones from the dom module and the transforms module
    export DEBUG=eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms

    Credits

    I got the idea for this plugin while reading the code of the plugin of the same name, eleventy-plugin-text-to-speech by Larry Hudson. There are a few differences between the two plugins. The main one is that this plugin uses the Google Cloud Text-to-Speech API, while Larry's plugin uses the Microsoft Azure Speech SDK.

    Version

    1.1.0

    License

    MIT

    Collaborators

    • jackdbd