@activediscourse/podcast-parser

1.0.0 • Public • Published

podcast-parser

Parse XML podcast RSS feeds into standardized objects.

installation

yarn add @activediscourse/podcast-parser

usage

Pass a string containing XML source:

const parsePodcast = require("@activediscourse/podcast-parser")

parsePodcast("<podcast xml>")
  .then(feed => console.log(feed))
  .catch(e => console.error(e))

This library only handles parsing, so you'll need to fetch the feed separately first. For example, using node-fetch (or fetch in the browser):

const fetch = require("node-fetch")
const parsePodcast = require("@activediscourse/podcast-parser")

;(async () => {
  const response = await fetch("https://pinecast.com/feed/activediscourse")
  const xml = await response.text()
  const feed = await parsePodcast(xml)

  return feed
})()
  .then(feed => console.log(feed))
  .catch(e => console.error(e))

output format

The output is opinionated with the goal of normalizing results across feeds:

{
  "title": "<Podcast title>",
  "description": {
    "short": "<Podcast subtitle>",
    "long": "<Podcast description>"
  },
  "link": "<Podcast link (usually website for podcast)>",
  "image": "<Podcast image>",
  "language": "<ISO 639 language>",
  "copyright": "<Podcast copyright>",
  "updated": "<pubDate or latest episode pubDate>",
  "explicit": "<Podcast is explicit, true/false>",
  "categories": [
    "Category>Subcategory"
  ],
  "author": "<Author name>",
  "owner": {
    "name":  "<Owner name>",
    "email": "<Owner email>"
  },
  "episodes": [
    {
      "guid": "<Unique id>",
      "title": "<Episode title>",
      "subtitle": "<Episode subtitle>",
      "description": "<Episode description>",
      "rawDescription": "<Episode description stripped of HTML tags>",
      "explicit": "<Episode is explicit, true/false>",
      "image": "<Episode image>",
      "published": "<date>",
      "duration": 120,
      "categories": [
        "Category"
      ],
      "enclosure": {
        "filesize": 5650889,
        "type": "audio/mpeg",
        "url": "<mp3 file>"
      }
    }
  ]
}
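
As a rough illustration of consuming this structure, the sketch below collects what is needed to download or play each episode. The helper name listEpisodeAudio and the sample feed object are hypothetical and not part of the library; only the property names come from the format shown above.

```javascript
// Hypothetical helper (not part of the library): collect the audio details
// for each episode of a parsed feed object.
function listEpisodeAudio (feed) {
  // Default to an empty list in case the feed object has no `episodes`.
  return (feed.episodes || [])
    // Skip episodes without an enclosure, since they have no audio file.
    .filter(episode => episode.enclosure && episode.enclosure.url)
    .map(episode => ({
      title: episode.title,
      url: episode.enclosure.url,
      type: episode.enclosure.type,
      filesize: episode.enclosure.filesize
    }))
}

// Sample object shaped like the output above (all values are made up).
const sampleFeed = {
  title: "Example Show",
  episodes: [
    {
      guid: "ep-2",
      title: "Second episode",
      enclosure: {
        filesize: 5650889,
        type: "audio/mpeg",
        url: "https://example.com/ep2.mp3"
      }
    },
    { guid: "ep-1", title: "Episode without audio" }
  ]
}

console.log(listEpisodeAudio(sampleFeed))
```

In real use, the object passed to listEpisodeAudio would be the value that parsePodcast resolves with.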

notes

language

Many podcasts set the language to a bare code such as en. The parser makes a best-effort attempt to normalize language strings to an IETF language tag: for example, en is converted to en-us, and non-English languages are presented in a form such as de-DE.
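
Since the two examples above differ in letter case (en-us versus de-DE), calling code that compares language values may want to do so case-insensitively. A minimal sketch, assuming only that the language value is a hyphen-separated tag; both helper names are hypothetical and not part of the library:

```javascript
// Hypothetical helpers (not part of the library).

// Return the primary subtag of a language tag in lower case,
// or undefined when no usable language string is given.
function primaryLanguage (language) {
  if (typeof language !== "string" || language.length === 0) return undefined
  return language.split("-")[0].toLowerCase()
}

// Compare two language tags without regard to letter case.
function sameLanguage (a, b) {
  return typeof a === "string" && typeof b === "string" &&
    a.toLowerCase() === b.toLowerCase()
}

console.log(primaryLanguage("de-DE")) // prints "de"
console.log(sameLanguage("en-us", "en-US")) // prints true
```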

normalization

Feeds are not guaranteed to contain every property; any property missing from a feed is simply omitted from the output.

Episode categories are included as an empty array if the podcast isn't assigned any categories.

Episodes are sorted in descending order by publish date.
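
Taken together, these rules suggest two habits for calling code: guard against missing properties, and treat the first episode as the most recent one. A minimal sketch under those assumptions; describeFeed and the sample object are hypothetical, and the fallback values are arbitrary choices:

```javascript
// Hypothetical helper (not part of the library): build a small summary
// from a parsed feed while tolerating omitted properties.
function describeFeed (feed) {
  const episodes = feed.episodes || []
  // Episodes are sorted newest first, so index 0 is the latest one.
  const latest = episodes[0]
  const description = feed.description || {}

  return {
    title: feed.title || "(untitled)",
    // Prefer the short description and fall back to the long one.
    summary: description.short || description.long || "",
    ownerEmail: feed.owner ? feed.owner.email : undefined,
    latestTitle: latest ? latest.title : undefined,
    latestCategories: latest ? latest.categories || [] : []
  }
}

// Sample object with several properties omitted (values are made up).
const sparseFeed = {
  description: { long: "A show about examples." },
  episodes: [
    { title: "Newest", categories: [] },
    { title: "Older", categories: ["Technology"] }
  ]
}

console.log(describeFeed(sparseFeed))
```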

development

  1. Clone the repo: git clone https://github.com/activediscourse/podcast-parser.git
  2. Move into the new directory: cd podcast-parser
  3. Install dependencies: yarn
  4. Build the source: yarn build
  5. Run tests: yarn test

license

MIT © Bo Lingen / citycide

Based on node-podcast-parser, also MIT, © Antti Kupila.

