@rowanmanning/feed-parser
A well-tested and resilient Node.js parser for RSS and Atom feeds.
Introduction
This is a Node.js-based feed parser for RSS and Atom feeds. The project has the following aims:
- Run automated tests against real-world feeds. It's currently tested against ~40 feeds via Sample Feeds. This ensures that we support real feeds rather than just the specifications.
- Related to the point above, be as lenient as possible with feed parsing.
- Keep up to date with the latest Node.js versions, including dropping support for end-of-life versions.
- Maintain compatibility with the great parts of node-feedparser, e.g. resolving relative URLs.
Requirements
This library requires the following to run:
- Node.js 18+
Usage
Install with npm:
npm install @rowanmanning/feed-parser
Load the library into your code with a `require` call:
const parseFeed = require('@rowanmanning/feed-parser');
You can use the `parseFeed` function to parse an RSS or Atom feed supplied as a string. The return value is an object representation of the feed:
const feed = parseFeed('<channel> etc. </channel>');
console.log(feed.title);
This will try to parse even invalid feeds, but if no data can be extracted, an error is thrown with a `code` property set to `INVALID_FEED`.
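The error behaviour described above can be handled with a small wrapper. This is a minimal sketch, not part of the library: `tryParseFeed` is a hypothetical helper name, and the parser is passed in as an argument so the helper does not depend on how the library is loaded. It relies only on the documented `code` value of `INVALID_FEED`.

```javascript
// Hypothetical helper (not part of the library).
// `parse` is expected to be the parseFeed function loaded as shown above.
function tryParseFeed(parse, xml) {
	try {
		return parse(xml);
	} catch (error) {
		// Documented behaviour: feeds with no extractable data throw
		// an error whose `code` property is 'INVALID_FEED'.
		if (error && error.code === 'INVALID_FEED') {
			return null;
		}
		// Any other error is unexpected, so let it propagate.
		throw error;
	}
}
```

With the library loaded, `tryParseFeed(parseFeed, xmlString)` returns the parsed feed, or `null` when the input could not be parsed at all.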
Parsed feed
The `feed` object returned by `parseFeed` has the following properties.
Feed
Represents an RSS or Atom feed.
Property | Type | Notes |
---|---|---|
authors | FeedAuthor[] | The feed authors. Always an array but sometimes empty if no authors are found. |
categories | FeedCategory[] | The feed categories. Always an array but sometimes empty if no categories are found. |
copyright | string \| null | The feed's copyright notice. |
description | string \| null | A short description of the feed. |
generator | FeedGenerator \| null | The software that generated the feed. |
image | FeedImage \| null | An image representing the feed. |
items | FeedItem[] | The content items in the feed. Always an array but sometimes empty if no items are found. |
language | string \| null | The language the feed is written in. |
meta | FeedMeta | Meta information about the format of the feed. |
published | Date \| null | The date the feed was last published. |
self | string \| null | A URL pointing to the feed itself. |
title | string \| null | The name of the feed. |
updated | Date \| null | The date the feed was last updated. |
url | string \| null | A URL pointing to the HTML web page that this feed is for. |
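Because most `Feed` properties can be `null`, code that reads them needs fallbacks. The sketch below is illustrative only: `summarizeFeed` is a hypothetical helper, and it assumes an object with the shape documented in the table above.

```javascript
// Hypothetical helper (not part of the library).
// `feed` is assumed to match the Feed shape documented above.
function summarizeFeed(feed) {
	// `title` may be null, so fall back to a placeholder.
	const title = feed.title ?? '(untitled feed)';
	// `updated` and `published` are Date objects or null.
	const lastChange = feed.updated ?? feed.published;
	const when = lastChange ? lastChange.toISOString() : 'unknown date';
	// `items` is always an array, so `.length` needs no null check.
	return `${title}: ${feed.items.length} item(s), last changed ${when}`;
}
```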
FeedAuthor
Represents the author of a `Feed` or `FeedItem`.
Property | Type | Notes |
---|---|---|
email | string \| null | The author's email address. |
name | string \| null | The author's name. |
url | string \| null | A URL pointing to a representation of the author on the internet. |
FeedCategory
Represents the content category of a `Feed` or `FeedItem`.
Property | Type | Notes |
---|---|---|
label | string \| null | The category display label. |
term | string \| null | The category identifier. Often the same as the `label`. |
url | string \| null | A URL pointing to a representation of the category on the internet. |
FeedGenerator
Represents software that generated a `Feed`.
Property | Type | Notes |
---|---|---|
label | string \| null | The name of the software that generated the feed. |
url | string \| null | A URL pointing to further information about the generator. |
version | string \| null | The version of the software used to generate the feed. |
FeedImage
Represents an image for a `Feed`.
Property | Type | Notes |
---|---|---|
title | string \| null | The alternative text of the image. |
url | string | The image URL. |
FeedItem
Represents an RSS item or Atom entry in a `Feed`.
Property | Type | Notes |
---|---|---|
authors | FeedAuthor[] | The feed item authors. Always an array but sometimes empty if no authors are found. |
categories | FeedCategory[] | The feed item categories. Always an array but sometimes empty if no categories are found. |
content | string \| null | The feed item content. |
description | string \| null | A short description of the feed item. |
id | string \| null | A unique identifier for the feed item. |
image | FeedImage \| null | An image representing the feed item. |
media | FeedItemMedia[] | Media associated with the feed item. Always an array but sometimes empty if no media is found. |
published | Date \| null | The date the feed item was last published. |
title | string \| null | The title of the feed item. |
updated | Date \| null | The date the feed item was last updated. |
url | string \| null | A URL pointing to the HTML web page that this feed item represents. |
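Since both `published` and `updated` may be `null` on a `FeedItem`, ordering items by date needs a fallback. The following is a sketch under that assumption: `sortItemsNewestFirst` is a hypothetical helper, and placing undated items last is a design choice made here, not library behaviour.

```javascript
// Hypothetical helper (not part of the library).
// `items` is assumed to be an array of FeedItem objects as documented above.
function sortItemsNewestFirst(items) {
	// Prefer the publication date, then the update date.
	// Items with neither date get -Infinity so they sort last.
	const timestamp = (item) => {
		const date = item.published ?? item.updated;
		return date ? date.getTime() : -Infinity;
	};
	// Copy the array so the parsed feed is not mutated.
	return [...items].sort((a, b) => {
		const ta = timestamp(a);
		const tb = timestamp(b);
		if (ta === tb) {
			return 0;
		}
		return tb > ta ? 1 : -1;
	});
}
```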
FeedItemMedia
Represents a piece of media attached to a `FeedItem`.
Property | Type | Notes |
---|---|---|
image | string \| null | A URL pointing to an image representation of the media, e.g. a video cover image. |
length | number \| null | The length of the media in bytes. |
mimetype | string \| null | The full mime type of the media (e.g. `image/jpeg`). |
title | string \| null | The title of the media. |
type | string \| null | The type of the media (the first part of the mime type, e.g. `audio` or `image`). |
url | string | A URL pointing to the media. |
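The `type` property makes it straightforward to pick out one kind of media, for example the audio files of a podcast episode. This is an illustrative sketch: `summarizeAudio` is a hypothetical helper, and it assumes the `FeedItemMedia` shape documented above.

```javascript
// Hypothetical helper (not part of the library).
// `item` is assumed to be a FeedItem whose `media` array holds FeedItemMedia objects.
function summarizeAudio(item) {
	// `type` is the first part of the mime type, e.g. 'audio'.
	const audio = item.media.filter((media) => media.type === 'audio');
	// `length` may be null, so count unknown sizes as 0 bytes.
	const totalBytes = audio.reduce((sum, media) => sum + (media.length ?? 0), 0);
	return { urls: audio.map((media) => media.url), totalBytes };
}
```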
FeedMeta
Represents meta information about a `Feed`.
Property | Type | Notes |
---|---|---|
type | "atom" \| "rss" | The name of the type of feed. |
version | "0.3" \| "0.9" \| "1.0" \| "2.0" | The version of the type of feed. |
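The two `FeedMeta` properties can be combined into a readable format label. A minimal sketch, assuming only the values listed in the table above; `describeFormat` is a hypothetical helper name.

```javascript
// Hypothetical helper (not part of the library).
// `meta` is assumed to match the FeedMeta shape documented above.
function describeFormat(meta) {
	// `type` is documented as either 'atom' or 'rss'.
	const name = meta.type === 'atom' ? 'Atom' : 'RSS';
	return `${name} ${meta.version}`;
}
```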
Supported feed formats
Standards
Feeds that adhere to the following standards are supported and most properties will be parsed:
- Atom 1.0
- Atom 0.3 (no spec available but an example is here)
- RSS 2.0
- RSS 0.9
- RDF Site Summary 1.0
The following XML namespaces are also parsed, and more data will be parsed out for RSS feeds that implement these:
- DublinCore (e.g. `dc:creator`)
- iTunes Podcast RSS feed (e.g. `itunes:author`)
Leniency
Feeds in the real world rarely comply strictly with the standards and can sometimes be invalid XML. We try to be as lenient as possible, only throwing errors if no data can be pulled out of the feed. We test against a suite of real-world feeds.
Contributing
The contributing guide is available here. All contributors must follow this library's code of conduct.
License
Licensed under the MIT license.
Copyright © 2022, Rowan Manning
Credit
This library takes inspiration from the following:
- Feedparser from Dan MacTough, which I've been using for years.