

1.0.6 • Public • Published

Web page Content Extractor API (wce-api)

REST API over the Web page Content Extractor (wce) node module.

Currently works with the following extractors:

  1. Parser API
  2. read-art
  3. node-readability
  4. node-unfluff
  5. wce-proxy

For detailed information, please check the Web page Content Extractor module's GitHub page.

Usage example

git clone wce-api
node wce-api/index.js

Docker usage example

Build the image on your local machine:

git clone wce-api
cd wce-api/docker
docker build -t mxr576/wce-api .

or pull the pre-built image from Docker Hub:

docker pull mxr576/wce-api

then start a new container:

docker run -id -p 8001:8001 --name wce-api -t mxr576/wce-api

About the settings

The extractor listens on port 8001 by default. You can test it via

The default extractor is read-art. You can change this in the config/default.json file, or override it with environment-specific settings, for example in conf/development.json. You can specify multiple extractors in the config file. Their order matters: the first one is the primary extractor and the second one is its fallback, used when the first cannot extract the content of a URL.
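
The primary/fallback ordering described above can be pictured with a minimal sketch. This is not the code wce-api actually uses: the function name extractWithFallback and the shape of the extractor objects (a name plus an async extract(url) method) are assumptions made only for illustration.

```javascript
// Hypothetical sketch of ordered extractor fallback.
// `extractors` is an array of objects shaped like
// { name: string, extract: async (url) => string }, which is an assumed
// interface for this example, not the real wce API.
async function extractWithFallback(url, extractors) {
  const errors = [];
  for (const extractor of extractors) {
    try {
      // The first extractor in the list is tried first (the primary one).
      const content = await extractor.extract(url);
      if (content) {
        return { extractor: extractor.name, content };
      }
      errors.push(`${extractor.name}: empty result`);
    } catch (err) {
      // A failure moves on to the next extractor in the list (the fallback).
      errors.push(`${extractor.name}: ${err.message}`);
    }
  }
  throw new Error(`All extractors failed for ${url}: ${errors.join('; ')}`);
}
```

With a list such as [primary, fallback], the fallback is only called when the primary throws or returns nothing, which is why the order in the config file matters.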

If you would like to use the Parser API, you have to set up your access token in the config file beforehand. You can claim your Parser key here.


Apache Licence 2.0


npm i wce-api
