wikipedia-scrapelib

1.0.1 • Public • Published

Wikipedia Scrapelib

wikipedia-scrapelib is a Node.js library for scraping Wikipedia pages. It provides methods to fetch a page's content, along with other features for accessing individual elements of the page. It works with pages hosted on wikipedia.org.

Installation

$ npm install wikipedia-scrapelib

Usage

Basic Usage

const wikipedia = require('wikipedia-scrapelib');
const wiki = new wikipedia();

// page() is awaited, so call it from inside an async function.
async function anything() {
    console.log(await wiki.page("rhombicosidodecahedron"));
}

anything();
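Network requests can fail, so it may help to wrap the call in a try/catch. The following is a minimal sketch, not part of the library: safePage is a hypothetical helper, and it assumes that page() rejects its promise on failure, which this README does not document.

```javascript
// Hypothetical helper, not part of wikipedia-scrapelib.
// `wiki` is any object exposing an async page(title) method,
// such as an instance created with `new wikipedia()`.
async function safePage(wiki, title) {
    try {
        return await wiki.page(title);
    } catch (err) {
        // Assumption: failures surface as a rejected promise.
        console.error(`Could not fetch "${title}":`, err.message);
        return null;
    }
}

// Usage (needs the installed package and network access):
// const wikipedia = require('wikipedia-scrapelib');
// safePage(new wikipedia(), 'rhombicosidodecahedron').then(console.log);
```

Passing the instance in as an argument keeps the helper easy to try out with a stub object before pointing it at the real library.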

Change Language

By default, pages are fetched from en.wikipedia.org. To use a different language edition, set its language code with setLang before requesting a page.

async function anything() {
    wiki.setLang('fr') // the language code of your language.
    console.log(await wiki.page("rhombicosidodecahedron"))
}

Note: the language code must be a valid ISO language code, such as fr for French.
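One way to catch obvious typos before calling setLang is a simple shape check. This is a sketch under stated assumptions: looksLikeLangCode and setLangChecked are hypothetical helpers, not library functions, and the check only tests for two or three lowercase letters. It does not confirm that a Wikipedia edition exists for the code, and it rejects longer subdomains such as simple.

```javascript
// Hypothetical helpers, not part of wikipedia-scrapelib.
// Accepts two- or three-letter lowercase codes such as "fr" or "de".
function looksLikeLangCode(code) {
    return typeof code === 'string' && /^[a-z]{2,3}$/.test(code);
}

// Calls wiki.setLang(code) only when the code passes the shape check.
function setLangChecked(wiki, code) {
    if (!looksLikeLangCode(code)) {
        throw new TypeError(`Unexpected language code: ${String(code)}`);
    }
    wiki.setLang(code);
}

// Usage:
// setLangChecked(wiki, 'fr');
```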

Disable Type

Use the disableType or disableTypes method to exclude one or more content types from the generated page.

Supported types: header, paragraph, list, and others.

disableType Method

Use this method to exclude a single content type from the generated page content.

Example:

async function anything() {
    wiki.disableType("list")
    console.log(await wiki.page("rhombicosidodecahedron"))
}

disableTypes Method

Use this method to exclude several content types from the generated page content.

Note: the argument must be an array of supported types.

Example:

async function anything() {
    wiki.disableTypes(["header", "list"])
    console.log(await wiki.page("rhombicosidodecahedron"))
}

⚠️ Warning: disableType and disableTypes cannot be used together. Pick one of them.
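Because the two methods cannot be combined, one option is a small wrapper that picks exactly one of them based on how many types are passed. The sketch below is hypothetical (applyDisabledTypes is not part of the library) and assumes that both methods are synchronous, as the examples above suggest.

```javascript
// Hypothetical helper, not part of wikipedia-scrapelib.
// Accepts a single type or an array of types and calls only one method.
function applyDisabledTypes(wiki, types) {
    const list = Array.isArray(types) ? types : [types];
    if (list.length === 1) {
        wiki.disableType(list[0]);
    } else if (list.length > 1) {
        wiki.disableTypes(list);
    }
    // An empty array leaves the instance unchanged.
}

// Usage:
// applyDisabledTypes(wiki, 'list');
// applyDisabledTypes(wiki, ['header', 'list']);
```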

Result

The result is returned as an object of the following form:

{
    title,
    page,
    links    
}
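The three fields can be read with destructuring. This README does not say what types page and links hold, so the sketch below reports their types instead of assuming them. describePage is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper, not part of wikipedia-scrapelib.
// Fetches a page and reports what kind of values the result holds.
async function describePage(wiki, name) {
    const { title, page, links } = await wiki.page(name);
    return {
        title,
        // The types of `page` and `links` are undocumented, so inspect them.
        pageType: Array.isArray(page) ? 'array' : typeof page,
        linksType: Array.isArray(links) ? 'array' : typeof links,
    };
}

// Usage (needs the installed package and network access):
// describePage(wiki, 'rhombicosidodecahedron').then(console.log);
```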

License

MIT

Collaborators

  • kevinyustinus