mwoffliner

1.10.10 • Public • Published

MWoffliner

MWoffliner is a tool for making a local offline HTML snapshot of any online MediaWiki instance. It goes through all articles (or a selection, if specified) and creates the corresponding ZIM file in a local directory. It has mainly been tested against Wikimedia projects such as Wikipedia and Wiktionary, but it should also work with any recent MediaWiki.

Read CONTRIBUTING.md to learn more about MWoffliner development.

Prerequisites

  • *NIX Operating System (GNU/Linux, macOS, ...)
  • NodeJS
  • Redis
  • libzim (on Linux, binaries are downloaded automatically)
  • Various build tools that are probably already installed on your machine (libjpeg, gcc)

See Environment setup hints for how to install them.

Usage

To install MWoffliner globally:

npm i -g mwoffliner

You might need to run this command with sudo, depending on how your npm is configured.

Then to run it:

mwoffliner --help

To use MWoffliner with an S3 cache, provide an S3 URL like this:

--optimisationCacheUrl="https://wasabisys.com/?bucketName=my-bucket&keyId=my-key-id&secretAccessKey=my-sac"
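If the bucket name or keys come from configuration rather than being typed by hand, the URL can be assembled programmatically. The following is a minimal sketch, not part of MWoffliner: buildCacheUrl is a hypothetical helper, and only the three query parameter names are taken from the example above. It uses the standard URL API so that special characters in a key are percent-encoded.

```javascript
// Hypothetical helper (not part of MWoffliner): builds the value passed to
// --optimisationCacheUrl. The query parameter names bucketName, keyId and
// secretAccessKey are the ones shown in the example above.
function buildCacheUrl(endpoint, bucketName, keyId, secretAccessKey) {
    const url = new URL(endpoint);
    // searchParams.set percent-encodes characters such as "+" or "/"
    url.searchParams.set("bucketName", bucketName);
    url.searchParams.set("keyId", keyId);
    url.searchParams.set("secretAccessKey", secretAccessKey);
    return url.toString();
}

const cacheUrl = buildCacheUrl("https://wasabisys.com/", "my-bucket", "my-key-id", "my-sac");
console.log(`--optimisationCacheUrl="${cacheUrl}"`);
```

With the placeholder values above, this reproduces the URL from the example.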

API

MWoffliner also provides an API, so it can be used as a NodeJS library. Here is a stub example:

const mwoffliner = require('mwoffliner');
const parameters = {
    mwUrl: "https://es.wikipedia.org",
    adminEmail: "foo@bar.net",
    verbose: true,
    format: "nopic",
    articleList: "./articleList"
};
mwoffliner.execute(parameters); // returns a Promise
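Because execute returns a Promise, a caller will usually want to handle rejection explicitly. The sketch below shows one way to do that. runSnapshot is a hypothetical wrapper, not part of the MWoffliner API, and what the Promise resolves to is not specified here, so the value is passed through untouched. The library object is supplied as an argument so the wrapper can be tried with a stub.

```javascript
// Hypothetical wrapper (not part of MWoffliner). `offliner` is assumed to be
// the object returned by require('mwoffliner'); only its execute() method,
// shown in the stub example above, is used.
async function runSnapshot(offliner, parameters) {
    try {
        // The resolved value is passed through as-is; its shape is not assumed.
        const result = await offliner.execute(parameters);
        return { ok: true, result };
    } catch (error) {
        // A rejected Promise is turned into a value the caller can inspect.
        return { ok: false, error };
    }
}

// Example with a stand-in object instead of the real library:
const stub = { execute: async (p) => `would scrape ${p.mwUrl}` };
runSnapshot(stub, { mwUrl: "https://es.wikipedia.org" })
    .then((outcome) => console.log(outcome.ok, outcome.result));
```

In real use, the first argument would be require('mwoffliner') and the second the parameters object from the stub example.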

Background

Complementary information about MWoffliner:

  • MediaWiki software is used by tens of thousands of wikis, the most famous being the Wikimedia ones, including Wikipedia.
  • MediaWiki is a PHP wiki runtime engine.
  • Wikitext is the name of the markup language that MediaWiki uses.
  • MediaWiki includes a parser that converts Wikitext into HTML; this parser creates the HTML pages displayed in your browser.
  • There is another Wikitext parser, called Parsoid, implemented in JavaScript/NodeJS. MWoffliner uses Parsoid.
  • Parsoid is planned to eventually become the main parser for MediaWiki.
  • MWoffliner calls Parsoid and then post-processes the results for the offline format.

Environment setup hints

macOS

Install NodeJS:

curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.11/install.sh | bash && \
source ~/.bashrc && \
nvm install stable && \
node --version

Install Redis:

brew install redis

Install libzim: Read these instructions

GNU/Linux - Debian based distributions

Install NodeJS:

curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.11/install.sh | bash && \
source ~/.bashrc && \
nvm install stable && \
node --version

Install Redis:

sudo apt-get install redis-server

Releasing

  1. Update the version in package.json
  2. Commit :package: Release version vX.X.X
  3. Run git tag vX.X.X
  4. Run git push origin master --tags
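To keep the commit message and tag consistent with the version number, the commands for steps 2 to 4 can be generated from a single version string. This is a minimal sketch: releaseSteps is a hypothetical helper, and the use of git commit -am is an assumption, since the steps above only say "Commit".

```javascript
// Hypothetical helper (not part of MWoffliner): derives the release commands
// listed above from one version string such as "1.10.10".
function releaseSteps(version) {
    if (!/^\d+\.\d+\.\d+$/.test(version)) {
        throw new Error(`Expected a version like X.X.X, got "${version}"`);
    }
    const tag = `v${version}`;
    return [
        // Assumption: -am stages the modified package.json and commits it.
        `git commit -am ":package: Release version ${tag}"`,
        `git tag ${tag}`,
        "git push origin master --tags",
    ];
}

console.log(releaseSteps("1.10.10").join("\n"));
```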

License

GPLv3 or later, see LICENSE for more details.
