Converts MediaWiki collection bundles (as generated by mw-ocg-bundler) to beautiful PDFs (via XeLaTeX).



Node versions 0.8 and 0.10 are tested to work.
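
As a quick sanity check before installing, the version string printed by node --version can be compared against the two tested release lines. This is a minimal sketch; the function name is_tested_node_version is hypothetical and not part of this package.

```shell
# Hypothetical helper (not shipped with this package): succeed only if
# the given version string belongs to the tested 0.8 or 0.10 lines.
is_tested_node_version() {
  case "$1" in
    v0.8.*|v0.10.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example: is_tested_node_version "$(node --version)" || echo "untested Node version" >&2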

Install the node package dependencies.

npm install

You will need a C/C++ compiler installed in order to build the sqlite3 and icu-bidi packages (e.g., apt-get install g++).

Install other system dependencies.

apt-get install texlive-xetex texlive-latex-recommended \
                texlive-latex-extra texlive-generic-extra \
                texlive-fonts-recommended texlive-fonts-extra \
                fonts-hosny-amiri fonts-farsiweb fonts-nafees \
                fonts-arphic-uming fonts-arphic-ukai fonts-droid fonts-baekmuk \
                texlive-lang-all latex-xcolor \
                poppler-utils imagemagick librsvg2-bin libjpeg-progs \
                djvulibre-bin unzip

Note that up-to-date LaTeX hyperref and fontspec packages are required. If your LaTeX installation is old, you can find recent versions of some of the necessary packages in texdeps/, but it's best to use an up-to-date TeX Live distribution.

If you prefer, the inkscape package can be installed to do SVG->PDF conversion in place of rsvg-convert (from the librsvg2-bin package).

In older versions of Ubuntu, the Nazli font was provided by the ttf-farsiweb package instead of fonts-farsiweb.

In Ubuntu 12.04, the lmodern package must also be installed manually.
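
After installing, it can help to confirm that the command-line tools are actually on your PATH. The sketch below is an assumption-laden convenience, not part of this package: missing_tools is a hypothetical helper, and the tool names in the usage line are the binaries those Debian/Ubuntu packages normally provide, not a confirmed list of what the renderer invokes.

```shell
# Hypothetical helper: print the name of every given command that is
# not found on PATH (prints nothing when all of them are available).
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example: missing_tools xelatex rsvg-convert convert pdfinfo jpegtran ddjvu unzip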

Generating bundles

You may wish to install the mw-ocg-bundler npm package to create bundles from Wikipedia articles. The text below assumes that you have done so; ignore the mw-ocg-bundler references if you have bundles from some other source.


To generate a PDF named out.pdf from the English Wikipedia article "United States":

mw-ocg-bundler -o bundle.zip --prefix en "United States"
bin/mw-ocg-latexer -o out.pdf bundle.zip
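
To render several articles, the same two steps can be repeated per title. The sketch below only prints the commands so they can be reviewed first. Both function names are hypothetical, and it assumes the bundle file is passed to mw-ocg-latexer as a trailing argument; check bin/mw-ocg-latexer --help if that differs in your version.

```shell
# Hypothetical helper: derive a bundle filename from an article title
# by replacing spaces and slashes with underscores.
bundle_name() {
  printf '%s.zip\n' "$(printf '%s' "$1" | tr ' /' '__')"
}

# Print (rather than run) the bundler and latexer commands for each
# title given on the command line.
print_render_commands() {
  for title in "$@"; do
    zip=$(bundle_name "$title")
    printf 'mw-ocg-bundler -o %s --prefix en "%s"\n' "$zip" "$title"
    printf 'bin/mw-ocg-latexer -o %s %s\n' "${zip%.zip}.pdf" "$zip"
  done
}
```

For example: print_render_commands "United States" "Canada"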

For debugging, it is often useful to keep the generated .tex source and run XeLaTeX on it by hand:

bin/mw-ocg-latexer -o out.tex bundle.zip
TEXINPUTS=tex/: xelatex out.tex

For other options, see:

bin/mw-ocg-latexer --help

