pagefren

0.2.3 • Public • Published


This Javascript tool helps you go from your HTML/CSS layout to a layout on pages that you have full control over.

Why? Because printing in the browser has really unpredictable behavior. But you might still want to use known tooling (HTML, CSS) to make printable material.

pagefren is created to work well with printing in Chromium, e.g. via Puppeteer.

What it does

pagefren takes layout in sections you specify, and lays it out in pages instead. Each page is simply a div with class page, making it easy to set the exact size you want your pages to be.

pagefren splits HTML elements that do not fit fully onto a page, so part goes on the current page, and the rest on the next page. You can easily spot elements where this has happened, as the two parts will be tagged with [data-continues-to-next-page] and [data-continued-from-previous-page], respectively -- so you can even style them nicely. If you want to disable this behavior, you can set [data-unbreakable] on the relevant elements, and they will never be broken, and only flow where there is enough space for them.

If you like the functionality of unbreakable elements but don't like the whitespace they leave in your documents, you can use [data-carry-over] instead, which will place the element as the very first on the next page, if it does not fit where it is placed inline -- but other elements (after the carry-over element) will continue to flow where the element would have been. Like telling a figure to fit on "top of next page".

Titles (<h1> through <h6>) are automatically sticky, meaning that they will always follow the element after them: if the first paragraph after a heading does not fit on a page, the heading flows to the next page instead. You can make other elements sticky by setting [data-sticky] on them.

You can define page numbers for your document, including relatively complex rules where e.g. your introductory pages have one type of numbering, the main text another, and finally the appendices numbered in a third way.

You can also add page references in your document by using [data-page-reference]. The value you set in the data attribute will be the selector pagefren looks for in order to determine which page number to place inside the element. For example <span data-page-reference="#hello-world"></span> will fill the span with the page number of the page on which the element with id hello-world appears. You probably want to use unique references, but anything goes: if several elements match, the page number of the first match is used.

Finally, there is an event, all_complete, that triggers when layouting is done. pagefren uses the layouting of the browser to calculate how much fits on a page, and that can take some time, so if you want to automate printing with Puppeteer, waiting for this event is the way to go.

Get involved

If you like this project and want to get involved, feel free to open issues, or email me with questions and suggestions.

This project is still really early stage, and I don't really know where it is going from now. You can be part of shaping that.

One place to start could be reading through this documentation, and letting me know which questions are unanswered after reading it. Or which information was hard to find, but seemed important to you. That would be very useful to me.

Other feedback is welcome, too, of course 😀

Example

Let's look at an example to get you started. Imagine you have a document that you lay out nicely in HTML and CSS, and it sort of works when you print it -- but you have some issues.

A common issue is wanting to control margins on pages, but wanting different margins on different pages. That's not possible without using something like pagefren: if you set different margins for different pages, your prints will be all messed up.

Another issue might be controlling how elements flow across pages, for example if you want certain elements to never break, and others to change style when they break into the next page.

In your layout, say, there is an <ol> list with <li>s in it. You also have a <figure> that contains an image and a caption. The first page you want to control and style completely.

First, to set up, you need some base styling. pagefren will add the class page to each page, so you can handily style this, and even use this class to style your first page:

/* Printer reset: tell the printer not to add margins */
@page {
    margin: 0;
}

.page {
    /* A4 size in portrait mode */
    height: 29.7cm;
    width: 21cm;

    /* 2 cm margins on standard pages; border-box keeps the padding inside the A4 size */
    box-sizing: border-box;
    padding: 2cm;
    overflow: hidden;
}

/* this class can be used together with .page to style splash pages */
.cover-page {
    /* override margins, set them to 0 */
    padding: 0;
    
    /* styling: an eye-catching title page */
    background: red;
    color: white;
    display: flex;
    flex-direction: column;
    align-items: center;
    justify-content: center;
}

With this styling in place, you can create your cover page:

<div class="page cover-page">
    <h1>My document</h1>
    <h2>&hellip; and other tales</h2>
    <span>by Internet Joe</span>
</div>

That's just plain HTML that will print nicely. There is no magic so far. But now we are ready to bring in pagefren, as we have all the requisite styling.

pagefren defaults to looking for elements that match the [data-content-section] CSS selector, and layouting all the contents of each such section in <div class="page">-elements at the same level as the content section element. Think of it as replacing the [data-content-section]-element with one or more layouted .page-elements.

So your next step would be placing all of the content you want to layout in a content section, next to your cover page:

<div class="page cover-page">
    <!-- ... -->
</div>
<div data-content-section>
    <!-- your content goes here -->
    <h1>Welcome to my document</h1>
    <p>Lorem ipsum ...</p>
    <!--
        data-carry-over means this figure will never break, but might be moved to the top of next page,
        while still allowing the list after to flow right after the `<p>`-tag above
    -->
    <figure data-carry-over>
        <img src="https://via.placeholder.com/500">
        <figcaption>Figure: A placeholder image</figcaption>
    </figure>
    <ol>
        <li>This list should break nicely across pages</li>
        <li>Even with partial list-items potentially breaking</li>
        <li>This will require some more styling to look good... but we'll help!</li>
        <li>You can look for e.g. `li[data-continued-from-previous-page]` as styling hooks</li>
    </ol>
</div>

Finally, you want to include pagefren on your page. For example by including the unpkg link (more methods below), and then running it:

<script src="https://unpkg.com/pagefren/pagefren.js"></script>
<script>
// Here, a global `pagefren` will exist.
// We wrap the call in a function so anything declared here stays out of the global scope.
(function() {
    // This runs pagefren with reasonable default options
    pagefren();
})();
</script>

Et voilà! Now you have a page that flows nicely into controlled pages, and should work great if you print it.

(Do note that the default page size for Chromium printing to PDF is Letter because Chromium is American af, so these A4-pages will not quite fit -- change the page size to A4, however, and they will.)

There's a lot more advanced stuff you can do. All of the configuration options will be covered in the Reference below.

There is a slightly messy example in example.html, which uses some options, and also uses non-print-styling to show the pages nicely in the browser, while still printing correctly.

Including pagefren in your project

The simplest way to include pagefren is, as listed above, to include the unpkg script (https://unpkg.com/pagefren/pagefren.js) directly in your page.

While the script isn't really useful as a Node.js module, it is still exposed as one. And when you access it as a Node.js-script (through require), it gets a field path that points to the script location on your machine, which lets you then read the entire contents of the file to do with as you please.

In other words, if you install pagefren in your project...

npm i pagefren

...you can read the file contents of the script:

const pagefren = require("pagefren");
const fs = require("fs");

const scriptAsString = fs.readFileSync(pagefren.path, "utf8");

This should let you adapt to whatever your use case is.
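
As one possible use, here is a rough sketch of injecting the script into a page with Puppeteer and printing a PDF once layouting is done. It assumes you have also installed puppeteer, and that the page you load does not call pagefren() itself. The function name renderPdf, the URL and the file name are made up for the example.

```javascript
// Sketch only: assumes `npm i pagefren puppeteer`, and that the page at `url`
// contains your content sections and CSS but does NOT call pagefren() itself.
const fs = require("fs");

// `renderPdf` is a hypothetical helper name, not part of pagefren.
async function renderPdf(url, outputPath) {
    // Required inside the function so loading this file has no side effects.
    const puppeteer = require("puppeteer");
    const pagefrenModule = require("pagefren");
    const scriptAsString = fs.readFileSync(pagefrenModule.path, "utf8");

    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        await page.goto(url, { waitUntil: "load" });

        // Inject the pagefren script, which defines the global `pagefren`.
        await page.addScriptTag({ content: scriptAsString });

        // Run pagefren inside the page and wait for the all_complete event.
        await page.evaluate(
            () => new Promise((resolve) => {
                pagefren().on("all_complete", () => resolve());
            })
        );

        // Print as A4 to match the .page size (Chromium defaults to Letter).
        await page.pdf({ path: outputPath, format: "A4", printBackground: true });
    } finally {
        await browser.close();
    }
}

// Example call (hypothetical URL and file name):
// renderPdf("http://localhost:8080/document.html", "document.pdf");
```

The @page { margin: 0; } reset from the example above still needs to be in the page's CSS for the margins to come out right.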

Reference

This is a full reference of the interface of pagefren. In order to make it easier to get an overview, it is split into sections:

  • CSS assumptions, the things pagefren assumes about the CSS setup of the document you are running it on.
  • Javascript options, the configuration options you can give to the pagefren function to modify behavior (including page numbering).
  • Layouting options, the special things you can do to the content you are layouting, which will change how they are layouted.
  • Output, the produced HTML and hints on how to style edge cases, e.g. when a complex DOM element is split across two pages.

CSS assumptions

pagefren has no default styling, it is just Javascript. But for the layouts it generates to both be generated properly and to print properly, you need to set up a bit of CSS.

It is left to you how you wish to do this, as there can be good reasons for varying the way it is done, but below are the main considerations you need to get started:

  • @page
    When printing pages, the special @page rule takes effect. Basically, it is a CSS at-rule you can use to style pages for printing. It is quite clunky, and, for example, you cannot set different margins on different pages. Your best option with pagefren is probably just to set @page { margin: 0; }, and handle margins on pages manually (see .page below). pagefren doesn't do this for you, so you need to include this in your CSS.

  • .page
    pagefren expects you to have styled pages to the size you expect to print. For example, an A4 page in portrait mode would have the style .page { height: 29.7cm; width: 21cm; }.
    Pages are just HTML elements with CSS styling, so you can control them however you want. For example, you can set a 2 cm margin on all your pages by setting box-sizing: border-box; padding: 2cm; on the .page class.
    It is assumed that all your .page elements will overflow in height if too many elements are placed in them. Concretely, while layouting, pagefren sets height: auto; on the page being layouted, and checks for when the height exceeds the normal height of a page. In short, pagefren will not work with pages that are expected to grow vertically, or are styled (with other attributes than height) to never overflow.

    You can use the .page class as you would normally if you want pages that you control completely. We recommend using additional classes if you want special pages with different padding or other behavior. For example, a cover page <div class="page cover-page"> can be implemented with .cover-page { padding: 0; }, overriding the default padding you have set.
    See more details on where exactly .page-elements are output in the Output section below.

  • .page-number
    If you use page numbering in pagefren, each page's page number will be placed as text inside a <div class="page-number">-element, which means that you probably want to style this. pagefren provides no default styling.
    A basic implementation that works well on pages with 2 cm margin is something like .page-number { position: absolute; bottom: 1cm; right: 2cm; }. This requires you to set position: relative; on .page, to allow the absolute positioning.

    We would recommend a similar approach for footers and headers on pages, possibly through :before and :after pseudo-elements. Remember: it's all HTML and CSS, use the tools you normally would.

Javascript options

pagefren is a simple Javascript function that takes a single, optional argument: pagefren([options]): PagefrenRunner.

That is, you can just call pagefren() and it will try to do something reasonable with your page. You can read about the default values and how to change them here.

Calling the function returns a PagefrenRunner which is just an event emitter: you can listen to the all_complete event in order to run code once the layouting has completed. (Layouting may take some seconds, as it uses the layouting calculations of the browser.)

For example,

const runner = pagefren();

// Print the page once layouting completes
runner.on("all_complete", () => window.print());
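
If you prefer promises over event listeners, a small wrapper along these lines should do the trick. This is only a sketch: whenLayoutComplete is a made-up helper name, and it relies on nothing more than the runner having an on method as described above.

```javascript
// `whenLayoutComplete` is a hypothetical helper, not part of pagefren.
// It turns the all_complete event of a runner into a promise.
function whenLayoutComplete(runner) {
    return new Promise((resolve) => {
        runner.on("all_complete", () => resolve());
    });
}

// Usage in the browser, where the global `pagefren` exists:
//   await whenLayoutComplete(pagefren());
//   window.print();
```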

An example options object might look like this:

pagefren({
    context: document.getElementById("context"),
    contentSectionSelector: ".content-box",
    pageNumbers: {
        indexOffset: 1,
        numerals: "roman_lower"
    },
});

This configuration would limit pagefren to running only inside the #context element, looking for .content-box-elements to layout into .pages, and, once complete, numbering each page with lowercase roman numerals starting from "i", and continuing "ii", "iii", "iv", "v", etc.

Page numbering in particular is quite powerful, so let's get the other two arguments out of the way first, then talk about pageNumbers.

  • context
    Default: document.
    This argument is a DOM Element (a Node in the document) inside which pagefren operates. This is useful if you have some content on a page that you do not want to layout: it ensures that pagefren stays inside this context.
    It defaults to the entire document, meaning that we will per default look for content sections by doing document.querySelectorAll("[data-content-section]"). But if you pass a context, it will instead call context.querySelectorAll, which exists on every Element.

  • contentSectionSelector
    Default: "[data-content-section]".
    A selector determining which elements should be treated as content sections, and turned into .pages.
    Inside the context, each element matching this selector is replaced with one or more .page-elements, inside which the content flows.
    For example, say that you have a document with...

    <body>
        <main>
            <div class="page cover-page">
                <h1>My Document</h1>
            </div>
            <div data-content-section>
                <!-- some content here that will take up several pages -->
            </div>
            <div class="page">
                <h1>An intermission...</h1>
            </div>
            <div data-content-section>
                <!-- some content here that might just fit on a page -->
            </div>
        </main>
    </body>

    ... running pagefren with contentSectionSelector: "[data-content-section]" will select each of the content sections, and turn them into one or more pages. Something like this:

    <body>
        <main>
            <div class="page cover-page">
                <h1>My Document</h1>
            </div>
            <div class="page" data-first-section-page>
                <!-- a bit of the content here, but not all... -->
            </div>
            <div class="page" data-last-section-page>
                <!-- ...because the content needed more space -->
            </div>
            <div class="page">
                <h1>An intermission...</h1>
            </div>
            <div
                class="page"
                data-first-section-page
                data-last-section-page
            >
                <!-- in this section, everything fit in -->
            </div>
        </main>
    </body>

    (Notice also the data-{first,last}-section-page data attributes -- more on those in the Output-section.)

    In short, setting contentSectionSelector lets you look for something other than [data-content-section]-elements to flow into pages. It can be any CSS selector, but it should probably be pretty unique, and cannot be nested inside of itself (per default a [data-content-section] inside a [data-content-section] will behave in unexpected ways).

  • pageNumbers
    Default: undefined.
    If provided, this will trigger a page numbering of all .pages in the context. That is, it will also number .pages not created by pagefren.
    The option can be provided either as a simple object, if a single numbering system is used for all pages, or an array of values if you need more complex numbering (e.g. lowercase roman numerals for the front matter including table of contents and preface, but arabic numerals for the main text).
    Let's start with an example to get familiar with how the configuration looks:

    pagefren({
        pageNumbers: {
            fromPage: { index: 2 },
            indexOffset: 4,
            numerals: "arabic",
            indexBasis: "document"
        }
    });

    This example sets up page numbering using arabic numerals (digits 0-9), with each page number being offset by 4, meaning that a page with index 0 would be labeled 4. The numbering skips the first two pages, starting from the third page (that's the fromPage-part). Finally, it uses numbering relative to the whole document, so the third page in the entire document (index 2) is the first one to get a number, and that number is 6.
    This is a bit of a funky example. If you just want normal page numbering with arabic numerals starting from the first page, you can use the reasonable defaults:

    pagefren({ pageNumbers: true });

    Ok, so it doesn't have to be rocket science, but for good measure let's dive into all the configuration values. After that, I'll show you an example of the roman-numerals-in-front-matter example I mentioned before.

    • fromPage
      Default: { index: 0 }.
      Determines when this page numbering starts. For example, if you want to start page numbering on the second page, you would set it to { index: 1 } instead.
      It also supports two other types of values, { has: selector } and { is: selector }, the former starting this numbering when it reaches a page that contains an element that matches the selector; the latter when the page itself matches the selector. For example, if you want to apply a numbering each time you reach a new content section that was flowed, you can use { is: "[data-first-section-page]" }.

    • indexOffset
      Default: 1.
      Determines which number should be displayed on a page, based on its index. The default value of 1 means that the first page (index 0) will be labeled 1. If you want zero-indexing, set it to 0. I'm not sure what legit use cases really exist for this.

    • numerals
      Default: "arabic".
      Decides which numerals to display the page number in. Valid values are "arabic", "roman", "roman_lower", and "hex".

    • indexBasis
      Default: "document".
      Determines the context in which the index used for numbering is found. The default value, "document", means global indexing, so the index of a .page is just its index in the list of all .page-elements in the context.
      The other supported value, "numberer", means that the index will be relative to the numberer, i.e. "since the fromPage condition triggered". For example, if you want each flowed content section to have its own numbering starting from index 0, it would make sense to use indexBasis: "numberer" along with fromPage: { is: "[data-first-section-page]" }.

    Putting this together, let's assume a document with a table of contents (that may flow onto other pages if it is long) identified by its ID toc, and the main content starting at a heading with ID chapter-1. In this case, we can write a complex pageNumbers configuration, using the array syntax:

    pagefren({
        pageNumbers: [
            {
                fromPage: { has: "#toc" },
                numerals: "roman_lower",
                indexBasis: "numberer",
            },
            {
                fromPage: { has: "#chapter-1" },
                indexBasis: "numberer",
            },
        ],
    });

    It breaks down like this: the front matter uses roman_lower numerals, whereas the main content relies on the default, arabic numerals. Both are numberer-based numbering, so they start numbers from i or 1, respectively (as indexOffset defaults to 1).
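
To make the numbering rules a bit more concrete, here is a toy model of them in plain Javascript. It is not pagefren's actual code: toRoman, formatPageNumber and labelPages are made-up helpers, pages are modelled only by the IDs of the elements they contain, and it assumes that pages before the first trigger get no number, that a numberer stays in effect until the next one triggers, and that hex output is lowercase.

```javascript
// Illustrative model only: this is NOT pagefren's source code.
// `toRoman`, `formatPageNumber` and `labelPages` are hypothetical helpers.

function toRoman(n) {
    const table = [
        [1000, "M"], [900, "CM"], [500, "D"], [400, "CD"],
        [100, "C"], [90, "XC"], [50, "L"], [40, "XL"],
        [10, "X"], [9, "IX"], [5, "V"], [4, "IV"], [1, "I"],
    ];
    let out = "";
    for (const [value, symbol] of table) {
        while (n >= value) {
            out += symbol;
            n -= value;
        }
    }
    return out;
}

// The displayed number is the page index plus indexOffset.
function formatPageNumber(index, { indexOffset = 1, numerals = "arabic" } = {}) {
    const n = index + indexOffset;
    switch (numerals) {
        case "arabic": return String(n);
        case "roman": return toRoman(n);
        case "roman_lower": return toRoman(n).toLowerCase();
        case "hex": return n.toString(16); // letter case is an assumption
        default: throw new Error(`Unknown numerals: ${numerals}`);
    }
}

// Each page is an array of element IDs, so { has: "#toc" } becomes an ID lookup.
function labelPages(pages, numberers) {
    let active = null;  // numberer currently in effect
    let startIndex = 0; // page index at which it took effect
    return pages.map((ids, pageIndex) => {
        for (const numberer of numberers) {
            if (numberer !== active && ids.includes(numberer.fromId)) {
                active = numberer;
                startIndex = pageIndex;
            }
        }
        if (active === null) return null; // assumption: no number before the first trigger
        const index = active.indexBasis === "numberer" ? pageIndex - startIndex : pageIndex;
        return formatPageNumber(index, active);
    });
}

const labels = labelPages(
    [["cover"], ["toc"], [], ["chapter-1"], []],
    [
        { fromId: "toc", numerals: "roman_lower", indexBasis: "numberer" },
        { fromId: "chapter-1", indexBasis: "numberer" },
    ]
);
console.log(labels); // [ null, 'i', 'ii', '1', '2' ]
```

The cover page stays unnumbered, the table of contents gets i and ii, and the main text restarts at 1, which is the behavior the array configuration above is after.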

Layouting options

When you are writing your HTML and CSS to be layouted by pagefren, there are a few neat tools you might want to use. pagefren tries to be reasonable by default, but it might not always fit what you want to do.

For example, pagefren per default recurses through DOM elements that do not fit into a page, figuring out exactly which of their subelements (and subelements' subelements) can be included on the page, without making the content too big.

That works great if you have a paragraph of text, as you would want it to be split naturally between pages; or if you have a more complex structure, like a paragraph inside a <div> that makes everything into columns (e.g. with the CSS property columns: 3;).

It doesn't work so great if pagefren suddenly starts splitting your <figure>s apart from their <figcaption>s. Here, you could set the attribute data-unbreakable on the figures, to ensure that pagefren doesn't split them.

Here follows a breakdown of all the layouting options that exist. After that comes a list of elements with special default behavior (that don't behave like any other element would).

  • data-unbreakable
    Signifies that pagefren should simply not try and break this element apart. If the element does not fit into a page, content on that page will stop, and the unbreakable element will be the first element on the next page.

  • data-carry-over
    Unbreakable elements (above) have the disadvantage of stopping content where they are placed. Sometimes, you want the content following the unbreakable element to continue where the unbreakable element would have been.
    If, for example, you have a structure like this...

    <p>Some text</p>
    <figure data-unbreakable>
      <img src="https://via.placeholder.com/150">
      <figcaption>A caption</figcaption>
    </figure>
    <p>Some more text</p>

    ...you would probably want "Some text" to be followed by "Some more text" in the case where the figure does not fit on the page.
    In this case you can use data-carry-over instead of data-unbreakable. This will keep the element unbreakable, but when it doesn't fit on a page, the content immediately after it will try to flow on that page. The carried-over element is always placed as the first element on the next page.

  • data-sticky
    Some elements have a relationship with each other, despite not being nested. In particular, it is common to make sure that a heading is always on the same page as the first bit of text following it.
    data-sticky gives any element this behavior. If there is only room for the sticky element on the page (and not any of the content following it), the sticky element, too, will be moved to the next page.

    <!--
        This ensures that the overline will always be on the same
        page as at least some of the paragraph text
    -->
    <div class="overline" data-sticky>A sticky heading</div>
    <p>
      The text to which the sticky heading should stick. The
      paragraph might still be split, e.g. leaving only the
      first line on the current page. As long as <i>some</i> of
      it fits on the same page, the sticky heading will stay in
      place.
    </p>

  • data-page-reference
    The page reference is a slightly special data-attribute. It doesn't change the layouting of the element it is set to -- instead, it changes the content.
    The value of the attribute is a CSS selector, and the content of the element will be changed to the page number of the page that is or contains the element described by the selector.
    For example...

    See the appendix on page <span data-page-reference="#appendix">00</span>.

    ...might result in the text reading "See the appendix on page 34", given that the element with ID appendix is on the page numbered 34.
    Of course, this data attribute only works if page numbering is enabled.
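
To illustrate the idea, the lookup described for data-page-reference could be sketched roughly like this. This is not pagefren's actual code: resolvePageReferences is a made-up name, and the sketch assumes the page number can be read from the .page-number element inside the page.

```javascript
// Illustrative sketch only, not pagefren's implementation.
// `resolvePageReferences` is a hypothetical helper; `root` would normally
// be `document` or the context element.
function resolvePageReferences(root) {
    for (const ref of root.querySelectorAll("[data-page-reference]")) {
        const selector = ref.getAttribute("data-page-reference");

        // querySelector returns the first match, mirroring "first element match".
        const target = root.querySelector(selector);

        // closest() also matches the element itself, which covers the case
        // where the selector describes the page rather than something inside it.
        const page = target && target.closest(".page");

        // Assumption: the page's number is the text of its .page-number element.
        const number = page && page.querySelector(".page-number");
        if (number) {
            ref.textContent = number.textContent;
        }
    }
}

// Usage in the browser, after layouting and numbering have completed:
//   resolvePageReferences(document);
```

If no page number is found, the placeholder text (00 in the example above) is left alone.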

These are the elements that, by default, don't break and flow like all other elements:

  • Headings are sticky and unbreakable
    The HTML heading elements <h1> through <h6> always behave as if data-sticky was set on them: they will always stick with the element following them. They also are always data-unbreakable, but that is kind of implied by data-sticky.
  • Figures never break
    You don't need to set data-unbreakable on a <figure>: it behaves like that already. You can still modify the behavior by setting data-sticky or data-carry-over on them, though.

Output

In this section, we'll take a closer look at what you can expect the output of pagefren to look like.

  • Pages
    As mentioned above, pagefren works in some context that it never looks outside (but often this will just be the entire document). Within the context it looks for content sections, described by some selector (default [data-content-section]), and replaces these content sections with one or more layouted .pages (specifically, <div class="page">-elements). That is, for each content section, you can expect at least one, but potentially several, pages.
    As helpers for styling or referencing in your document, each content section's first and last page are marked with the data attributes [data-first-section-page] and [data-last-section-page], respectively. If there is exactly one page produced by a content section, it will have both of these attributes.

  • Elements split between pages
    Whenever an element doesn't fully fit onto a page (that is, adding the element in full to the page would make the content of the page overflow), pagefren will figure out what to do.
    If the element is a special element or has layout options applied to it, as detailed in the Layouting options section above, those rules will be followed.
    Otherwise, pagefren recursively traverses through the element, attempting to split at exactly the right element so as much as possible fits onto the page, including splitting sentences at the correct space. For example, a content section that does not fit into a page...

    <div data-content-section>
      <div class="columns">
        <!-- imagine some content before the list -->
        <ul>
          <li>A list is an element</li>
          <li>With one or more list items</li>
          <li>
            Each of which is displayed below the preceding
            elements
          </li>
          <li>
            Making for a <strong>great example</strong> of
            how pagefren splits content
          </li>
          <li>Or at least so it could be argued</li>
        </ul>
        <!-- imagine some more content after the list -->
      </div>
    </div>

    ... might be split by pagefren into the following pages (indentation modified for readability):

    <div class="page" data-first-section-page>
      <div class="columns" data-continues-to-next-page>
        <!-- imagine some content before the list -->
        <ul data-continues-to-next-page>
          <li>A list is an element</li>
          <li>With one or more list items</li>
          <li>
            Each of which is displayed below the preceding
            elements
          </li>
          <li data-continues-to-next-page>
            Making for a <strong data-continues-to-next-page>great</strong>
          </li>
        </ul>
      </div>
    </div>
    <div class="page" data-last-section-page>
      <div class="columns" data-continued-from-previous-page>
        <ul data-continued-from-previous-page>
          <li data-continued-from-previous-page>
            <strong data-continued-from-previous-page>
                example
            </strong>
            of how pagefren splits content
          </li>
          <li>Or at least so it could be argued</li>
        </ul>
        <!-- imagine some more content after the list -->
      </div>
    </div>

    This is really the core strength of pagefren: allowing content to flow across pages. Each element that was not fully included in the first page is marked with data-continues-to-next-page, and the continuation of those elements on the next page is marked with data-continued-from-previous-page.
    In the example of the unordered list, the first entry on the second page will, without any further styling, have a bullet, despite not being a new point in the list. The data attributes let us fix this with some simple CSS:

    li[data-continued-from-previous-page] {
      list-style: none;
    }

    That is the power of these data attributes: allowing full control of styling of elements layouted across several pages.
    An element can flow across more than two pages. In this case, the parts of the element on all the middle pages will have both data attributes.

TODO

  • diagram numbering
  • anti-dangly-line (if a p or li tag split would result in there being one line left on either page, do not split), maybe options-configurable which elements should anti-dangle?
  • pageNumbers: support a function given to numerals which returns the display-number
  • content section-pages should inherit reasonable things from the content sections. like the classes, probably? and data attributes? so each page gets those from the content section. or maybe all attributes except ID?
    • the same probably applies to when we create a new element with [data-continued-from-previous-page]: it should copy attributes
  • break words on soft hyphens (&shy;) and hyphens in #text nodes
