    puppeteer-prerender

    0.14.0 • Public • Published

    puppeteer-prerender is a library that uses Puppeteer to fetch the pre-rendered HTML, meta tags, links, and Open Graph data of a webpage, which is especially useful for Single-Page Applications (SPAs).

    Usage

    const Prerenderer = require('puppeteer-prerender')
     
    async function main() {
      const prerender = new Prerenderer()
     
      try {
        const {
          status,
          redirect,
          meta,
          openGraph,
          links,
          html,
          staticHTML 
        } = await prerender.render('https://www.example.com/')
      } catch (e) {
        console.error(e)
      } finally {
        // Always close the underlying browser, even if rendering fails.
        await prerender.close()
      }
    }
     
    main()

    APIs

    new Prerenderer(options)

    Creates a prerenderer instance.

    Default options:

    {
      // Boolean | Function. Whether to print debug logs.
      // You can provide a custom log function; it should accept the same arguments as console.log().
      debug: false,
     
      // Object. Options for puppeteer.launch().
      // see https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#puppeteerlaunchoptions
      puppeteerLaunchOptions: undefined,
     
      // Number. Maximum navigation time in milliseconds.
      timeout: 30000,
     
      // String. Specific user agent to use in this page. The default value is set by the underlying Chromium.
      userAgent: undefined,
     
      // Boolean. Whether to follow 301/302 redirects.
      followRedirect: false,
     
      // Object. Extra meta tags to parse.
      extraMeta: undefined,
      
      // Object. Options for parse-open-graph.
      // see https://github.com/kasha-io/parse-open-graph#parsemeta-options
      parseOpenGraphOptions: undefined,
      
      // Array. Rules that rewrite request URLs to other locations (an empty target blocks the request).
      rewrites: undefined
    }
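The debug option accepts a function in place of a boolean. Below is a minimal sketch of such a function, assuming only that it is called with console.log()-style arguments as described above. createPrefixedLogger and the '[prerender]' prefix are hypothetical names for illustration, not part of the library.

```javascript
// Hypothetical helper: builds a log function that accepts the same
// arguments as console.log() and prepends a fixed prefix.
function createPrefixedLogger(prefix, sink = console.log) {
  return (...args) => sink(prefix, ...args)
}

const log = createPrefixedLogger('[prerender]')
log('navigating to', 'https://www.example.com/')

// Assumed usage, based on the options listed above:
// const prerender = new Prerenderer({ debug: log })
```

The optional sink parameter makes it easy to route the same messages to a file or a logging library instead of the console.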

    extraMeta

    Extra meta tags to parse. e.g.:

    {
      status: { selector: 'meta[http-equiv="Status" i]', property: 'content' },
      icon: { selector: 'link[rel~="icon"]', property: 'href' }
    }

    Each key is the name of the property that will be set on the result.meta object. selector is the argument passed to document.querySelector() to select the element. property is the property of the selected element that contains the value.
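The mapping from an extraMeta entry to a value in result.meta can be sketched as follows. This illustrates the described behaviour and is not the library's actual implementation. collectExtraMeta and fakeDocument are hypothetical; the fake document stands in for a real DOM so that the example runs in plain Node.js.

```javascript
// Sketch of applying an extraMeta map to a document-like object.
// `doc` only needs a querySelector() method, so a browser `document` works too.
function collectExtraMeta(doc, extraMeta) {
  const result = {}
  for (const [name, { selector, property }] of Object.entries(extraMeta)) {
    const element = doc.querySelector(selector)
    // Skip entries whose selector matches nothing (an assumption of this sketch).
    if (element) result[name] = element[property]
  }
  return result
}

// Stand-in for a real DOM, for demonstration only.
const fakeDocument = {
  elements: {
    'meta[http-equiv="Status" i]': { content: '404' },
    'link[rel~="icon"]': { href: 'https://www.example.com/favicon.ico' }
  },
  querySelector(selector) {
    return this.elements[selector] || null
  }
}

const extraMeta = {
  status: { selector: 'meta[http-equiv="Status" i]', property: 'content' },
  icon: { selector: 'link[rel~="icon"]', property: 'href' }
}

console.log(collectExtraMeta(fakeDocument, extraMeta))
```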

    rewrites

    const result = await prerender.render('https://www.google.com/foo', {
      rewrites: [
        ['https://www.google.com/:path(.*)', 'https://www.example.com/:path'],
        ['https://www.googletagmanager.com/(.*)', ''] // block
      ]
    })

    The page will be loaded from https://www.example.com/foo instead of https://www.google.com/foo, and requests to https://www.googletagmanager.com/* will be blocked.

    Rewriting is handled by the url-rewrite module.
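To make the rule semantics concrete (first matching rule wins, an empty target blocks the request), here is a simplified stand-in. It uses plain regular expressions rather than the url-rewrite pattern syntax shown above, and rewriteURL is a hypothetical function, not an API of either package.

```javascript
// Simplified stand-in for rule-based URL rewriting. Rules are
// [RegExp, replacement] pairs; this is NOT the url-rewrite pattern syntax.
function rewriteURL(url, rules) {
  for (const [pattern, replacement] of rules) {
    if (pattern.test(url)) {
      // An empty replacement means "block this request".
      return replacement === '' ? null : url.replace(pattern, replacement)
    }
  }
  return url // no rule matched: load the URL unchanged
}

const rules = [
  [/^https:\/\/www\.google\.com\/(.*)$/, 'https://www.example.com/$1'],
  [/^https:\/\/www\.googletagmanager\.com\/.*$/, ''] // block
]

console.log(rewriteURL('https://www.google.com/foo', rules)) // https://www.example.com/foo
console.log(rewriteURL('https://www.googletagmanager.com/gtm.js', rules)) // null
console.log(rewriteURL('https://www.example.org/', rules)) // https://www.example.org/
```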

    prerenderer.render(url, options)

    Prerenders the page of the given url.

    Returns: Promise.

    These options can be overridden per call:

    {
      timeout,
      userAgent,
      followRedirect,
      extraMeta,
      parseOpenGraphOptions,
      rewrites
    }
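One way to picture per-call overrides is as a merge of the instance defaults with the options passed to render(). The sketch below is an assumption about that behaviour, not the library's code. resolveRenderOptions is hypothetical, and treating an undefined value as "use the default" is a choice made for this example.

```javascript
// Option names that render() accepts as per-call overrides (from the list above).
const OVERRIDABLE = [
  'timeout', 'userAgent', 'followRedirect',
  'extraMeta', 'parseOpenGraphOptions', 'rewrites'
]

// Hypothetical merge: start from the instance defaults, then apply any
// overridable option that the caller explicitly provided.
function resolveRenderOptions(defaults, overrides = {}) {
  const resolved = { ...defaults }
  for (const key of OVERRIDABLE) {
    if (overrides[key] !== undefined) resolved[key] = overrides[key]
  }
  return resolved
}

const defaults = { timeout: 30000, followRedirect: false }

// `debug` is not in the overridable list, so this sketch ignores it.
console.log(resolveRenderOptions(defaults, { timeout: 5000, debug: true }))
// { timeout: 5000, followRedirect: false }
```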

    Return format:

    {
      status, // HTTP status code
      redirect, // the redirect location if status is 301/302
     
      meta: {
        title,
        description, // <meta property="og:description"> || <meta name="description">
        image, // <meta property="og:image"> || the first <img> whose width and height are both >= 300
        canonicalURL, // <link rel="canonical"> || <meta property="og:url">
     
        // <link rel="alternate" hreflang="de" href="https://m.example.com/?locale=de">
        locales: [
          { lang: 'de', href: 'https://m.example.com/?locale=de' },
          // ...
        ],
     
        // <link rel="alternate" media="only screen and (max-width: 640px)" href="https://m.example.com/">
        media: [
          { media: 'only screen and (max-width: 640px)', href: 'https://m.example.com/' },
          // ...
        ],
     
        author, // <meta name="author">
     
        // <meta property="article:tag"> || <meta name="keywords"> (split by comma)
        keywords: [
          'keyword1',
          // ...
        ]
     
        /*
          extraMeta will also be set in here
        */
      },
     
      openGraph, // Open Graph object
     
      // The absolute URLs of <a> tags.
      // Useful for crawling the next pages.
      links: [
        'https://www.example.com/foo?bar=1',
        // ...
      ],
     
      html, // page HTML
      staticHTML // static HTML (scripts removed)
    }
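Since links holds absolute URLs intended for crawling, a small helper can choose which pages to visit next. This is a sketch under stated assumptions: nextLinks is a hypothetical function, and sampleResult is hand-written in the documented shape rather than real render() output.

```javascript
// Picks the links worth crawling next from a render() result:
// same origin as the rendered page, fragment removed, not yet visited.
function nextLinks(result, pageURL, visited = new Set()) {
  const origin = new URL(pageURL).origin
  const next = new Set()
  for (const link of result.links) {
    const url = new URL(link)
    url.hash = '' // URLs that differ only by fragment point to the same page
    if (url.origin === origin && !visited.has(url.href)) next.add(url.href)
  }
  return [...next]
}

// Hand-written sample in the documented shape, not real output.
const sampleResult = {
  status: 200,
  links: [
    'https://www.example.com/foo?bar=1',
    'https://www.example.com/foo?bar=1#top',
    'https://other.example.org/'
  ]
}

console.log(nextLinks(sampleResult, 'https://www.example.com/'))
// [ 'https://www.example.com/foo?bar=1' ]
```

Passing the set of already visited URLs as the third argument keeps a crawl loop from rendering the same page twice.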

    The openGraph object format:

    {
      og: {
        title: 'Open Graph protocol',
        type: 'website',
        url: 'http://ogp.me/',
        image: [
          {
            url: 'http://ogp.me/logo.png',
            type: 'image/png',
            width: '300',
            height: '300',
            alt: 'The Open Graph logo'
          },
        ],
        description: 'The Open Graph protocol enables any web page to become a rich object in a social graph.'
      },
      fb: {
        app_id: '115190258555800'
      }
    }

    See parse-open-graph for details.
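Reading a value out of the openGraph object might look like the sketch below. It assumes that og.image is an array, as in the sample above. firstImageURL is a hypothetical helper, and the sample object is a trimmed copy of the one shown in this section.

```javascript
// Returns the URL of the first og:image entry, or undefined if there is none.
function firstImageURL(openGraph) {
  const images = (openGraph && openGraph.og && openGraph.og.image) || []
  return images.length > 0 ? images[0].url : undefined
}

// Trimmed copy of the sample object from this section.
const sampleOpenGraph = {
  og: {
    title: 'Open Graph protocol',
    image: [
      { url: 'http://ogp.me/logo.png', type: 'image/png', width: '300', height: '300' }
    ]
  },
  fb: { app_id: '115190258555800' }
}

console.log(firstImageURL(sampleOpenGraph)) // http://ogp.me/logo.png
console.log(firstImageURL({})) // undefined
```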

    prerenderer.close()

    Closes the underlying browser.

    prerenderer.debug

    Enables or disables debug mode.

    prerenderer.timeout

    Sets the default timeout value.

    prerenderer.userAgent

    Sets the default user agent.

    prerenderer.followRedirect

    Sets the default value of followRedirect.

    prerenderer.extraMeta

    Sets the default value of extraMeta.

    prerenderer.parseOpenGraphOptions

    Sets the default value of parseOpenGraphOptions.

    License

    MIT

    Install

    npm i puppeteer-prerender

    Collaborators

    • jiangfengming