puppetree
1.0.0 • Public • Published

Puppetree

Puppetree is a wrapper around Puppeteer with JSDOM built in. It lets you scrape and crawl web pages from Node.js using the familiar client-side DOM API.

  • API usage is the same as with Puppeteer; in addition, Puppetree adds five query methods that work as they do on the DOM.

  • The added methods are querySelector, querySelectorAll, getElementById, getElementsByClassName, and getElementsByTagName.

  • Each returns a HybridElement (or, for the plural methods, a collection of them) that combines Puppeteer's ElementHandle with the DOM's HTMLElement.
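To make the idea of a "hybrid" element concrete, here is a minimal sketch of one way two objects could be merged behind a single reference. It is not Puppetree's actual implementation: `makeHybrid`, `fakeHandle`, and `fakeAnchor` are hypothetical names, and the stand-in objects only imitate an ElementHandle and an HTMLAnchorElement.

```javascript
// Illustrative only: this is NOT puppetree's source. It shows one way to
// merge two objects so that property lookups fall through from the first
// object to the second.
function makeHybrid(handle, element) {
  return new Proxy(handle, {
    get(target, prop) {
      // Prefer the handle (click, hover, ...), then fall back to the DOM node.
      const source = prop in target ? target : element;
      const value = source[prop];
      // Bind methods so `this` still points at the object that owns them.
      return typeof value === 'function' ? value.bind(source) : value;
    },
    has(target, prop) {
      return prop in target || prop in element;
    },
  });
}

// Hypothetical stand-ins for an ElementHandle and an HTMLAnchorElement.
const fakeHandle = { async click() { return 'clicked'; } };
const fakeAnchor = { href: 'https://example.com/', tagName: 'A' };

const $link = makeHybrid(fakeHandle, fakeAnchor);
console.log($link.href);         // property read from the DOM-like object
console.log(typeof $link.click); // method supplied by the handle-like object
```

With a design like this, callers do not need to know which side supplies a given property, which matches how the examples in this README read `href` and call `click()` on the same object.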

Getting Started

const puppetree = require('puppetree');

// Run these calls inside an async function; `url` is the address of the page to scrape.
const browser = await puppetree.launch();
const hybridPage = await browser.newPage();
await hybridPage.goto(url);
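Because `await` needs an async context and a launched browser should be closed when the work is done, the steps above can be wrapped in a small helper. This is a sketch under assumptions: `withPage` is a hypothetical name, and `browser.close()` is assumed to be available because the README states that API usage matches Puppeteer.

```javascript
// `launcher` is any object with a puppeteer-style launch() method,
// for example the module returned by require('puppetree').
async function withPage(launcher, url, task) {
  const browser = await launcher.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Hand the loaded page to the caller and pass its result through.
    return await task(page);
  } finally {
    // Always close the browser, even if navigation or the task throws.
    // Assumption: close() is passed through from puppeteer.
    await browser.close();
  }
}

// Usage (requires puppetree to be installed):
// withPage(require('puppetree'), 'https://example.com', async (page) => {
//   const $link = await page.querySelector('a');
//   console.log($link.href);
// });
```

Passing the launcher in as an argument keeps the helper easy to exercise with a stub object when no browser is available.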

<HybridPage>.querySelector

const $hyperlink = await hybridPage.querySelector('a.mylink');
console.log($hyperlink.href); // Logs HTMLAnchorElement href

<HybridPage>.querySelectorAll

const $inputs = await hybridPage.querySelectorAll('div.container input');
for (const $input of $inputs) {
    console.log($input.value); // Logs HTMLInputElement value
}

<HybridPage>.getElementById

const $button = await hybridPage.getElementById('search');
await $button.click(); // Uses ElementHandle click api

<HybridPage>.getElementsByClassName

const $people = await hybridPage.getElementsByClassName('person');
for (const $person of $people) {
    await $person.hover(); // Uses ElementHandle hover api
}

<HybridPage>.getElementsByTagName

const $rows = await hybridPage.getElementsByTagName('tr');
for (const $row of $rows) {
    const $p = await $row.querySelector('td p');
    console.log($p.textContent); // Reads the HTMLParagraphElement's text
}
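The row-by-row pattern above extends naturally to reading a whole table into nested arrays. The sketch below is illustrative: `readTable` is a hypothetical helper, and it assumes that a row element also exposes `querySelectorAll` (the README only shows `querySelector` on an element) and that results can be iterated with `for...of`.

```javascript
// Collect the trimmed text of every <td>, grouped by <tr>.
async function readTable(hybridPage) {
  const table = [];
  const $rows = await hybridPage.getElementsByTagName('tr');
  for (const $row of $rows) {
    // Assumption: elements support querySelectorAll like the page does.
    const $cells = await $row.querySelectorAll('td');
    const cells = [];
    for (const $cell of $cells) {
      cells.push(($cell.textContent || '').trim());
    }
    table.push(cells);
  }
  return table;
}
```

Rows that contain only `th` cells come back as empty arrays, so a caller may want to filter those out.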

Install

npm i puppetree

Version

1.0.0

License

MIT

Unpacked Size

202 kB

Total Files

24

Collaborators

  • swimauger