quget - web snippets from the command-line
Introduction
quget brings together the power of request, cheerio, and jQuery-like CSS selectors to the command-line.
$ quget http://news.ycombinator.com ".title > a" --limit 3
Best things and stuff of 2015
When coding style survives compilation: De-anonymizing programmers from binaries
Postgres features and tips

$ quget http://www.google.com/search?q=the+price+of+gold "td._dmh < tr|yellow"
Gold Price Per Ounce$1,075.20$3.90
Gold Price Per Gram$34.57$0.13
Gold Price Per Kilo$34,568.46$125.39

$ quget https://github.com/trending?since=monthly ".repo-list-name|pack" --limit 5
apple / swift
FreeCodeCamp / FreeCodeCamp
MaximAbramchuck / awesome-interviews
oneuijs / You-Dont-Need-jQuery
phanan / koel
Installation
npm install -g quget
Usage
Usage: quget [command] [options] <url> [selector] | -

Example: quget http://news.ycombinator.com ".title > a|bold|red" --limit 5

Options:
  -o, --outfile <file>       file to output to
  -q, --quite                quite the logging
  -T, --template <template>  template "node: {{name}}, text {{.|text}}"
  -l, --limit <count>        limit query to count matches
  -r, --rand
Selectors
quget supports all CSS3 selectors, some CSS4 selectors, and custom jQuery selectors like :contains(). For the complete list see css-select, or run quget help selector.
If no selector is given, the complete HTML of the page is returned.
Attributes
In general quget returns the text() of the matched nodes. To select an attribute, add the x-ray-like @ to the selector (before the pipes!).
- selector@<attr-name> - get an attribute by name, e.g., selector@href
- selector@text - get text content of matched nodes recursively (default)
- selector@html - get the innerHTML
Multiple attributes are supported: selector@id@class.
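The ordering rule above (selector, then @attributes, then pipes) can be illustrated with a small sketch. This is not quget's actual parser: `parseQuery` is a hypothetical helper, and it assumes that `@` and `|` never occur inside the CSS selector itself, which real selectors such as `[lang|=en]` would violate.

```javascript
// Hypothetical helper for illustration only; not part of quget.
// Splits "selector@attr1@attr2|pipe1|pipe2 arg" into its three parts.
function parseQuery(query) {
  // Pipes come last, so split them off first.
  const [head, ...pipes] = query.split('|').map((part) => part.trim());
  // Anything after an '@' in the remaining head is an attribute name.
  const [selector, ...attrs] = head.split('@').map((part) => part.trim());
  return {
    selector,
    // With no explicit attribute, text content is the documented default.
    attrs: attrs.length > 0 ? attrs : ['text'],
    pipes,
  };
}

console.log(parseQuery('.title > a@href|pack'));
console.log(parseQuery('div@id@class'));
```

Splitting on the pipe character first mirrors the requirement that attributes appear before the pipes.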
Filters / Pipes
quget supports Markup.js-type pipes separated by |, for example, selector|upcase, selector|pack|tease 7. For the complete list see Markup.js' built-in pipes.
Need some emphasis or color? All chalk.styles are available as pipes as well: e.g. selector|red, selector|bold|bgBlue.
Additional pipes are defined in src/pipes/basic.js:
- |after text - add text after each match
- |before text - add text before each match
- |quote text - add text before and after each match
- |tag name - enclose match in <name> and </name>
- |incr N - increment the match value by N (default 1)
- |decr N - decrement the match value by N (default 1)
- |regex (foo.*) - match by regex
- |colorize - apply random chalk style to every line
Use \n to add a new line, e.g. selector|after \n\n. For the complete list, run quget help pipes.
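To make the chaining behavior concrete, here is a minimal sketch of how pipes like these could be modeled as plain functions applied left to right. These are hypothetical re-implementations written for illustration; the real ones live in src/pipes/basic.js and may differ. The sketch assumes pipe arguments are separated by whitespace and that a literal \n in an argument stands for a newline.

```javascript
// Hypothetical re-implementations for illustration; not quget's source.
// Turns a literal backslash-n in an argument into a real newline.
const unescapeNewlines = (text) => text.replace(/\\n/g, '\n');

const pipes = {
  after: (value, text = '') => String(value) + unescapeNewlines(text),
  before: (value, text = '') => unescapeNewlines(text) + String(value),
  quote: (value, text = '') =>
    unescapeNewlines(text) + String(value) + unescapeNewlines(text),
  tag: (value, name) => `<${name}>${value}</${name}>`,
  incr: (value, n = 1) => Number(value) + Number(n),
  decr: (value, n = 1) => Number(value) - Number(n),
};

// Applies a chain such as ['incr', 'tag b'] left to right.
function applyPipes(value, chain) {
  return chain.reduce((current, step) => {
    // Assumption: the pipe name and its arguments are whitespace-separated.
    const [name, ...args] = step.trim().split(/\s+/);
    const pipe = pipes[name];
    if (!pipe) throw new Error(`Unknown pipe: ${name}`);
    return pipe(current, ...args);
  }, value);
}

console.log(applyPipes('41', ['incr', 'tag b']));
console.log(applyPipes('foo', ['quote *']));
```

Each pipe receives the output of the previous one, which is why the order of pipes in a selector matters.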
Shell pipe
quget can be forced to read from STDIN, either interactively or in a shell pipe, by providing the single dash option -. In this mode, each line of input is read as a url and executed in order. Each line may also contain its own selector. If none is given, the selector from the CLI is used.
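The fallback rule for STDIN input can be sketched as follows. This is an illustration, not quget's implementation: `parseInputLine` and `planJobs` are hypothetical names, and the sketch assumes that a line's URL and optional selector are separated by whitespace and that blank lines are skipped.

```javascript
// Hypothetical helpers for illustration only; not part of quget.
// Parses one input line into a URL plus the selector to use for it.
function parseInputLine(line, cliSelector) {
  const trimmed = line.trim();
  if (trimmed === '') return null; // assumption: blank lines are skipped
  // First token is the URL; the rest of the line, if any, is the selector.
  const match = trimmed.match(/^(\S+)(?:\s+(.+))?$/);
  const url = match[1];
  const selector = match[2] ? match[2] : cliSelector;
  return { url, selector };
}

// Turns a whole block of input into an ordered list of jobs.
function planJobs(input, cliSelector) {
  return input
    .split(/\r?\n/)
    .map((line) => parseInputLine(line, cliSelector))
    .filter((job) => job !== null);
}

const input = 'http://a.example/1\nhttp://a.example/2 h1|upcase\n';
console.log(planJobs(input, 'title|pack'));
```

The first job falls back to the CLI selector, while the second uses the selector given on its own line.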
$ quget http://news.ycombinator.com ".title > a@href" -l 3 | quget - "title|pack"
Page not found | Docker Blog
Permission to Fail - Michelle Wetzler of Keen IO
The Amazon Whisperer
Examples
For other examples, run quget samples. (Note: samples are run using Node's child_process.exec(), which gobbles colors in output streams. To see the colors, run the command directly from the shell.)
Run samples interactively
$ quget samples
Choose a sample to run:
1. Hacker News titles
2. Hacker News titles and subtext
3. Wikipedia's On This Day
4. GitHub trending
5. Markup filters
6. Cheerio selectors
7. Beijing Air Twitter feed
8. Custom template 1
9. Custom template 2
10. Jeopardy!
>
Run a sample
$ quget samples 1
Running:
quget http://news.ycombinator.com ".title > a" -l 7 -n
Hacker News titles
1. Dear Architects: Sound Matters
2. Best things and stuff of 2015
3. Spotify Hit with $150M Class Action Over Unpaid Royalties
4. Dolphin Smalltalk Goes Open-Source
5. Starters and Maintainers
6. Postgres features and tips
7. When coding style survives compilation: De-anonymizing programmers from binaries
Simple
$ quget http://www.google.com/search?q=the+price+of+gold "td._dmh < tr" --limit 1
Gold Price Per Ounce$1,075.20$3.90
With JSON output
$ quget http://www.google.com/search?q=the+price+of+gold "td._dmh < tr" --limit 1 --json
[ ]
With JSON compact
$ quget http://www.google.com/search?q=the+price+of+gold "td._dmh < tr" --limit 1 --json --compact
[{"type":"tag","name":"tr","attribs":{},"children":[{"type":"tag","name":"td","attribs":{"class":"_dmh"},"children":[{"data":"Gold Price Per Ounce","type":"text"}]},{"type":"tag","name":"td","attribs":{"class":"_dmh"},"children":[{"data":"$1,075.20","type":"text"}]},{"type":"tag","name":"td","attribs":{"class":"_dmh"},"children":[{"data":"$3.90","type":"text"}]}],"selectorIndex":0}]
With line numbers
$ quget http://www.google.com/search?q=the+price+of+gold "td._dmh < tr" -n
1. Gold Price Per Ounce$1,075.20$3.90
2. Gold Price Per Gram$34.57$0.13
3. Gold Price Per Kilo$34,568.46$125.39
Select at random
$ quget "https://www.reddit.com/r/oneliners/top/?sort=top&t=year" "a.title|colorize" --limit 3 --rand
Ever since I've installed Adblock, all the single girls in my area seem to have lost interest
My poor knowledge of Greek mythology has always been my Achilles' elbow.
Jokes about socialism aren't funny unless you share them with everyone.
Custom template
$ quget http://www.google.com/search?q=the+price+of+gold "td._dmh < tr|yellow" -T "#{{index|incr}} {{type|upcase}} {{name}} has {{children.length}} children: {{.|text}}"
#1 TAG tr has 3 children: Gold Price Per Ounce$1,075.20$3.90
#2 TAG tr has 3 children: Gold Price Per Gram$34.57$0.13
#3 TAG tr has 3 children: Gold Price Per Kilo$34,568.46$125.39
See Markup.js for template help.
Custom HTTP headers
Any options for request can be entered in a relaxed jsonic format using --request-options.
$ quget https://api.github.com/users/moos
Request forbidden by administrative rules. Please make sure your request has a User-Agent header. Check https://developer.github.com
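The GitHub response above asks for a User-Agent header, which request accepts through its `headers` option. The sketch below builds a value for --request-options in strict JSON, on the assumption that the relaxed jsonic format also accepts strict JSON. `requestOptionsArg` and `shellQuote` are hypothetical helpers, and the header value `quget-example` is a placeholder.

```javascript
// Hypothetical helpers for illustration only; not part of quget.
// Builds the value for --request-options from a plain headers object.
function requestOptionsArg(headers) {
  return JSON.stringify({ headers });
}

// Wraps a string in single quotes so a POSIX shell passes it unchanged.
function shellQuote(text) {
  return "'" + text.replace(/'/g, "'\\''") + "'";
}

// 'quget-example' is a placeholder User-Agent value.
const arg = requestOptionsArg({ 'User-Agent': 'quget-example' });
console.log(
  `quget https://api.github.com/users/moos --request-options ${shellQuote(arg)}`
);
```

The printed command line can then be pasted into a shell. A shorter jsonic spelling may also work, but strict JSON avoids guessing at the relaxed syntax.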
Define an alias
$ alias def='function _blah(){ quget https://www.bing.com/search?q=define+$@ "#b_results ol:first-child|bold"; };_blah'
$ def foo
a term used as a universal substitute
Package note
quget relies on a fork of css-select which supplies the matched selector index in the list of matched elements. Since css-select is a dependency of cheerio, quget uses npm shrinkwrap to load the fork. Any update to cheerio requires manually updating the shrinkwrap JSON file. Hopefully npm 3's flat dependency tree will make this workaround unnecessary.
Change log
- 0.4.1 - Add |tag foo pipe.
- 0.4.0 - Add --outfile and --quite options
- 0.3.3 - Update cheerio to 0.22.0 compatible with lodash 4.17
- 0.3.2 - Use moos/cheerio to pick up moos/css-select.
- 0.3.1 - Add npm-shrinkwrap.json back in as it's needed to pick up the right css-select for cheerio
- 0.3.0 - Fix reading multiple URLs from STDIN
- 0.2.4 - Early version
License
(The MIT License)