New Horizons Science Operations Center Lyre (hereafter, "nhsoc_lyre") is a command line script for pulling down LORRI image metadata from the New Horizons project website. When this was written, the New Horizons spacecraft was well on its way to the Pluto System. After completing a flyby, it will proceed to one or more targets in the Kuiper Belt.
This program isn't officially affiliated with or endorsed by the New Horizons project.
- Parses the HTML content.
- Analyzes links to determine the number of pages.
- Extracts relevant data from the sandboxed environment.
- Decodes any encoding.
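A few of the steps above might look roughly like the following sketch. It is not nhsoc_lyre's actual code: the `page=N` query parameter, the sample markup, and the function names `decodeEntities`, `extractLinks`, and `countPages` are all assumptions made for illustration, and a real HTML parser would be more robust than the regular expressions used here.

```javascript
// Hypothetical sketch: none of these names come from nhsoc_lyre itself.

// Decode the handful of HTML entities likely to appear in link attributes.
function decodeEntities(text) {
  const entities = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'" };
  return text.replace(/&(?:amp|lt|gt|quot|#39);/g, (match) => entities[match]);
}

// Collect the (decoded) href value of every anchor tag in the markup.
function extractLinks(html) {
  const links = [];
  const pattern = /<a\s[^>]*href="([^"]*)"/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    links.push(decodeEntities(match[1]));
  }
  return links;
}

// Assumption: pagination links carry a "page=N" query parameter. The highest
// N seen is taken as the number of pages (1 if there are no such links).
function countPages(html) {
  let pages = 1;
  for (const link of extractLinks(html)) {
    const found = /[?&]page=(\d+)/.exec(link);
    if (found) {
      pages = Math.max(pages, parseInt(found[1], 10));
    }
  }
  return pages;
}

// Made-up sample markup; the real website's link format may differ.
const sample = '<a href="list.php?sort=date&amp;page=2">2</a>' +
               '<a href="list.php?sort=date&amp;page=17">Last</a>';
console.log(countPages(sample)); // prints 17
```

Decoding the entities before inspecting the link matters here: in the raw markup the parameter appears as `&amp;page=17`, which the `page=N` pattern would not match.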
You'll need to install Node.js and npm before you can install nhsoc_lyre.
Once you have Node.js and npm, you can choose to install nhsoc_lyre automatically, or from sources.
nhsoc_lyre has been tested on Linux and Mac OS X.
On some Linux releases, you may need to alias the node command to nodejs to successfully install dependencies. If typing node doesn't start up a Node.js REPL (press Ctrl-C twice to get out if it does), then run
sudo ln -s /usr/bin/nodejs /usr/bin/node
Although the core code should run fine on Windows from the Node.js REPL, the command line scripts that wrap it do not yet.
Change into a convenient directory and type
npm install nhsoc_lyre
This will install nhsoc_lyre under ./node_modules/ relative to this location. This new directory will in turn contain a .bin sub-directory. You should either prefix the get_nhsoc_image_metadata script with a relative path to this sub-directory from the current directory at execution time, or add it to your system path.
Installation from Sources
To install nhsoc_lyre from sources, download the source archive from GitHub, or (after first installing git if necessary) clone the project directly from GitHub.
Now change into the nhsoc_lyre project directory and type
Now you should be able to run the get_nhsoc_image_metadata script in ./bin, or add this sub-directory to your system path.
You shouldn't receive any error messages or be prompted to escalate shell privileges. Should either of these things happen, Node.js may not have weathered an operating system upgrade applied since its installation. You may be able to work around this by renaming or deleting the .npm sub-directory in your home directory.
To run the download script, type
(prefixing with the appropriate relative path if necessary; see Installation, above)
You will probably find it useful to direct nhsoc_lyre's output to a file
./node_modules/.bin/get_nhsoc_image_metadata > pluto_images.csv
The default output format is CSV. If you prefer, you can specify JSON or XML
./node_modules/.bin/get_nhsoc_image_metadata > pluto_images.json --format=JSON
If you'd prefer a single page of data, provide that page's number as an argument
./node_modules/.bin/get_nhsoc_image_metadata > nhsoc_pluto_images.csv --page=1
If you elect to pull down metadata for all images on the project website, you should expect it to take a minute or longer.
Although nhsoc_lyre doesn't retrieve actual images, it nevertheless can request a large amount of data. Please remember that this data and the server that hosts it are provided for everyone's benefit and use them respectfully.
Files downloaded as CSV can be compared with one another to take an inventory of changes
./node_modules/.bin/diff_nhsoc_image_metadata_files file_1.csv file_2.csv
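To give a feel for what such an inventory involves, here is a minimal sketch of comparing two CSV downloads. It is not the logic of diff_nhsoc_image_metadata_files: the function names `indexRows` and `diffInventories`, the sample column names, and the assumptions that the first column uniquely identifies an image and that no field contains a quoted comma are all hypothetical.

```javascript
// Hypothetical sketch: not the logic of diff_nhsoc_image_metadata_files.

// Index CSV rows by their first column. Assumes a header row and no quoted
// fields containing commas or newlines.
function indexRows(csvText) {
  const rows = new Map();
  const lines = csvText.trim().split(/\r?\n/).slice(1); // skip the header
  for (const line of lines) {
    if (line === '') continue;
    const key = line.split(',')[0];
    rows.set(key, line);
  }
  return rows;
}

// Report which keys were added, removed, or changed between two inventories.
function diffInventories(oldCsv, newCsv) {
  const before = indexRows(oldCsv);
  const after = indexRows(newCsv);
  const result = { added: [], removed: [], changed: [] };
  for (const [key, line] of after) {
    if (!before.has(key)) result.added.push(key);
    else if (before.get(key) !== line) result.changed.push(key);
  }
  for (const key of before.keys()) {
    if (!after.has(key)) result.removed.push(key);
  }
  return result;
}

// Made-up sample data; the real column names may differ.
const older = 'name,exposure\nimg_001,100\nimg_002,150\n';
const newer = 'name,exposure\nimg_001,100\nimg_002,200\nimg_003,50\n';
console.log(diffInventories(older, newer));
```

With the sample data above, `img_003` is reported as added and `img_002` as changed, while nothing is removed.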
Inventories can be generated either in HTML or CSV format by providing a
As before, you may find it useful to redirect this script's output to a file.
./node_modules/.bin/diff_nhsoc_image_metadata_files file_1.csv file_2.csv > diff_1_2.html
The default HTML Table caption can be overridden by specifying a
To run a lint check and automated tests on nhsoc_lyre, enter
npm run test
This test suite shouldn't need a working network connection and won't attempt to contact the New Horizons project website.
newhorizonsbot, a Python tool to pull down and tweet images from the New Horizons website.
Frequently Asked Questions
Why is the script named "New Horizons Science Operations Center Lyre"?
The lyre reference is an allusion to the myth of Orpheus.
Why are the acquisition date-times in UTC, while the Last Modified date-times are in GMT?
It's typical for Last-Modified HTTP headers to be expressed in GMT. For new applications, GMT's use has largely been superseded by UTC, which is nearly but not exactly the same. For most purposes the difference can be ignored, and indeed the server may well be misrepresenting UTC as GMT.
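In practice the two notations can be treated alike when parsing. The sketch below uses only the built-in JavaScript `Date` object. Both timestamp values are made up for illustration and are not taken from the project website.

```javascript
// A sample Last-Modified value in the HTTP date format (made up for illustration).
const lastModified = 'Tue, 14 Jul 2015 11:49:57 GMT';

// The "GMT" suffix is parsed as a zero offset from UTC.
const parsed = new Date(lastModified);

// toISOString always reports UTC, marked by the trailing "Z".
console.log(parsed.toISOString()); // prints 2015-07-14T11:49:57.000Z

// A hypothetical acquisition time written explicitly as UTC yields the same instant.
const acquired = new Date('2015-07-14T11:49:57Z');
console.log(acquired.getTime() === parsed.getTime()); // prints true
```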