Validate CSV files and convert them to metadata from the comfort of your CouchDB.
And you thought metadata wasn't any fun. This application helps find and correct common mistakes in metadata records.
You need to have two prerequisite software packages installed: Node.js and CouchDB. Fortunately, they are not hard to install.
You can find an installer package at http://nodejs.org/download that is appropriate for your computer. For our Windows users, I would suggest picking the appropriate one of these:
Just download and run the installer. You can verify that it worked by opening a command prompt and typing node. If there are no errors and the prompt changes to a >, then it worked.
You can find an installer package at https://couchdb.apache.org/#download that is appropriate for your computer. After you click one of the links, the site will ask you to choose a "mirror"; just take the first one that it suggests for you. For our Windows users, this will get you an .exe file that you can execute to install CouchDB.
During installation, choose the default options to run CouchDB as a service, and to start the service automatically. Once the installation is complete, you can verify that it installed correctly by pointing your web browser to http://localhost:5984/_utils.
You can also install these prerequisites in other ways (Homebrew, apt-get, or whatever), as long as you know how to use them properly.
Open a command prompt. You'll have to use the command prompt, because it's good for you and it is totally not hard at all.
Type npm install -g cushions. You can copy/paste if you'd like to, but you might find typing to be more rewarding.
You'll still need that command prompt open, and you'll need a CSV file that at least tries to conform to our compilation template. You'll need to know the path to your CSV file. Something like C:\Users\Ryan\Documents\my-csv-file.csv is probably what you'll be looking for.
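If you want to double-check the path and peek at the header row before loading, something like the following Node.js sketch might help. It is not part of Cushions: the file name peek-csv.js and the helpers splitCsvLine and summarizeCsv are made up for illustration, and the parser is deliberately naive (it copes with quoted commas, but not with quoted fields that span several lines).

```javascript
// peek-csv.js (hypothetical helper, not part of Cushions)
// Usage: node peek-csv.js C:\Users\Ryan\Documents\my-csv-file.csv
const fs = require('fs');

// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Summarize CSV text: header column names and the number of non-empty data lines.
function summarizeCsv(text) {
  const lines = text.split(/\r?\n/).filter(function (l) { return l.trim() !== ''; });
  return {
    columns: lines.length > 0 ? splitCsvLine(lines[0]) : [],
    rows: Math.max(lines.length - 1, 0)
  };
}

// Only touch the disk when a path was actually given on the command line.
const csvPath = process.argv[2];
if (csvPath) {
  if (!fs.existsSync(csvPath)) {
    console.log('No file found at ' + csvPath);
  } else {
    const summary = summarizeCsv(fs.readFileSync(csvPath, 'utf8'));
    console.log('Columns: ' + summary.columns.join(' | '));
    console.log('Data rows (approximate): ' + summary.rows);
  }
}
```

The row count is only approximate, since a quoted field containing a line break would be counted as two rows.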
Now that you've got that in place, be sure you've got CouchDB turned on (check that http://localhost:5984 gives you a little JSON).
Okay, so load some data:
stuffing -d your-new-database -f path-to-your-csv
So, that might end up looking like this:
stuffing -d stanford -f C:\Users\Ryan\Documents\stanford-metadata.csv
This will create a database called stanford and load records into it from the CSV file at the path you passed to -f.
Loading will take a little bit of time, depending on the number of records. It will let you know how long it took when it's done. Expect it to take about 30 seconds for about 13,000 records.
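For a back-of-the-envelope guess at other file sizes, you can scale that figure. The sketch below assumes loading time grows linearly with the number of records, which is only an assumption; the function name estimateLoadSeconds is made up for illustration.

```javascript
// Rough load-time estimate, assuming time grows linearly with record count.
// Reference point: about 13,000 records in about 30 seconds.
const REFERENCE_RECORDS = 13000;
const REFERENCE_SECONDS = 30;

function estimateLoadSeconds(recordCount) {
  return recordCount * REFERENCE_SECONDS / REFERENCE_RECORDS;
}

// 50,000 records works out to roughly 115 seconds under this assumption.
console.log(Math.round(estimateLoadSeconds(50000)) + ' seconds for 50,000 records (rough guess)');
```

Your own machine and data will vary, so treat the number as a hint about whether to go get coffee, nothing more.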
You need to turn on Cushions now. From a command prompt type:
If you see info - socket.io started, then it worked. Leave that command prompt alone while you work.
Now Cushions is ready to protect you: your data is loaded into CouchDB, and it is ready to be validated. Go to this URL in your web browser:
So, in my case:
This will bring up a website that will begin validating your records. You'll see a list of the criteria that are being used for validation. As the validation completes, it will tell you, for each criterion, how many records are valid and how many are invalid.
You can click on the results to see a JSON document that describes which records are valid or invalid. You should definitely use a browser extension that will make the JSON prettier so you can read it.
If the record was invalid, you might get a report describing why it was invalid, like:
"problem": "The title was missing"
"problem": "The publication date was invalid"
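If you save one of those JSON documents, a few lines of Node.js can count how often each problem shows up. This is only a sketch: the exact layout of the report is not described here, so the code assumes an array of objects in which invalid records carry a "problem" string. The id field, the sample data, and the function name tallyProblems are all made up for illustration.

```javascript
// tally-problems.js (hypothetical helper, not part of Cushions)
// Assumed shape: an array of objects, where invalid records carry a "problem" string.
function tallyProblems(records) {
  const counts = {};
  records.forEach(function (record) {
    if (typeof record.problem === 'string') {
      counts[record.problem] = (counts[record.problem] || 0) + 1;
    }
  });
  return counts;
}

// Made-up sample data for illustration only.
const sampleReport = [
  { id: 'record-1', problem: 'The title was missing' },
  { id: 'record-2', problem: 'The publication date was invalid' },
  { id: 'record-3', problem: 'The title was missing' },
  { id: 'record-4' } // a valid record: no "problem" field
];

console.log(tallyProblems(sampleReport));
```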
If you're lucky, your problem can be automatically corrected. In that case, there will be a suggestion which indicates what Cushions thinks your metadata compiler might have meant. You can use Cushions to automatically apply these suggestions to your metadata content.
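To get a feel for how much Cushions can fix for you versus how much you'll fix by hand, you could sort the invalid records by whether they carry a suggestion. As before, this is a sketch under assumptions: the report layout (objects with optional "problem" and "suggestion" fields), the id field, the sample values, and the function name splitBySuggestion are hypothetical.

```javascript
// Assumed shape: invalid records carry "problem" and, when an automatic fix
// is available, a "suggestion" as well.
function splitBySuggestion(records) {
  const fixable = [];
  const manual = [];
  records.forEach(function (record) {
    if (record.problem === undefined) {
      return; // valid record, nothing to do
    }
    if (record.suggestion !== undefined) {
      fixable.push(record);
    } else {
      manual.push(record);
    }
  });
  return { fixable: fixable, manual: manual };
}

// Made-up sample data for illustration only.
const sampleRecords = [
  { id: 'record-1', problem: 'The publication date was invalid', suggestion: 'a made-up corrected value' },
  { id: 'record-2', problem: 'The title was missing' },
  { id: 'record-3' }
];

const result = splitBySuggestion(sampleRecords);
console.log(result.fixable.length + ' can be corrected automatically, ' +
            result.manual.length + ' need a human.');
```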
- Go to the validation page:
- Push the Apply Automatic Corrections button.
- Wait a little while. Not too long; it will tell you when it's done.
Read some other documentation that I haven't written just yet.