@gmod/ucsc-hub

0.3.0 • Public • Published

ucsc-hub-js

read and write UCSC track and assembly hub files in node or the browser

Usage

Read about hub.txt, genomes.txt, and trackDb.txt files here: https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html

Parsed files behave like JavaScript Maps. For a hub.txt file, each key is the first word of a line and each value is the rest of that line, like this:

Map {
  "hub" => "UCSCHub",
  "shortLabel" => "UCSC Hub",
  "longLabel" => "UCSC Genome Informatics Hub for human DNase and RNAseq data",
  "genomesFile" => "genomes.txt",
  "email" => "genome@soe.ucsc.edu",
  "descriptionUrl" => "ucscHub.html",
}
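The key/value split described above can be sketched in a few lines of plain JavaScript. This is only an illustration of the resulting data shape, not the package's parser: `parseStanza` is a hypothetical helper, and it skips the validation that `HubFile` performs.

```javascript
// Hypothetical helper (not part of @gmod/ucsc-hub): turn the lines of one
// stanza into a Map of first-word keys to rest-of-line values.
function parseStanza(text) {
  const stanza = new Map()
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim()
    // Skip blank lines
    if (line === '') continue
    // First run of non-whitespace is the key, everything after it is the value
    const match = /^(\S+)\s*(.*)$/.exec(line)
    stanza.set(match[1], match[2])
  }
  return stanza
}

const hubText = [
  'hub UCSCHub',
  'shortLabel UCSC Hub',
  'genomesFile genomes.txt',
  'email genome@soe.ucsc.edu',
].join('\n')

const hub = parseStanza(hubText)
console.log(hub.get('shortLabel'))
// ↳ UCSC Hub
console.log(hub.get('genomesFile'))
// ↳ genomes.txt
```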

genomes.txt and trackDb.txt files are two-level Maps: each key is the value from the first line of a section, and each value is a Map of all the lines in that section. The first example below is a genomes.txt file and the second is a trackDb.txt file:

Map {
  "hg18" => Map {
    "genome" => "hg18",
    "trackDb" => "hg18/trackDb.txt",
  },
  "hg19" => Map {
    "genome" => "hg19",
    "trackDb" => "hg19/trackDb.txt",
  },
  "newOrg1" => Map {
    "genome" => "newOrg1",
    "trackDb" => "newOrg1/trackDb.txt",
    "twoBitPath" => "newOrg1/newOrg1.2bit",
    "groups" => "newOrg1/groups.txt",
    "description" => "Big Foot V4",
    "organism" => "BigFoot",
    "defaultPos" => "chr21:33031596-33033258",
    "orderKey" => "4800",
    "scientificName" => "Biggus Footus",
    "htmlPath" => "newOrg1/description.html",
  },
}

Map {
  "dnaseSignal" => Map {
    "track" => "dnaseSignal",
    "bigDataUrl" => "dnaseSignal.bigWig",
    "shortLabel" => "DNAse Signal",
    "longLabel" => "Depth of alignments of DNAse reads",
    "type" => "bigWig",
  },
  "dnaseReads" => Map {
    "track" => "dnaseReads",
    "bigDataUrl" => "dnaseReads.bam",
    "shortLabel" => "DNAse Reads",
    "longLabel" => "DNAse reads mapped with MAQ",
    "type" => "bam",
  },
}
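As a rough sketch of how text becomes this two-level structure, the plain JavaScript below splits a file into stanzas on blank lines and keys each stanza by the value of its first line. `parseStanzas` is a hypothetical helper written for illustration; it is not the package's implementation and does none of the checks that `GenomesFile` or `TrackDbFile` perform.

```javascript
// Hypothetical helper (not part of @gmod/ucsc-hub): build a Map of Maps from
// text whose stanzas are separated by one or more blank lines.
function parseStanzas(text) {
  const file = new Map()
  for (const block of text.trim().split(/(?:[ \t]*\r?\n){2,}/)) {
    const stanza = new Map()
    for (const line of block.split(/\r?\n/)) {
      const match = /^\s*(\S+)\s*(.*)$/.exec(line)
      if (!match) continue // ignore lines with no key
      stanza.set(match[1], match[2].trim())
    }
    if (stanza.size === 0) continue
    // The value of the stanza's first entry names the stanza, e.g. "hg19"
    const [firstValue] = stanza.values()
    file.set(firstValue, stanza)
  }
  return file
}

const genomesText = [
  'genome hg18',
  'trackDb hg18/trackDb.txt',
  '',
  'genome hg19',
  'trackDb hg19/trackDb.txt',
].join('\n')

const genomes = parseStanzas(genomesText)
console.log([...genomes.keys()])
console.log(genomes.get('hg19').get('trackDb'))
// ↳ hg19/trackDb.txt
```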

Example usage for a "standard" multi-file hub:

const fs = require('fs')
const { HubFile, GenomesFile, TrackDbFile } = require('@gmod/ucsc-hub')

const hubFile = new HubFile(fs.readFileSync('hub.txt', 'utf8'))
console.log(hubFile.get('genomesFile'))
// ↳ genomes.txt

const genomesFile = new GenomesFile(fs.readFileSync('genomes.txt', 'utf8'))
console.log(genomesFile.get('hg19').get('trackDb'))
// ↳ hg19/trackDb.txt

const trackDbFile = new TrackDbFile(fs.readFileSync('hg19/trackDb.txt', 'utf8'))
console.log(trackDbFile.get('dnaseSignal').get('shortLabel'))
// ↳ DNAse Signal

Example usage for a single-file hub:

const fs = require('fs')
const { SingleFileHub } = require('@gmod/ucsc-hub')

// Assumes the constructor takes the file contents as a string, like the
// classes above; a single-file hub keeps every section in one hub.txt
const hub = new SingleFileHub(fs.readFileSync('hub.txt', 'utf8'))
console.log(hub)

API

GenomesFile

Extends RaFile

Class representing a genomes.txt file.

Parameters

  • genomesFile (string | Array<string>) A genomes.txt file as a string (optional, default [])
  • Throws Error Throws if the first line of the genomes.txt file doesn't start with "genome <genome_name>" or if it has invalid entries

HubFile

Extends RaStanza

Class representing a hub.txt file.

Parameters

  • Throws Error Throws if the first line of the hub.txt file doesn't start with "hub <hub_name>", if it has invalid entries, or if it is missing required entries

RaFile

Extends Map

Class representing an ra file. Each file is composed of multiple stanzas, and each stanza is separated by one or more blank lines. Each stanza is stored in a Map with the key being the value of the first key-value pair in the stanza. The usual Map methods can be used on the file. An additional method add() is available to take a raw line of text and break it up into a key and value and add them to the class. This should be favored over set() when possible, as it performs more validity checks than using set().

Parameters

  • raFile (string | Array<string>) An ra file, either as a single string or an array of strings with one stanza per entry. Supports both LF and CRLF line terminators. (optional, default [])

  • options object

    • options.checkIndent boolean [true] - Check whether the stanzas within the file are indented consistently and keep track of the indentation

Properties

  • nameKey (undefined | string) The key of the first line of all the stanzas (undefined if the file has no stanzas yet).
  • Throws Error Throws if an empty stanza is added, if the key in the first key-value pair of each stanza isn't the same, or if two stanzas have the same value for the key-value pair in their first lines.
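The three error conditions listed above can be illustrated with a small Map subclass. This is a sketch under stated assumptions, not the real `RaFile` class: `StanzaIndex` and its error messages are hypothetical, and the actual class does more (for example, indentation tracking).

```javascript
// Hypothetical illustration only (not part of @gmod/ucsc-hub).
class StanzaIndex extends Map {
  constructor() {
    super()
    // Key shared by the first line of every stanza; unset until the first add()
    this.nameKey = undefined
  }

  // Parse one stanza of raw text, validate it, and store it under its name.
  add(stanzaText) {
    const stanza = new Map()
    for (const line of stanzaText.split(/\r?\n/)) {
      const match = /^\s*(\S+)\s*(.*)$/.exec(line)
      if (match) stanza.set(match[1], match[2].trim())
    }
    if (stanza.size === 0) throw new Error('Stanza is empty')
    // The first key-value pair supplies the shared key and the stanza's name
    const [[firstKey, name]] = stanza
    if (this.nameKey === undefined) {
      this.nameKey = firstKey
    } else if (firstKey !== this.nameKey) {
      throw new Error(`Expected first key "${this.nameKey}", got "${firstKey}"`)
    }
    if (this.has(name)) throw new Error(`Duplicate stanza name "${name}"`)
    return this.set(name, stanza)
  }
}

const index = new StanzaIndex()
index.add('genome hg18\ntrackDb hg18/trackDb.txt')

let message
try {
  // Same name as the stanza above, so this is rejected
  index.add('genome hg18\ntrackDb other/trackDb.txt')
} catch (error) {
  message = error.message
}
console.log(message)
```

Calling `set()` directly on such a class would bypass all three checks, which is the reason the description above recommends `add()`.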

RaStanza

Class representing an ra file stanza. Each stanza line is split into its key and value and stored as a Map, so the usual Map methods can be used on the stanza.

Parameters

SingleFileHub

Class representing a "single-file" hub.txt file that contains all the sections of a hub in a single file.

Parameters

TrackDbFile

Extends RaFile

Class representing a trackDb.txt file.

Parameters

  • trackDbFile (string | Array<string>) A trackDb.txt file as a string (optional, default [])
  • options any?
  • Throws Error Throws if "track" is not the first key in each track or if a track is missing required keys

settings

Gets all track entries including those of parent tracks, with closer entries overriding more distant ones

Parameters

  • trackName string The name of a track
  • Throws Error Throws if track name does not exist in the trackDb
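The override order described for settings can be sketched as follows. This is a minimal illustration, not the package's implementation: `resolveSettings` is a hypothetical helper, and it assumes that a track names its parent through a `parent` key whose value may carry a trailing `on`/`off` flag.

```javascript
// Hypothetical helper (not part of @gmod/ucsc-hub): merge a track's entries
// with those of its ancestors, following each stanza's "parent" key.
function resolveSettings(tracks, trackName) {
  if (!tracks.has(trackName)) {
    throw new Error(`Track ${trackName} does not exist`)
  }
  // Collect the chain from the track up to its most distant ancestor
  const chain = []
  const seen = new Set() // guards against parent cycles
  let current = trackName
  while (current !== undefined && tracks.has(current) && !seen.has(current)) {
    seen.add(current)
    const stanza = tracks.get(current)
    chain.push(stanza)
    // Keep only the parent's name, dropping any trailing "on"/"off" flag
    const parent = stanza.get('parent')
    current = parent === undefined ? undefined : parent.split(/\s+/)[0]
  }
  // Apply distant ancestors first so that closer entries overwrite them
  const settings = new Map()
  for (const stanza of chain.reverse()) {
    for (const [key, value] of stanza) settings.set(key, value)
  }
  return settings
}

const tracks = new Map([
  ['composite', new Map([['track', 'composite'], ['type', 'bigWig'], ['visibility', 'dense']])],
  ['child', new Map([['track', 'child'], ['parent', 'composite on'], ['visibility', 'full']])],
])

const settings = resolveSettings(tracks, 'child')
console.log(settings.get('type'))
// ↳ bigWig
console.log(settings.get('visibility'))
// ↳ full
```

Here `type` is inherited from the parent because the child does not define it, while the child's own `visibility` wins over the parent's.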

License

MIT © Generic Model Organism Database Project

Install

npm i @gmod/ucsc-hub

Collaborators

  • teresam856
  • nathandunn
  • rbuels
  • enuggetry
  • cmdcolin
  • garrettjstevens