genehood-cli

Command-line interface to generate GeneHood datasets.

Dependencies

GeneHood needs Node.js version 10+ and ncbi-tools+ version 2.6+ to run.

Install

npm install -g genehood-cli

Usage

GeneHood uses the MiST3 API to collect the information needed for the analysis. Thus, the only inputs required from the user are:

A list of reference genes
The number of upstream and downstream genes to include in the analysis
A phylogenetic tree in Newick format (optional)

Reference Genes

GeneHood reads a list of reference genes from the user and looks up the genes upstream and downstream of each one on MiST3.

For this reason, GeneHood identifies genes with the MiST3 standard identifier, called a stable id.

A stable id is a composite of the NCBI genome version and the locus number of the gene.

Here are some examples:

MiST3 stable id	description
`GCF_000005845.2-b4355`	Chemoreceptor tsr(b4355) from Escherichia coli str. K-12 substr. MG1655
`GCF_000006765.1-PA5040`	Secretin pilQ(PA5040) from Pseudomonas aeruginosa PAO1
`GCF_000006905.1-CC_2066`	Part of the L-ring flgH(CC_2066) from Caulobacter crescentus CB15
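
To make the composite structure concrete, here is a minimal TypeScript sketch that splits a stable id into its two parts. It assumes the separator is the first hyphen, which holds for the examples above but is not confirmed by MiST3 documentation here. The names `StableIdParts` and `splitStableId` are hypothetical and are not part of genehood-cli.

```typescript
// Hypothetical helper, not part of genehood-cli.
interface StableIdParts {
  genomeVersion: string; // e.g. "GCF_000005845.2"
  locus: string;         // e.g. "b4355"
}

// Splits at the FIRST hyphen (assumption: genome versions contain no hyphen,
// as in the examples above). Returns null when either part would be empty.
function splitStableId(stableId: string): StableIdParts | null {
  const cut = stableId.indexOf("-");
  if (cut <= 0 || cut === stableId.length - 1) {
    return null;
  }
  return {
    genomeVersion: stableId.slice(0, cut),
    locus: stableId.slice(cut + 1),
  };
}
```

For instance, `splitStableId("GCF_000006905.1-CC_2066")` separates the genome version `GCF_000006905.1` from the locus `CC_2066`.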

Performing GeneHood analysis

Once genehood-cli is installed globally (-g option), npm provides an executable called genehood.

genehood takes one argument, the name of the project (in this example myProject), and a mandatory --action flag with four possible values:

value	description
`init`	Initializes the configuration and data file for the project
`run`	Starts a new run from an existing configuration file
`keepGoing`	Restarts a run from the last successful step of the analysis pipeline
`cleanUp`	Deletes the temporary files generated by GeneHood

Step 1: Initialize the project

To start a new analysis, we must initialize a new project.

genehood myProject --action init

This command will generate two files:

myProject.geneHood.config.json
myProject.geneHood.data.json.gz

Now, we must edit the config file to tell GeneHood for which genes it should collect gene neighborhood information.

Step 2: Edit the config file to set initial parameters

genehood-cli version 0.2.8 has flags that facilitate this process; see the alternative Step 2 below.

There are several parts in the GeneHood config file, but what matters is under the section user. There we will find four sub-sections:

section	description
`settings`	This is where all the input data goes
`newickTree`	This is where we should add a Newick tree (optional)
`startingStep`	For advanced users who want to start from a step other than the default
`stopStep`	For advanced users who do not want to run the entire pipeline

Let's focus on the settings section first. It has three fields that need user input and one that is pre-filled:

section	description
`stableIds`	This is where we will add reference genes using MiST3 stable ids
`upstream`	Integer of how many genes should be collected upstream from the reference gene
`downstream`	Integer of how many genes should be collected downstream from the reference gene
`geneHoodPrefix`	This is pre-filled with the name of the GeneHood project.

For example, let us add as reference genes the _cheA_ genes from the three chemosensory systems in Vibrio cholerae:

system	stable Ids
F6	`GCF_000006745.1-VC2063`
F7	`GCF_000006745.1-VCA1095`
F9	`GCF_000006745.1-VC1397`

Let us also include 15 genes upstream and 15 genes downstream of each reference gene.

To do that, we can edit the config file using any text editor.

The user section of the config file will be something like this:

"user": {
 "newickTree": "",
 "settings": {
  "downstream": 15,
  "geneHoodPrefix": "vibrio",
  "stableIds": [
   "GCF_000006745.1-VC1397",
   "GCF_000006745.1-VC2063",
   "GCF_000006745.1-VCA1095"
  ],
  "upstream": 15
 },
 "startingStep": "fetchData",
 "stopStep": ""
}
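
As a rough aid when editing by hand, the shape of the user section can be written down as TypeScript types, together with a small sanity check. This is only a sketch inferred from the example above: the interface names and `checkUserSection` are hypothetical, and the rules it enforces (non-negative integers, at least one stable id) are assumptions rather than documented genehood-cli behavior.

```typescript
// Types mirror the example "user" section above; names are illustrative only.
interface GeneHoodSettings {
  downstream: number;
  geneHoodPrefix: string;
  stableIds: string[];
  upstream: number;
}

interface GeneHoodUserSection {
  newickTree: string;
  settings: GeneHoodSettings;
  startingStep: string;
  stopStep: string;
}

// True for 0, 1, 2, ... and false for negatives, fractions, NaN and Infinity.
function isNonNegativeInteger(n: number): boolean {
  return isFinite(n) && Math.floor(n) === n && n >= 0;
}

// Hypothetical check: returns a list of problems; an empty list means
// nothing suspicious was found under the assumptions stated above.
function checkUserSection(user: GeneHoodUserSection): string[] {
  const problems: string[] = [];
  if (!isNonNegativeInteger(user.settings.upstream)) {
    problems.push("upstream should be a non-negative integer");
  }
  if (!isNonNegativeInteger(user.settings.downstream)) {
    problems.push("downstream should be a non-negative integer");
  }
  if (user.settings.stableIds.length === 0) {
    problems.push("stableIds should list at least one reference gene");
  }
  return problems;
}
```

Parsing the edited file with `JSON.parse` and passing its `user` property to `checkUserSection` would catch slips such as a fractional range or an empty identifier list before starting a run.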

Save the file and proceed to the next step.

Step 2 (alternative): Set parameters using flags.

We can set the number of genes downstream and upstream using the flag --addRange.

We can add the identifiers to a text file (one identifier per line) and pass it to genehood using the flag --addStableIds.

If we put the identifiers into a file named vibrioIds.txt, we can accomplish the same setup as before by typing:

genehood myProject --addRange 15 15 --addStableIds vibrioIds.txt
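
The one-identifier-per-line format is easy to produce and to check programmatically. The sketch below reads the text of such a file into a list of identifiers. `parseStableIdList` is a hypothetical helper, and trimming whitespace, skipping blank lines, and dropping duplicates are assumptions made here for tidiness, not documented behavior of --addStableIds.

```typescript
// Hypothetical helper, not part of genehood-cli.
// Turns the text of an identifier file (one stable id per line) into a list,
// trimming whitespace, skipping blank lines, and dropping repeated ids.
function parseStableIdList(text: string): string[] {
  const ids: string[] = [];
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const id = lines[i].trim();
    if (id.length > 0 && ids.indexOf(id) === -1) {
      ids.push(id);
    }
  }
  return ids;
}

// Inverse direction: builds the file content from a list of stable ids.
function formatStableIdList(ids: string[]): string {
  return ids.join("\n") + "\n";
}
```

For the Vibrio cholerae example, `formatStableIdList` applied to the three stable ids gives the content of vibrioIds.txt.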

Step 3: Running GeneHood

Make sure we have an Internet connection and that blastp and makeblastdb are available as executables on our system.

Then run:

genehood myProject --action run

That is it. GeneHood should do all the rest.

Step 4: Clean up

If everything goes as expected, we should have a file called myProject.geneHood.pack.json.gz in our directory. The directory will probably also contain a number of other files that GeneHood used temporarily.

We can safely remove these temp files using the action cleanUp from genehood:

genehood myProject --action cleanUp

GeneHood removes all files except two: the config file and the pack file. Keeping both is a little redundant, since GeneHood's pack also contains the config. We made it this way so that the user can easily see how the analysis was run, or re-run it with a few changes to the config file if needed.

Now we just need to visualize the data.

Optional step 4.5: Add Phylogeny

We can add a phylogeny (in Newick format) to the config file at any time, and genehood-cli has a helper option for this: --addPhylogeny. If we add the phylogeny after the pack has been built, genehood-cli will repack the file for us.

Adding a phylogeny lets the viewer order the gene clusters following the order of the phylogenetic tree. The tree can be built in any way: from a single gene, from multiple concatenated genes, etc. However, for the viewer to work, the names of the leaves need to be exactly the same as the identifiers of the reference genes.

To add a new phylogeny:

genehood myProject --addPhylogeny myPhylogeny.nwk
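
Since the leaf names must match the reference gene identifiers exactly, it can help to compare the two before loading the pack in the viewer. The following is a minimal sketch of such a check, not something genehood-cli provides: `newickLeafNames` and `compareLeavesToStableIds` are hypothetical names, and the scanner only handles unquoted Newick labels without bracketed comments.

```typescript
// Hypothetical helper. Collects leaf labels from a Newick string.
// A label counts as a leaf when it follows "(" or "," (or starts the string);
// labels after ")" are internal node names and labels after ":" are branch lengths.
// Assumption: labels are unquoted and the tree has no [comments].
function newickLeafNames(newick: string): string[] {
  const structural = "(),:;";
  const leaves: string[] = [];
  let prev = "";
  let i = 0;
  while (i < newick.length) {
    const ch = newick.charAt(i);
    if (structural.indexOf(ch) !== -1) {
      prev = ch;
      i++;
      continue;
    }
    // Read a token up to the next structural character.
    let j = i;
    while (j < newick.length && structural.indexOf(newick.charAt(j)) === -1) {
      j++;
    }
    const token = newick.slice(i, j).trim();
    if (token.length > 0 && (prev === "(" || prev === "," || prev === "")) {
      leaves.push(token);
    }
    i = j;
  }
  return leaves;
}

interface LeafComparison {
  missingFromTree: string[]; // stable ids with no matching leaf
  unknownLeaves: string[];   // leaves that match no stable id
}

// Reports mismatches in both directions between tree leaves and stable ids.
function compareLeavesToStableIds(newick: string, stableIds: string[]): LeafComparison {
  const leaves = newickLeafNames(newick);
  return {
    missingFromTree: stableIds.filter((id) => leaves.indexOf(id) === -1),
    unknownLeaves: leaves.filter((leaf) => stableIds.indexOf(leaf) === -1),
  };
}
```

If both lists in the result are empty, every reference gene has a leaf with the same name and the tree contains no extra leaves.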

Step 5: Load the data on genehood.io

Open GeneHood in a web browser and load the file myProject.geneHood.pack.json.gz.

Now just explore the data.

To learn more about the GeneHood viewer, go to genehood.io and click on Demo.

Developers Documentation

... to be continued.

Written with ❤ in Typescript.
