Entity Relationship Diagram generator for BigTable-style databases like HBase and Accumulo.
Each table is generated from a separate YAML file (.yaml or .yml), parsed based on the format described below. There are four root level elements: table, rowKey, columnFamilies, and connections. Each file must start with a
--- and end with a
table element includes two subproperties:
label element is required, but the
description element is optional and currently unimplemented. For a sample table which holds articles for a news outlet, the
table property might look like this:
table:label: Articledescription: Information about the articles published by our news source # Currently unimplemented
rowKey element includes one subproperty per piece of information that is contained in the element. This format includes three parts: the name (e.g. title), the type (e.g. string), and a description of the property (e.g. The title of the element). The name and type are required, but the description is optional. Each subproperty employs one of the following two formats, depending on whether the description is included or not:
name: type - descriptionname: type
rowKey for an article for a news outlet might look like this:
rowKey:articleTitle: string - The title of the articlearticleAuthor: string - The author of the article
columnFamilies element holds the structure of the table itself and has subproperties which are the names of the column families that are contained in the table. Each column family in turn has a subproperty for each column qualifier which is contained within that column family. The column qualifier properties are structured in the same manner as the properties within
rowKey described above with name, type and description (optional) portions.
columnFamilies property from an article for a news outlet:
columnFamilies:Content:rawText: string - The raw text of the articlehtmlText: string - The html-formatted text of the articleMetadata:author: string - The author of the articlepostTime: long - The UNIX timestamp for the post-time of the articlelastUpdateTime: long - The UNIX timestamp for the time when the article was last updated
Column families with no column qualifiers are also acceptable, though unusual in practice.
connections element holds a set of properties which show the relationships between this table and other tables. Each connection to another table is specified by a subproperty which holds the name of the table to connect to and a description of the relationship between them. They follow the format:
tableName: quantifier -> quantifier
Each quantifier can hold one of three values,
none. Anything else is assumed to be
none, and no arrowhead is drawn. Going back to our article example, if we had another table holding multiple
Comments on each Article, there would a one to many relationship between the
Comment tables, and the following connections property in the
connections:Comment: one -> many
This poses a potential problem. If I identify the relationship from the opposite direction in my
Comment table, would there be two relationships drawn? In short, no. The two are identified as duplicates, and only one arrow is drawn.
Overall Article YAML File
---table:label: Articledescription: Information about the articles published by our news source # Currently unimplementedrowKey:articleTitle: string - The title of the articlearticleAuthor: string - The author of the articlecolumnFamilies:Content:rawText: string - The raw text of the articlehtmlText: string - The html-formatted text of the articleMetadata:author: string - The author of the articlepostTime: long - The UNIX timestamp for the post-time of the articlelastUpdateTime: long - The UNIX timestamp for the time when the article was last updatedconnections:Comment: one -> many...