flusight-csv-tools
Node toolkit for CDC FluSight format CSVs. Full documentation here. Provides features for:
- Parsing CSVs (`fct.Csv` class)
- Verifying CSVs (`fct.verify` module)
- Scoring targets (`fct.score` module)
- Fetching true values (`fct.truth` module)
- Metadata related to CDC FluSight (`fct.meta` module)
- Utilities for working with
  - Bin distributions (`fct.utils.bins` module)
  - Time and epiweeks (`fct.utils.epiweek` module)
Quickstart
# Install from npm
npm i flusight-csv-tools
// Read a csv
const fct = require('flusight-csv-tools')
let csv = new fct.Csv('./test/data/sample.csv', 201720, 'model-name')
// Verify
fct.verify.verifyHeaders(csv)
fct.verify.verifyPoint(csv)
fct.verify.verifyProbabilities(csv)
// Score
fct.score.score(csv).then(d => ...)
Data representation
A CSV ingested by flusight-csv-tools uses the following standards for representing information:
- HHS Regions are referred to as `nat` (for 'US National') or `hhs1`, `hhs2` ... for 'HHS Region 1', 'HHS Region 2' and so on.
- Week-ahead targets are referred to as `1-ahead`, `2-ahead`, `3-ahead` and `4-ahead`, while seasonal targets are `peak` (peak wILI value), `peak-wk` and `onset-wk`.
. - A season 20xx-20yy is represented using a single number 20xx (the first year of a season).
- Week values are never passed around on their own. They always appear as epiweeks of the form YYYYWW, where YYYY is the year and WW is the MMWR week.
- An epidemic season 20xx contains all weeks in the set [20xx30, 20yy29], where 20yy = 20xx + 1.
- CSVs are scored using the latest available data and do not, as of yet, use the data available at the time the predictions were made.
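
The conventions above can be sketched in a few lines of plain JavaScript. The helper names below (`REGION_IDS`, `TARGET_IDS`, `splitEpiweek`, `seasonOfEpiweek`, `seasonBounds`) are hypothetical and are not part of the flusight-csv-tools API. They only illustrate how the identifiers and the YYYYWW season arithmetic fit together, assuming epiweeks are plain numbers and that the regions are `nat` plus the ten HHS regions.

```javascript
// Hypothetical helpers, NOT the library's API: a sketch of the conventions above.

// Region identifiers: 'nat' plus 'hhs1' ... 'hhs10' (assumes the ten HHS regions).
const REGION_IDS = ['nat', ...Array.from({ length: 10 }, (_, i) => `hhs${i + 1}`)]

// Target identifiers as listed above.
const TARGET_IDS = ['1-ahead', '2-ahead', '3-ahead', '4-ahead', 'peak', 'peak-wk', 'onset-wk']

// Split a numeric epiweek YYYYWW into its year and week parts.
function splitEpiweek (epiweek) {
  return { year: Math.floor(epiweek / 100), week: epiweek % 100 }
}

// Season (first year, 20xx) that an epiweek belongs to:
// weeks 30 and later fall in the season starting that year,
// weeks 29 and earlier fall in the season that started the previous year.
function seasonOfEpiweek (epiweek) {
  const { year, week } = splitEpiweek(epiweek)
  return week >= 30 ? year : year - 1
}

// First and last epiweek of season 20xx, i.e. [20xx30, 20yy29].
function seasonBounds (season) {
  return { first: season * 100 + 30, last: (season + 1) * 100 + 29 }
}

console.log(seasonOfEpiweek(201720)) // 2016
console.log(seasonOfEpiweek(201745)) // 2017
console.log(seasonBounds(2017)) // { first: 201730, last: 201829 }
```

Under these assumptions, the epiweek `201720` used in the Quickstart falls in season 2016, since week 20 comes before week 30 of 2017.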