smoothish -- Smoothing out time-series data with boundary and missing data handling
The Smoothish JavaScript library provides variations of centered moving-average functions that are robust to missing data and do not drop points at the beginning and end boundaries.
Installation and import
Installation:
npm install smoothish
Import (classic):
const smoothish = require('smoothish')
or (modern):
import smoothish from 'smoothish'
Basic Usage
Consider the following time series of twelve values:
//                    Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
const daysPerMonth = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
This makes for a rather jagged graph:
We can apply the smoothish function to smooth out the data:
const smoothed = smoothish(daysPerMonth)
// --> [ 30.1, 29.7, 30.1, 30.3, 30.4, 30.5, 30.6, 30.6, 30.5, 30.6, 30.5, 30.7 ]
Handling missing data
Consider an array of data that has some missing data:
const incompleteDaysPerMonth = [31, 28, undefined, 30, 31, null, 31, 31, null, 31, 30, 31]
The smoothing function bridges over the missing points:
const smoothedIncomplete = smoothish(incompleteDaysPerMonth)
// --> [ 30.0, 29.4, 29.8, 30.1, 30.5, 30.6, 30.8, 30.8, 30.8, 30.7, 30.6, 30.7 ]
Here is another example: a linearly increasing set of numbers with some values missing.
const linear = [undefined, 200, 300, undefined, 500, undefined, 700, 800, 900]
The smoothish function fills in the interior missing data points, though note that it does not extrapolate missing values at the beginning and end.
const smoothedLinear = smoothish(linear)
// --> [ undefined, 200, 300, 400, 500, 600, 700, 800, 900 ]
Changing the radius
All of the above examples use the default radius of 2, which means the smoothing is similar to a 5-point moving average (using a neighborhood that includes the center point and two points on either side).
We can specify a different radius in the optional second argument to smoothish.
This is best seen with a step function that abruptly changes value:
const stepFunction = [100, 100, 100, 100, 100, 200, 200, 200, 200, 200]
Setting the radius to 1 produces some smoothing:
const radius1 = smoothish(stepFunction, { radius: 1 })
// --> [ 98.6, 99.9, 102, 109, 126, 174, 191, 198, 200, 201 ]
Increasing the radius to 2 (the default) increases the smoothing.
const radius2 = smoothish(stepFunction, { radius: 2 })
// --> [ 91.6, 98.1, 106, 118, 136, 164, 182, 194, 202, 208 ]
And increasing the radius to 3 increases the smoothing further.
const radius3 = smoothish(stepFunction, { radius: 3 })
// --> [ 87.5, 97.0, 108, 121, 138, 162, 179, 192, 203, 212 ]
The Least-Squares algorithm
By default smoothish fits a least-squares line to the neighboring values around each point, and replaces each point with the value of that line at the point.
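To make the idea of a local least-squares fit concrete, here is a minimal sketch of a weighted straight-line fit evaluated at one position. The helper name fitLineAt is hypothetical and is not part of the smoothish API, and the library's actual implementation may differ; the sketch only illustrates the general technique, using equal weights on the known points of the linear example above.

```javascript
// Hypothetical helper, not part of the smoothish API.
// Fits y = intercept + slope * x by weighted least squares
// and returns the value of the fitted line at x0.
function fitLineAt (xs, ys, ws, x0) {
  let sw = 0
  let swx = 0
  let swy = 0
  let swxx = 0
  let swxy = 0
  for (let k = 0; k < xs.length; k++) {
    const w = ws[k]
    sw += w
    swx += w * xs[k]
    swy += w * ys[k]
    swxx += w * xs[k] * xs[k]
    swxy += w * xs[k] * ys[k]
  }
  const denom = sw * swxx - swx * swx
  // With fewer than two distinct x values the slope is undetermined,
  // so fall back to the weighted mean of the y values.
  if (Math.abs(denom) < 1e-12) return swy / sw
  const slope = (sw * swxy - swx * swy) / denom
  const intercept = (swy - slope * swx) / sw
  return intercept + slope * x0
}

// Known points of the linear example (indices 1, 2, 4, 6, 7, 8).
const xs = [1, 2, 4, 6, 7, 8]
const ys = [200, 300, 500, 700, 800, 900]
const ws = xs.map(() => 1) // equal weights, an assumption for this illustration
const atIndex3 = fitLineAt(xs, ys, ws, 3) // approximately 400, on the straight line
```

Because the known points lie exactly on a line, the fitted value at the missing index 3 falls on that same line, which is why a least-squares approach can bridge gaps in linear data without distortion.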
Exponential falloff
By default the "neighboring points" are all the points, but closer points carry more weight: the weight decays exponentially in both directions with a time constant of radius.
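As a rough illustration of such a falloff, the sketch below computes weights of the form exp(-distance / radius) around a center index. The function name exponentialWeights is hypothetical, and the exact formula is an assumption based on the usual meaning of "time constant"; the library's internal weighting may be normalized or computed differently.

```javascript
// Hypothetical helper, not part of the smoothish API.
// Assumed weight formula: exp(-|i - center| / radius).
function exponentialWeights (length, center, radius) {
  return Array.from({ length }, (_, i) =>
    Math.exp(-Math.abs(i - center) / radius)
  )
}

// Seven points, centered on index 3, with the default radius of 2.
const weights = exponentialWeights(7, 3, 2)
// The center has weight 1, and the weight drops by a factor of e
// for every `radius` steps away from the center, in both directions.
```

Under this assumption every point contributes something, but points more than a few radii away contribute very little.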
Using different algorithms and falloffs
As an alternative to the least-squares based smoothing, you can have smoothish use moving-average smoothing by adding an algorithm: 'movingAverage' property to the optional second parameter.
And as an alternative to the exponential falloff you can set falloff: 'step' to include only the points within radius and to have them equally weighted.
So, for example, to get a standard five-point moving average, you can use the following. (A radius of 2 means that 2 previous, 2 following, and the current point are included, giving a total of five points being averaged for each point.)
const movingAverage = smoothish(daysPerMonth, { algorithm: 'movingAverage', falloff: 'step' })
// --> [ 30.0, 30.0, 30.2, 30.0, 30.6, 30.6, 30.6, 30.6, 30.6, 30.6, 30.5, 30.7 ]
Note that this is a centered (not lagging) moving average.
Also note that it produces as many output points as there are input points. That means that there is special handling of the boundaries. So in the above example 5 points are averaged for each interior point, but only 3 points are averaged at the end points.
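The boundary handling described above can be sketched in a few lines of plain JavaScript. The function centeredMovingAverage below is a hypothetical stand-in written for illustration, not the library's implementation: it assumes the window is simply clipped at the array ends and that missing values (undefined or null) are skipped.

```javascript
// Hypothetical stand-in, not the smoothish implementation.
// Centered moving average with an equally weighted window of
// up to 2 * radius + 1 points, clipped at the array boundaries.
function centeredMovingAverage (data, radius = 2) {
  const isPresent = v => typeof v === 'number' && Number.isFinite(v)
  return data.map((_, i) => {
    const lo = Math.max(0, i - radius)
    const hi = Math.min(data.length - 1, i + radius)
    let sum = 0
    let count = 0
    for (let j = lo; j <= hi; j++) {
      if (isPresent(data[j])) {
        sum += data[j]
        count++
      }
    }
    // No usable neighbors at all: leave the point undefined.
    return count > 0 ? sum / count : undefined
  })
}

const daysPerMonth = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
const averaged = centeredMovingAverage(daysPerMonth, 2)
// averaged has 12 entries; the first averages only 31, 28 and 31,
// while interior entries such as index 2 average five values.
```

Clipping the window rather than padding the data keeps the output the same length as the input, at the cost of averaging fewer points near the ends.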
If you really want to match a standard moving average exactly, you would need to lop off radius points from each end of the result:
const strictMovingAverage = movingAverage.slice(2, -2)
// --> [ 30.2, 30.0, 30.6, 30.6, 30.6, 30.6, 30.6, 30.6 ]
Note that in many cases the default algorithm: 'leastSquares' gives better results than algorithm: 'movingAverage'. For example, see how the smoothing of the incomplete linear data below is worse than the straight line produced by the default algorithm above.
const movingAverageLinear = smoothish(linear, { algorithm: 'movingAverage', falloff: 'step' })
// --> [ undefined, 250, 333, 333, 500, 667, 725, 800, 800 ]
More Details
See also API docs.
The tests and their snapshots also contain examples.
Legal
Copyright (c) 2020 Eamonn O'Brien-Strain All rights reserved. This program and the accompanying materials are made available under the terms of the Eclipse Public License v1.0 which accompanies this distribution, and is available at http://www.eclipse.org/legal/epl-v10.html
This is a purely personal project, not a project of my employer.