Reasonably fast running quantiles with NaN handling
result = running_quantile(v, p, w, nan_mode=SkipNaNs())
computes the running p-th quantile of v with window w, where w is an odd window length or a range of offsets.
Specifically,
- if w is an AbstractUnitRange, result[i] is the p-th quantile of v[(i .+ w) ∩ eachindex(v)], where NaNs are handled according to nan_mode:
  - nan_mode==SkipNaNs(): NaN values are ignored; the quantile is computed over the non-NaN values
  - nan_mode==PropagateNaNs(): the result is NaN whenever the input window contains NaN
  - nan_mode==ErrOnNaN(): an error is raised if at least one input window contains NaN
- if w is an odd integer, a centered window of length w is used, namely -w÷2:w÷2
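The window and NaN semantics described above can be illustrated with a naive reference sketch. This is not the package's implementation: naive_running_quantile is a hypothetical name, the symbols :skip, :propagate and :error are stand-ins for the package's NaN modes, and the quantile is assumed to follow the default interpolation of Statistics.quantile. It recomputes every window from scratch, so it is only useful for understanding or cross-checking results on small inputs.

```julia
using Statistics  # standard library: provides `quantile`

# Naive reference (hypothetical, for illustration only): result[i] is the
# p-th quantile of v[(i .+ w) ∩ eachindex(v)].
# `mode` is a stand-in for the package's NaN modes:
#   :skip (ignore NaNs), :propagate (NaN if any NaN), :error (throw on NaN).
function naive_running_quantile(v::AbstractVector{<:Real}, p::Real,
                                w::AbstractUnitRange, mode::Symbol=:skip)
    mode in (:skip, :propagate, :error) ||
        throw(ArgumentError("unknown mode $mode"))
    result = similar(v, Float64)
    for i in eachindex(v)
        # Clip the offset window to the valid indices of v.
        idx = intersect(i .+ w, eachindex(v))
        window = v[idx]
        has_nan = any(isnan, window)
        if has_nan && mode == :error
            throw(ArgumentError("window around index $i contains NaN"))
        elseif has_nan && mode == :propagate
            result[i] = NaN
        else
            vals = filter(!isnan, window)
            # A window without any non-NaN value yields NaN.
            result[i] = isempty(vals) ? NaN : quantile(vals, p)
        end
    end
    return result
end

v = [1:3; fill(NaN, 3); 1:5]
centered = naive_running_quantile(v, 0.5, -1:1)           # centered window of length 3
shifted  = naive_running_quantile(v, 0.5, -3:5)           # non-centered window
strict   = naive_running_quantile(v, 0.5, -1:1, :propagate)
```

Near the edges the intersection with eachindex(v) simply shortens the window, so the first and last results are computed over fewer values.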
running_median(v, w, nan_mode=SkipNaNs())
computes the running median, i.e. 1/2-th quantile, as above.
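As a small illustration of the two conventions used here, the sketch below shows how an odd window length maps to centered offsets, and that the median of a window equals its 1/2-th quantile. The helper centered_offsets is hypothetical and not part of the package API; the comparison assumes the default interpolation of Statistics.quantile.

```julia
using Statistics  # standard library: provides `median` and `quantile`

# Hypothetical helper (not part of the package API): offsets of a centered
# window of odd length w, i.e. -w÷2:w÷2.
function centered_offsets(w::Integer)
    isodd(w) || throw(ArgumentError("window length must be odd, got $w"))
    return -(w ÷ 2):(w ÷ 2)
end

offsets = centered_offsets(5)   # five offsets, from -2 to 2

# The median of a window coincides with its 1/2-th quantile.
window = [3.0, 1.0, 4.0, 2.0]
same = median(window) == quantile(window, 0.5)
```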
These two packages also implement running quantiles/medians, but do not handle NaNs (output is garbage when NaNs are present):
- SortFilters.jl is faster for small window sizes.
- FastRunningMedian.jl is faster for all window sizes, but only supports the median rather than arbitrary quantiles. It also offers more options for handling edges.
These packages handle the edges and the correspondence of input to output indices differently; please refer to their respective documentation for details.
The most versatile alternative, in terms of options for edge padding and handling of NaN values, is probably ImageFiltering.mapwindow. But it is not specialized for quantiles, and is therefore a much slower option.
Benchmarks for running median on a random vector of length 100_000:
Shaded areas indicate standard deviation. The input vector has no NaNs. When NaNs are present, this package generally runs faster, with run time roughly proportional to the number of non-NaN values (the other two packages do not handle NaN values correctly).
julia> v = [1:3; fill(NaN,3); 1:5]
11-element Vector{Float64}:
1.0
2.0
3.0
NaN
NaN
NaN
1.0
2.0
3.0
4.0
5.0
julia> running_median(v, 3)
11-element Vector{Float64}:
1.5
2.0
2.5
3.0
NaN
1.0
1.5
2.0
3.0
4.0
4.5
julia> running_median(v, 3, PropagateNaNs())
11-element Vector{Float64}:
1.5
2.0
NaN
NaN
NaN
NaN
NaN
2.0
3.0
4.0
4.5
julia> running_median(v, -3:5) # specifying a non-centered window
11-element Vector{Float64}:
2.0
1.5
2.0
2.0
2.5
3.0
3.0
3.0
3.0
3.0
3.5