This repository has been archived by the owner on May 4, 2019. It is now read-only.

Mapreduce performance #35

Open
@davidagold

Description

Here are my results from running the profiling methods included in https://github.com/johnmyleswhite/NullableArrays.jl/blob/master/perf/mapreduce.jl:

julia> profile_all()
Method: mapreduce(f, op, A)
  for Array{Float64}:           246.070 milliseconds (15008 k allocations: 229 MB, 9.91% gc time)
  for NullableArray{Float64}:   417.401 milliseconds (15008 k allocations: 458 MB, 15.67% gc time)
  for DataArray{Float64}:       278.743 milliseconds (15008 k allocations: 229 MB, 9.67% gc time)
Method: reduce(op, A)
  for Array{Float64}:             2.629 milliseconds (1 allocation: 16 bytes)
  for NullableArray{Float64}:     5.286 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         2.851 milliseconds (2 allocations: 96 bytes)
Method: sum(A)
  for Array{Float64}:             2.810 milliseconds
  for NullableArray{Float64}:     6.432 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         5.209 milliseconds (2 allocations: 96 bytes)
Method: sum(f, A)
  for Array{Float64}:           252.220 milliseconds (15008 k allocations: 229 MB, 10.83% gc time)
  for NullableArray{Float64}:   424.200 milliseconds (15008 k allocations: 458 MB, 15.71% gc time)
  for DataArray{Float64}:       269.271 milliseconds (15008 k allocations: 229 MB, 9.69% gc time)
Method: prod(A)
  for Array{Float64}:             8.409 milliseconds
  for NullableArray{Float64}:    23.911 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         8.767 milliseconds (2 allocations: 96 bytes)
Method: prod(f, A)
  for Array{Float64}:           255.930 milliseconds (15008 k allocations: 229 MB, 10.65% gc time)
  for NullableArray{Float64}:       410 milliseconds (15008 k allocations: 458 MB, 16.58% gc time)
  for DataArray{Float64}:       272.405 milliseconds (15008 k allocations: 229 MB, 9.38% gc time)
Method: minimum(A)
  for Array{Float64}:             3.702 milliseconds
  for NullableArray{Float64}:    33.607 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         6.642 milliseconds (2 allocations: 96 bytes)
Method: minimum(f, A)
  for Array{Float64}:           238.662 milliseconds (10000 k allocations: 153 MB, 7.49% gc time)
  for NullableArray{Float64}:   517.422 milliseconds (15000 k allocations: 381 MB, 13.49% gc time)
  for DataArray{Float64}:       254.674 milliseconds (10000 k allocations: 153 MB, 7.48% gc time)
Method: maximum(A)
  for Array{Float64}:             7.461 milliseconds
  for NullableArray{Float64}:    24.961 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         6.270 milliseconds (2 allocations: 96 bytes)
Method: maximum(f, A)
  for Array{Float64}:           239.670 milliseconds (10000 k allocations: 153 MB, 8.19% gc time)
  for NullableArray{Float64}:   511.292 milliseconds (15000 k allocations: 381 MB, 14.05% gc time)
  for DataArray{Float64}:       245.280 milliseconds (10000 k allocations: 153 MB, 8.63% gc time)
Method: sumabs(A)
  for Array{Float64}:             2.756 milliseconds
  for NullableArray{Float64}:    28.615 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         2.654 milliseconds (2 allocations: 96 bytes)
Method: sumabs2(A)
  for Array{Float64}:             3.294 milliseconds
  for NullableArray{Float64}:    28.927 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         3.470 milliseconds (2 allocations: 96 bytes)

julia> profile_skip(true)
Comparison of skipnull/skipNA methods

f := IdFun(), op := AddFun()
mapreduce(f, op, X; skipnull/skipNA=true) (0 missing entries)
  for NullableArray{Float64}:     8.569 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         2.741 milliseconds (2 allocations: 96 bytes)

reduce(op, X; skipnull/skipNA=true) (0 missing entries)
  for NullableArray{Float64}:     9.563 milliseconds (3 allocations: 192 bytes)
  for DataArray{Float64}:         2.736 milliseconds (3 allocations: 176 bytes)

mapreduce(f, op, X; skipnull/skipNA=true) (~half missing entries)
  for NullableArray{Float64}:    52.518 milliseconds (2 allocations: 96 bytes)
  for DataArray{Float64}:        53.641 milliseconds (2 allocations: 96 bytes)

reduce(op, X; skipnull/skipNA=true) (~half missing entries)
  for NullableArray{Float64}:    54.764 milliseconds (3 allocations: 176 bytes)
  for DataArray{Float64}:        51.647 milliseconds (3 allocations: 176 bytes)

julia> profile_skip(false)
Comparison of skipnull/skipNA methods

f := IdFun(), op := AddFun()
mapreduce(f, op, X; skipnull/skipNA=false) (0 missing entries)
  for NullableArray{Float64}:     5.855 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:         2.911 milliseconds (2 allocations: 96 bytes)

reduce(op, X; skipnull/skipNA=false) (0 missing entries)
  for NullableArray{Float64}:     9.874 milliseconds (3 allocations: 192 bytes)
  for DataArray{Float64}:         5.123 milliseconds (3 allocations: 176 bytes)

mapreduce(f, op, X; skipnull/skipNA=false) (~half missing entries)
  for NullableArray{Float64}:     5.486 milliseconds (2 allocations: 112 bytes)
  for DataArray{Float64}:           753 nanoseconds  (1 allocation: 80 bytes)

reduce(op, X; skipnull/skipNA=false) (~half missing entries)
  for NullableArray{Float64}:     6.782 milliseconds (3 allocations: 192 bytes)
  for DataArray{Float64}:        50.259 milliseconds (3 allocations: 176 bytes)

julia> profile_skip_impl()
Comparison of internal skip methods:
mapreduce_impl_skipnull(f, op, X) (0 missing entries)
  for NullableArray{Float64}:     7.470 milliseconds (1 allocation: 16 bytes)
  for DataArray{Float64}:         3.644 milliseconds (1 allocation: 16 bytes)

mapreduce_impl_skipnull(f, op, X) (~half missing entries)
  for NullableArray{Float64}:    29.196 milliseconds (1 allocation: 16 bytes)
  for DataArray{Float64}:        51.622 milliseconds (1 allocation: 16 bytes)

Though there is more than for broadcast, there is still relatively little specialized code for mapreduce other than the specialized methods for skipping over null entries and the code that hooks them into the general mapreduce interface. A few noteworthy items:

  1. sumabs and sumabs2 are 10x faster than they were last week without my having touched them. I suspect this is due to improvements in Base Julia, possibly to do with codegen? Now I'm beginning to understand the importance of rigorously tracking performance against environmental variables, since it would have been interesting to see what caused the speedup.
  2. The internal skipnull method is nearly twice as fast for NullableArrays as it is for DataArrays when about half the entries are missing (29.196 ms vs. 51.622 ms), but that speedup is lost by the time we get to the exposed interface.
  3. Again, we see higher allocations in general for NullableArrays.
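
For readers unfamiliar with what a skip-null reduction does, here is a minimal sketch. It is not the package's `mapreduce_impl_skipnull`; it assumes a null-aware array can be modeled as a plain values vector plus a `Bool` mask, and the names `mapreduce_skip`, `vals`, and `mask` are hypothetical.

```julia
# Illustrative only: a values vector plus a Bool mask stands in for a
# NullableArray. `mapreduce_skip` is a hypothetical name, not package API.
function mapreduce_skip(f, op, vals, mask)
    length(vals) == length(mask) ||
        throw(DimensionMismatch("values and mask must have equal length"))
    n = length(vals)
    # Find the first non-null entry to seed the accumulator.
    i = 1
    while i <= n && mask[i]
        i += 1
    end
    i > n && throw(ArgumentError("no non-null entries to reduce"))
    acc = f(vals[i])
    # Fold in the remaining entries, skipping those flagged as null.
    for j in (i + 1):n
        if !mask[j]
            acc = op(acc, f(vals[j]))
        end
    end
    return acc
end

vals = [1.0, 2.0, 3.0, 4.0]
mask = [false, true, false, true]   # true marks a null entry
total = mapreduce_skip(abs2, +, vals, mask)
```

The loop branches on the mask for every element. That kind of per-element check could plausibly contribute to the slower "~half missing" timings above, though the numbers in this issue do not establish the cause.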
