Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Nullables as collections #16961

Merged
merged 10 commits into from
Dec 29, 2016
Merged
17 changes: 15 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,9 @@ This section lists changes that do not have deprecation warnings.

* `broadcast` now treats `Ref` (except for `Ptr`) arguments as 0-dimensional
arrays ([#18965]).
* `broadcast` now handles missing data (`Nullable`s) allowing operations to
be lifted over `Nullable`s, as if the `Nullable` were like an array with
zero or one element. ([#16961]).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a breaking change or just a new feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a breaking change, because Nullables that were treated like scalars in some cases before are now treated like collections. To keep Nullables like scalars you need to use the Ref(x) or (x,) trick now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe say that here to point out what about it is breaking?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out that v0.5 did not support treating nullables as scalars (treating arbitrary things as scalars itself was a 0.6 addition). But there is another thing that would break; get.(xs) will now try to descend into the Nullable instead of operating on the Nullable itself. I added that as an example.


* The runtime now enforces when new method definitions can take effect ([#17057]).
The flip-side of this is that new method definitions should now reliably actually
Expand Down Expand Up @@ -109,6 +112,10 @@ Library improvements

* Additional methods for `ones` and `zeros` functions to support the same signature as the `similar` function ([#19635]).

* Methods for `map` and `filter` with `Nullable` arguments have been
implemented; the semantics are as if the `Nullable` were a container with
zero or one elements ([#16961]).

Compiler/Runtime improvements
-----------------------------

Expand Down Expand Up @@ -620,6 +627,7 @@ Language tooling improvements
calling C++ code from Julia.

<!--- generated by NEWS-update.jl: -->
[#265]: https://github.com/JuliaLang/julia/issues/265
[#550]: https://github.com/JuliaLang/julia/issues/550
[#964]: https://github.com/JuliaLang/julia/issues/964
[#1090]: https://github.com/JuliaLang/julia/issues/1090
Expand Down Expand Up @@ -731,10 +739,12 @@ Language tooling improvements
[#16731]: https://github.com/JuliaLang/julia/issues/16731
[#16854]: https://github.com/JuliaLang/julia/issues/16854
[#16953]: https://github.com/JuliaLang/julia/issues/16953
[#16961]: https://github.com/JuliaLang/julia/issues/16961
[#16972]: https://github.com/JuliaLang/julia/issues/16972
[#16986]: https://github.com/JuliaLang/julia/issues/16986
[#17033]: https://github.com/JuliaLang/julia/issues/17033
[#17037]: https://github.com/JuliaLang/julia/issues/17037
[#17057]: https://github.com/JuliaLang/julia/issues/17057
[#17075]: https://github.com/JuliaLang/julia/issues/17075
[#17132]: https://github.com/JuliaLang/julia/issues/17132
[#17261]: https://github.com/JuliaLang/julia/issues/17261
Expand All @@ -748,17 +758,17 @@ Language tooling improvements
[#17510]: https://github.com/JuliaLang/julia/issues/17510
[#17546]: https://github.com/JuliaLang/julia/issues/17546
[#17599]: https://github.com/JuliaLang/julia/issues/17599
[#17623]: https://github.com/JuliaLang/julia/issues/17623
[#17668]: https://github.com/JuliaLang/julia/issues/17668
[#17758]: https://github.com/JuliaLang/julia/issues/17758
[#17785]: https://github.com/JuliaLang/julia/issues/17785
[#18330]: https://github.com/JuliaLang/julia/issues/18330
[#18339]: https://github.com/JuliaLang/julia/issues/18339
[#18346]: https://github.com/JuliaLang/julia/issues/18346
[#18442]: https://github.com/JuliaLang/julia/issues/18442
[#18473]: https://github.com/JuliaLang/julia/issues/18473
[#18628]: https://github.com/JuliaLang/julia/issues/18628
[#18644]: https://github.com/JuliaLang/julia/issues/18644
[#18690]: https://github.com/JuliaLang/julia/issues/18690
[#18628]: https://github.com/JuliaLang/julia/issues/18628
[#18839]: https://github.com/JuliaLang/julia/issues/18839
[#18931]: https://github.com/JuliaLang/julia/issues/18931
[#18965]: https://github.com/JuliaLang/julia/issues/18965
Expand All @@ -768,3 +778,6 @@ Language tooling improvements
[#19288]: https://github.com/JuliaLang/julia/issues/19288
[#19305]: https://github.com/JuliaLang/julia/issues/19305
[#19469]: https://github.com/JuliaLang/julia/issues/19469
[#19543]: https://github.com/JuliaLang/julia/issues/19543
[#19598]: https://github.com/JuliaLang/julia/issues/19598
[#19635]: https://github.com/JuliaLang/julia/issues/19635
124 changes: 106 additions & 18 deletions base/broadcast.jl
Original file line number Diff line number Diff line change
Expand Up @@ -4,17 +4,28 @@ module Broadcast

using Base.Cartesian
using Base: promote_eltype_op, linearindices, tail, OneTo, to_shape,
_msk_end, unsafe_bitgetindex, bitcache_chunks, bitcache_size, dumpbitcache
_msk_end, unsafe_bitgetindex, bitcache_chunks, bitcache_size, dumpbitcache,
nullable_returntype, null_safe_eltype_op, hasvalue, is_nullable_array
import Base: broadcast, broadcast!
export bitbroadcast, dotview
export broadcast_getindex, broadcast_setindex!

## Broadcasting utilities ##

broadcast_array_type() = Array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a more general mechanism for this? Cc: @pabloferz @martinholters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used as a marker to select whether to recurse or otherwise. Perhaps that's best done with a different trait.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new subtyping algorithm gets merged before, I believe you could remove this and have

@inline broadcast(f, As::AbstractArray...) = broadcast_c(f, Array, As...)
@inline broadcast(f, As::A...) where A<:AbstractArray{E} where E<: Nullable = broadcast_c(f, Array{Nullable}, As...)
containertype{T<:AbstractArray}(::Type{T}) = Array
containertype{E<:Nullable,T<:AbstractArray{E}}(::Type{T}) = Array{Nullable}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a more generally useful design indeed. It's likely that these type system improvements will be merged before or at the same time as this PR, since both need to be in before the feature freeze. Though for now let's keep the custom version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to avoid changing this for now because the subtyping algorithm will likely provide a better way to do it.

broadcast_array_type(A, As...) =
if is_nullable_array(A) || broadcast_array_type(As...) === Array{Nullable}
Array{Nullable}
else
Array
end

# fallbacks for some special cases
broadcast(f) = f()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is defined in sysimg.jl now and isn't needed here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, must have missed that in the rebase.

@inline broadcast(f, x::Number...) = f(x...)
@inline broadcast{N}(f, t::NTuple{N}, ts::Vararg{NTuple{N}}) = map(f, t, ts...)
@inline broadcast(f, As::AbstractArray...) = broadcast_c(f, Array, As...)
@inline broadcast(f, As::AbstractArray...) =
broadcast_c(f, broadcast_array_type(As...), As...)

# special cases for "X .= ..." (broadcast!) assignments
broadcast!(::typeof(identity), X::AbstractArray, x::Number) = fill!(X, x)
Expand All @@ -30,7 +41,9 @@ containertype(::Type) = Any
containertype{T<:Ptr}(::Type{T}) = Any
containertype{T<:Tuple}(::Type{T}) = Tuple
containertype{T<:Ref}(::Type{T}) = Array
containertype{T<:AbstractArray}(::Type{T}) = Array
containertype{T<:AbstractArray}(::Type{T}) =
is_nullable_array(T) ? Array{Nullable} : Array
containertype{T<:Nullable}(::Type{T}) = Nullable
containertype(ct1, ct2) = promote_containertype(containertype(ct1), containertype(ct2))
@inline containertype(ct1, ct2, cts...) = promote_containertype(containertype(ct1), containertype(ct2, cts...))

Expand All @@ -39,16 +52,26 @@ promote_containertype(::Type{Array}, ct) = Array
promote_containertype(ct, ::Type{Array}) = Array
promote_containertype(::Type{Tuple}, ::Type{Any}) = Tuple
promote_containertype(::Type{Any}, ::Type{Tuple}) = Tuple
promote_containertype(::Type{Any}, ::Type{Nullable}) = Nullable
promote_containertype(::Type{Nullable}, ::Type{Any}) = Nullable
promote_containertype(::Type{Nullable}, ::Type{Array}) = Array{Nullable}
promote_containertype(::Type{Array}, ::Type{Nullable}) = Array{Nullable}
promote_containertype(::Type{Array{Nullable}}, ::Type{Array{Nullable}}) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this one covered by the ::Type{T} method below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to avoid an ambiguity.

Array{Nullable}
promote_containertype(::Type{Array{Nullable}}, ::Type{Array}) = Array{Nullable}
promote_containertype(::Type{Array}, ::Type{Array{Nullable}}) = Array{Nullable}
promote_containertype(::Type{Array{Nullable}}, ct) = Array{Nullable}
promote_containertype(ct, ::Type{Array{Nullable}}) = Array{Nullable}
promote_containertype{T}(::Type{T}, ::Type{T}) = T

## Calculate the broadcast indices of the arguments, or error if incompatible
# array inputs
broadcast_indices() = ()
broadcast_indices(A) = broadcast_indices(containertype(A), A)
broadcast_indices(::Type{Any}, A) = ()
broadcast_indices(::Union{Type{Any}, Type{Nullable}}, A) = ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, Nullable is treated as any scalar here. So why do we need a specific method? Same below for Array{Nullable} and _broadcast_getindex.

Copy link
Contributor Author

@TotalVerb TotalVerb Dec 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type information is computed assuming Nullable is a scalar, but that information is not fully propagated through. Perhaps it should be?

Copy link
Member

@nalimilan nalimilan Dec 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this means, I'm not really familiar with that code. @martinholters @pabloferz?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that nullable is only treated as a scalar as a special case.

broadcast_indices(::Type{Tuple}, A) = (OneTo(length(A)),)
broadcast_indices(::Type{Array}, A) = indices(A)
broadcast_indices(::Type{Array}, A::Ref) = ()
broadcast_indices{T<:Array}(::Type{T}, A) = indices(A)
@inline broadcast_indices(A, B...) = broadcast_shape((), broadcast_indices(A), map(broadcast_indices, B)...)
# shape (i.e., tuple-of-indices) inputs
broadcast_shape(shape::Tuple) = shape
Expand Down Expand Up @@ -123,6 +146,8 @@ end
Base.@propagate_inbounds _broadcast_getindex(A, I) = _broadcast_getindex(containertype(A), A, I)
Base.@propagate_inbounds _broadcast_getindex(::Type{Array}, A::Ref, I) = A[]
Base.@propagate_inbounds _broadcast_getindex(::Type{Any}, A, I) = A
Base.@propagate_inbounds _broadcast_getindex(::Union{Type{Any},
Type{Nullable}}, A, I) = A
Base.@propagate_inbounds _broadcast_getindex(::Any, A, I) = A[I]

## Broadcasting core
Expand Down Expand Up @@ -272,19 +297,28 @@ end
ftype(f, A) = typeof(f)
ftype(f, A...) = typeof(a -> f(a...))
ftype(T::Type, A...) = Type{T}
typestuple(a) = (Base.@_pure_meta; Tuple{eltype(a)})
typestuple(T::Type) = (Base.@_pure_meta; Tuple{Type{T}})
typestuple(a, b...) = (Base.@_pure_meta; Tuple{typestuple(a).types..., typestuple(b...).types...})
ziptype(A) = typestuple(A)
ziptype(A, B) = (Base.@_pure_meta; Iterators.Zip2{typestuple(A), typestuple(B)})
@inline ziptype(A, B, C, D...) = Iterators.Zip{typestuple(A), ziptype(B, C, D...)}

_broadcast_type(f, T::Type, As...) = Base._return_type(f, typestuple(T, As...))
_broadcast_type(f, A, Bs...) = Base._default_eltype(Base.Generator{ziptype(A, Bs...), ftype(f, A, Bs...)})
# nullables need to be treated like scalars sometimes and like containers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The occurrence of "sometimes" here definitely raises a red flag.

# other times, so there are two variants of typestuple.
immutable NullableIsScalar end
immutable NullableIsContainer end

typestuple(::Any, a) = (Base.@_pure_meta; Tuple{eltype(a)})
typestuple(::NullableIsScalar, a::Nullable) = (Base.@_pure_meta; Tuple{typeof(a)})
Copy link
Contributor

@pabloferz pabloferz Dec 27, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this solution much better 👍, but I don't think you need to introduce a new type here, you could just use Array and Any. That is, we could have typestuple(::Type{Array}, a::Nullable) = (Base.@_pure_meta; Tuple{typeof(a)}) and use _broadcast_type(Array, f, A, Bs...) or _broadcast_type(Any, f, A, Bs...) were appropriate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

typestuple(::Any, T::Type) = (Base.@_pure_meta; Tuple{Type{T}})
typestuple(g, a, b...) = (Base.@_pure_meta; Tuple{typestuple(g, a).types..., typestuple(g, b...).types...})

# these functions take the variant of typestuple to be used as first argument
ziptype(g, A) = typestuple(g, A)
ziptype(g, A, B) = (Base.@_pure_meta; Iterators.Zip2{typestuple(g, A), typestuple(g, B)})
@inline ziptype(g, A, B, C, D...) = Iterators.Zip{typestuple(g, A), ziptype(g, B, C, D...)}

_broadcast_type(g, f, T::Type, As...) = Base._return_type(f, typestuple(g, T, As...))
_broadcast_type(g, f, A, Bs...) = Base._default_eltype(Base.Generator{ziptype(g, A, Bs...), ftype(f, A, Bs...)})

# broadcast methods that dispatch on the type of the final container
@inline function broadcast_c(f, ::Type{Array}, A, Bs...)
T = _broadcast_type(f, A, Bs...)
T = _broadcast_type(NullableIsScalar(), f, A, Bs...)
shape = broadcast_indices(A, Bs...)
iter = CartesianRange(shape)
if isleaftype(T)
Expand All @@ -295,27 +329,59 @@ _broadcast_type(f, A, Bs...) = Base._default_eltype(Base.Generator{ziptype(A, Bs
end
return broadcast_t(f, Any, shape, iter, A, Bs...)
end
@inline function broadcast_c(f, ::Type{Array{Nullable}}, A, Bs...)
@inline rec(x) = broadcast(f, x)
@inline rec(x, y) = broadcast(f, x, y)
@inline rec(x, y, z) = broadcast(f, x, y, z)
@inline rec(xs...) = broadcast(f, xs...)
broadcast_c(rec, Array, A, Bs...)
end
function broadcast_c(f, ::Type{Tuple}, As...)
shape = broadcast_indices(As...)
n = length(shape[1])
return ntuple(k->f((_broadcast_getindex(A, k) for A in As)...), n)
end
@inline function broadcast_c(f, ::Type{Nullable}, a...)
nonnull = all(hasvalue, a)
S = _broadcast_type(NullableIsContainer(), f, a...)
if isleaftype(S) && null_safe_eltype_op(f, a...)
Nullable{S}(f(map(unsafe_get, a)...), nonnull)
else
if nonnull
Nullable(f(map(unsafe_get, a)...))
else
Nullable{nullable_returntype(S)}()
end
end
end
@inline broadcast_c(f, ::Type{Any}, a...) = f(a...)

"""
broadcast(f, As...)

Broadcasts the arrays, tuples, `Ref` and/or scalars `As` to a container of the
appropriate type and dimensions. In this context, anything that is not a
subtype of `AbstractArray`, `Ref` (except for `Ptr`s) or `Tuple` is considered
a scalar. The resulting container is established by the following rules:
Broadcasts the arrays, tuples, `Ref`, nullables, and/or scalars `As` to a
container of the appropriate type and dimensions. In this context, anything
that is not a subtype of `AbstractArray`, `Ref` (except for `Ptr`s) or `Tuple`,
or `Nullable` is considered a scalar. The resulting container is established by
the following rules:

- If all the arguments are scalars, it returns a scalar.
- If the arguments are tuples and zero or more scalars, it returns a tuple.
- If there is at least an array or a `Ref` in the arguments, it returns an array
(and treats any `Ref` as a 0-dimensional array of its contents and any tuple
as a 1-dimensional array) expanding singleton dimensions.

The following additional rules apply to `Nullable` arguments:

- If there is at least a `Nullable`, and all the arguments are scalars or
`Nullable`, it returns a `Nullable`.
- If there is at least an array or a `Ref` with `Nullable` entries, or there
is at least an array or a `Ref` (perhaps with scalar entries instead of
`Nullable` entries) and a nullable, then the result is an array of
`Nullable` entries.
- If there is a tuple and a nullable, the result is an error, as this case is
not currently supported.

A special syntax exists for broadcasting: `f.(args...)` is equivalent to
`broadcast(f, args...)`, and nested `f.(g.(args...))` calls are fused into a
single broadcast loop.
Expand Down Expand Up @@ -372,6 +438,28 @@ julia> string.(("one","two","three","four"), ": ", 1:4)
"two: 2"
"three: 3"
"four: 4"

julia> Nullable("X") .* "Y"
Nullable{String}("XY")

julia> broadcast(/, 1.0, Nullable(2.0))
Nullable{Float64}(0.5)

julia> [Nullable(1), Nullable(2), Nullable()] .* 3
3-element Array{Nullable{Int64},1}:
3
6
#NULL

julia> [1+im, 2+2im, 3+3im] ./ Nullable{Int}()
3-element Array{Nullable{Complex{Float64}},1}:
#NULL
#NULL
#NULL

julia> Ref(7) .+ Nullable(3)
0-dimensional Array{Nullable{Int64},0}:
10
```
"""
@inline broadcast(f, A, Bs...) = broadcast_c(f, containertype(A, Bs...), A, Bs...)
Expand Down
Loading