-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid overflow when working with powt
#8
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced that checking on every operation is the right approach here. I would prefer ensuring that every constructed FD
is of a type that can represent 1
, meaning T(10)^f
would never overflow.
src/FixedPointDecimals.jl
Outdated
quotient, remainder = divrem(x.i, y.i) | ||
reinterpret(FD{T, f}, quotient * powt + round(T, remainder / y.i * powt)) | ||
reinterpret(FD{T, f}, T(widemul(quotient, powt) + round(T, remainder // y.i * powt))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of Rational
here is interesting, but I think more expensive than it needs to be. It looks like we really only need to generalize the _round_to_even
function to accept any number (such as y.i * powt
) instead of only a power of 10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I suppose since we are using widemul
anyway, it could make sense to use the wider rational type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up switching to the Rational
to fix a precision problem. I'll verify that this is necessary as I may have just been running into this issue at the time: JuliaLang/julia#19779
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, this should be precise where the original floating point isn't. However, see my PR #9 which achieves the same result but with no possibility of intermediate overflow.
powt = div(powt, g) | ||
den = div(den, g) | ||
reinterpret(FD{T, f}, powt * num) / FD{T, f}(den) | ||
powt = coefficient(FD{T, f}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain this change? It seems to me that x * powt
is susceptible to overflow, and the original code was better at avoiding that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed since some FixedDecimal numbers with large values of f
may not be able to represent the den
. The x * powt
seems to be safe from silently overflowing as Rationals will either raise an OverflowError
or promote correctly:
julia> (typemax(Int) // 1) * 2
ERROR: OverflowError()
in *(::Rational{Int64}, ::Rational{Int64}) at ./rational.jl:208
in *(::Rational{Int64}, ::Int64) at ./promotion.jl:191
julia> (typemax(Int) // 1) * BigInt(2)
18446744073709551614//1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep the old code for now, replacing / FD{T, f}(den)
with / den
which avoids the issue with representation of den
? It won't silently overflow, but this new code looks nevertheless susceptible to OverflowError
in more cases than before. We should revisit it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I see the difference now: the old code handled inexact conversions, whereas this one will throw InexactError
. I'm OK with the change to throw InexactError
as it does match what happens if you convert a fixed-decimal with higher precision to one with lower precision; we should preserve the old code as a round
method.
src/FixedPointDecimals.jl
Outdated
The highest value of `x` which does not result in an overflow when evaluating `T(10)^x`. | ||
""" | ||
function max_exp10{T <: Integer}(::Type{T}) | ||
length(digits(typemax(T))) - 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allocates memory, which we should avoid. There are better way to compute this without overflow.
However I would rather, personally, ensure that 1 is representable in every FD
type. That means requiring typemax(IntType) >= 10^f
for every constructible FD{IntType, f}
type, which would not require excessive overflow checks which I would like to avoid in Currencies. Is there a use case for FixedDecimal
types that cannot represent 1
? (Note that this change almost makes all operations on those types InexactError
s anyway, so I lean toward simply disallowing their construction.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that making sure 1 can always be represented in every FD
type is reasonable. I personally, don't have a use case for this but I my original thought was that FixedDecimal
s purpose is for representing decimal numbers without losing precision so it made sense to support a decimal at any arbitrary precision. Your proposal also implies that we should not allow any value of f < 0
.
As for "excessive overflow checks" we're still going to need to make use of widemul
in quite a few places as we can still run into overflow with large integers:
julia> T = Int; powt = T(10)^2
100
julia> typemax(T) * powt
-100
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
widemul
should generally (except in the BigInt
case) be much cheaper than the overflow checks for exponentiation as implemented here, so I'm OK with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll switch the coefficient
function to just be T(10)^f
and add the appropriate restrictions when constructing a FixedDecimal instance.
Codecov Report
@@ Coverage Diff @@
## master #8 +/- ##
==========================================
- Coverage 97.24% 96.85% -0.39%
==========================================
Files 1 1
Lines 145 159 +14
==========================================
+ Hits 141 154 +13
- Misses 4 5 +1
Continue to review full report at Codecov.
|
2 similar comments
@TotalVerb the main driver behind these changes is making sure that |
Here's my stance:
|
To provide some more context, here's a quick benchmark: julia> X = collect(1:100);
julia> Y = collect(1:100);
julia> @btime $X .+ $Y;
56.707 ns (1 allocation: 896 bytes)
julia> @btime Base.Checked.checked_add.($X, $Y);
157.888 ns (1 allocation: 896 bytes)
julia> @btime widemul.($X, $Y);
163.220 ns (1 allocation: 1.77 KiB)
julia> @btime $X .* $Y;
68.801 ns (1 allocation: 896 bytes)
julia> @btime div.($X, $Y);
800.957 ns (1 allocation: 896 bytes)
julia> @btime divrem.($X, $Y);
1.446 μs (1 allocation: 1.77 KiB)
julia> @btime 10 .^ $X;
528.085 ns (1 allocation: 896 bytes)
julia> @btime digits.($X);
5.047 μs (101 allocations: 10.27 KiB) some observations:
|
I'll note that the |
@omus Ah, yeah, that's fine then. We should replace this generated function with a |
When the number of decimal places `f` produces a coefficient `10^f` which is not representable in the container type `T` we can run into overflow issues. Additionally, if the FixedDecimal value is also large we can also run into overflow when working with a representable coefficient. The new function `coefficient` provides us with a representable power of 10 which uses a type which will never overflow when multiplied by the value.
Using an `@eval` in the tests to make it clear what `T` and `f` caused the failure.
Actually, the original code should always work given that `isinteger` doesn't have overflow problems. I updated the code to match the rest of the recent changes but unfortunately it appears impossible to test.
I ended up using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove the BigInt
special case. Otherwise, LGTM (after a rebase). Thanks!
src/FixedPointDecimals.jl
Outdated
|
||
The highest value of `x` which does not result in an overflow when evaluating `T(10)^x`. | ||
""" | ||
function max_exp10{T <: Integer}(::Type{T}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do a branch here so that
if !applicable(typemax, T)
return typemax(Int)
end
which would be a little nicer than the current approach in that it will handle arbitrary-precision integers that are not BigInt
properly (e.g. if someone had their own custom implementation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would rather do:
if !applicable(typemax, T)
throw(DomainError())
end
or alternatively
if !applicable(typemax, T)
return -1
end
Returning a large exponent which is guaranteed to overflow doesn't seem like the right answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But typemax(Int)
won't "overflow" for BigInt
or other types that don't define typemax
. It would just simplify the implementation of the constructor so as not to require special-casing BigInt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that T(10)^x
where T is BigInt and x = typemax(Int)
does actually overflow:
julia> BigInt(10)^typemax(Int)
gmp: overflow in mpz type
signal (6): Abort trap: 6
while loading no file, in expression starting on line 0
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 3959986 (Pool: 3958822; Big: 1164); GC: 25
Abort trap: 6
I agree with your point that the FixedDecimal constructor shouldn't have the BigInt special case so I'll try to generalize the code in a different way.
I'll push the changes and let you review before I rebase. |
Using -1 to indicate that the given type should not overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks!
I'm going to do a merge instead of our typical rebase. |
Several of the
FixedDecimal{T,f}
functions usedT(10)^f
to compute10ᶠ
. This calculation can easily run into overflow especially if the storage typeT
is small and number of places to storef
is large.For example, given
T = Int8
which has atypemax(Int8) = 127
andf = 3
if we attempt to computepowt = T(10)^f
we end up with-24
instead of1000
. Using thepowt
value during calculations can cause the resulting FixedDecimal to be incorrect which is very bad for library whose purpose is to have more accurate calculations than floating-point.I've revised the code to make use of a new function
coefficient(::Type{FD{T,f}})
which calculatespowt
without ever overflowing. Additionally, since none of these scenarios were previously tested I added several new testsets to ensure the overflow issues are not reintroduced. Finally, these changes found some additional problems regarding FixedDecimal division which no longer uses an intermediate floating-point number.