Avoid overflow when working with `powt` #8

omus · 2017-05-23T20:29:33Z

Several of the FixedDecimal{T,f} functions used T(10)^f to compute 10ᶠ. This calculation can easily run into overflow especially if the storage type T is small and number of places to store f is large.

For example, given T = Int8 which has a typemax(Int8) = 127 and f = 3 if we attempt to compute powt = T(10)^f we end up with -24 instead of 1000. Using the powt value during calculations can cause the resulting FixedDecimal to be incorrect which is very bad for library whose purpose is to have more accurate calculations than floating-point.

I've revised the code to make use of a new function coefficient(::Type{FD{T,f}}) which calculates powt without ever overflowing. Additionally, since none of these scenarios were previously tested I added several new testsets to ensure the overflow issues are not reintroduced. Finally, these changes found some additional problems regarding FixedDecimal division which no longer uses an intermediate floating-point number.

TotalVerb

I'm not convinced that checking on every operation is the right approach here. I would prefer ensuring that every constructed FD is of a type that can represent 1, meaning T(10)^f would never overflow.

TotalVerb · 2017-05-23T20:35:49Z

src/FixedPointDecimals.jl

    quotient, remainder = divrem(x.i, y.i)
-    reinterpret(FD{T, f}, quotient * powt + round(T, remainder / y.i * powt))
+    reinterpret(FD{T, f}, T(widemul(quotient, powt) + round(T, remainder // y.i * powt)))


The use of Rational here is interesting, but I think more expensive than it needs to be. It looks like we really only need to generalize the _round_to_even function to accept any number (such as y.i * powt) instead of only a power of 10.

Well, I suppose since we are using widemul anyway, it could make sense to use the wider rational type.

I ended up switching to the Rational to fix a precision problem. I'll verify that this is necessary as I may have just been running into this issue at the time: JuliaLang/julia#19779

Right, this should be precise where the original floating point isn't. However, see my PR #9 which achieves the same result but with no possibility of intermediate overflow.

TotalVerb · 2017-05-23T20:38:08Z

src/FixedPointDecimals.jl

-    powt = div(powt, g)
-    den = div(den, g)
-    reinterpret(FD{T, f}, powt * num) / FD{T, f}(den)
+    powt = coefficient(FD{T, f})


Could you explain this change? It seems to me that x * powt is susceptible to overflow, and the original code was better at avoiding that.

I changed since some FixedDecimal numbers with large values of f may not be able to represent the den. The x * powt seems to be safe from silently overflowing as Rationals will either raise an OverflowError or promote correctly:

julia> (typemax(Int) // 1) * 2 ERROR: OverflowError() in *(::Rational{Int64}, ::Rational{Int64}) at ./rational.jl:208 in *(::Rational{Int64}, ::Int64) at ./promotion.jl:191 julia> (typemax(Int) // 1) * BigInt(2) 18446744073709551614//1

Can we keep the old code for now, replacing / FD{T, f}(den) with / den which avoids the issue with representation of den? It won't silently overflow, but this new code looks nevertheless susceptible to OverflowError in more cases than before. We should revisit it later.

Actually, I see the difference now: the old code handled inexact conversions, whereas this one will throw InexactError. I'm OK with the change to throw InexactError as it does match what happens if you convert a fixed-decimal with higher precision to one with lower precision; we should preserve the old code as a round method.

TotalVerb · 2017-05-23T20:44:29Z

src/FixedPointDecimals.jl

+The highest value of `x` which does not result in an overflow when evaluating `T(10)^x`.
+"""
+function max_exp10{T <: Integer}(::Type{T})
+    length(digits(typemax(T))) - 1


This allocates memory, which we should avoid. There are better way to compute this without overflow.

However I would rather, personally, ensure that 1 is representable in every FD type. That means requiring typemax(IntType) >= 10^f for every constructible FD{IntType, f} type, which would not require excessive overflow checks which I would like to avoid in Currencies. Is there a use case for FixedDecimal types that cannot represent 1? (Note that this change almost makes all operations on those types InexactErrors anyway, so I lean toward simply disallowing their construction.)

I agree that making sure 1 can always be represented in every FD type is reasonable. I personally, don't have a use case for this but I my original thought was that FixedDecimals purpose is for representing decimal numbers without losing precision so it made sense to support a decimal at any arbitrary precision. Your proposal also implies that we should not allow any value of f < 0.

As for "excessive overflow checks" we're still going to need to make use of widemul in quite a few places as we can still run into overflow with large integers:

julia> T = Int; powt = T(10)^2 100 julia> typemax(T) * powt -100

widemul should generally (except in the BigInt case) be much cheaper than the overflow checks for exponentiation as implemented here, so I'm OK with that.

Ok, I'll switch the coefficient function to just be T(10)^f and add the appropriate restrictions when constructing a FixedDecimal instance.

codecov-io · 2017-05-23T21:28:05Z

Codecov Report

Merging #8 into master will decrease coverage by 0.38%.
The diff coverage is 97.91%.

@@            Coverage Diff             @@
##           master       #8      +/-   ##
==========================================
- Coverage   97.24%   96.85%   -0.39%     
==========================================
  Files           1        1              
  Lines         145      159      +14     
==========================================
+ Hits          141      154      +13     
- Misses          4        5       +1

Impacted Files	Coverage Δ
src/FixedPointDecimals.jl	`96.85% <97.91%> (-0.39%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4dd5492...d927bdb. Read the comment docs.

coveralls · 2017-05-23T21:41:08Z

Coverage increased (+0.3%) to 96.694% when pulling dc48ad3 on cv/limits into 9627c91 on master.

coveralls · 2017-05-23T21:41:08Z

Coverage increased (+0.3%) to 96.694% when pulling dc48ad3 on cv/limits into 9627c91 on master.

coveralls · 2017-05-23T21:41:08Z

Coverage increased (+0.3%) to 96.694% when pulling dc48ad3 on cv/limits into 9627c91 on master.

omus · 2017-05-23T22:05:43Z

@TotalVerb the main driver behind these changes is making sure that FixedDecimal arithmetic doesn't result in silent overflow. Do you agree that the FixedPointDecimals package should be robust against issues like overflow or implicit loss of precision?

TotalVerb · 2017-05-23T22:10:31Z

Here's my stance:

Addition and subtraction of FixedDecimals is currently no more expensive than of integers, which is a property that Currencies does really want. So I would be against overflow checking here.
On the other hand, I'm fine with using widemul for multiplication/division because the cost of the integer division should dwarf the checks for overflow and the use of a bigger integer type, even Int128. But digits is substantially more expensive than everything, so that would be a no-go for overflow checks. Luckily it doesn't seem necessary.
Regarding implicit loss of precision, I agree in general for converting to FixedDecimal, though operations like multiplication and division do implicitly lose precision (through rounding), and I'm fine with that as long as we round correctly at all times.

TotalVerb · 2017-05-23T22:23:41Z

To provide some more context, here's a quick benchmark:

julia> X = collect(1:100);

julia> Y = collect(1:100);

julia> @btime $X .+ $Y;
  56.707 ns (1 allocation: 896 bytes)

julia> @btime Base.Checked.checked_add.($X, $Y);
  157.888 ns (1 allocation: 896 bytes)

julia> @btime widemul.($X, $Y);
  163.220 ns (1 allocation: 1.77 KiB)

julia> @btime $X .* $Y;
  68.801 ns (1 allocation: 896 bytes)

julia> @btime div.($X, $Y);
  800.957 ns (1 allocation: 896 bytes)

julia> @btime divrem.($X, $Y);
  1.446 μs (1 allocation: 1.77 KiB)

julia> @btime 10 .^ $X;
  528.085 ns (1 allocation: 896 bytes)

julia> @btime digits.($X);
  5.047 μs (101 allocations: 10.27 KiB)

some observations:

checked_add is almost three times slower than +, so in addition I would prefer to avoid it
widemul is almost three times slower than *, BUT in multiplication we're doing a divrem which is much worse, so I'm happy to use it (relatively speaking, it's not that expensive)
10 ^ x isn't cheap but not worse than integer division
digits is too expensive for most operations, so we should not use that

omus · 2017-05-23T23:05:07Z

digits is too expensive for most operations, so we should not use that

I'll note that the digits function is only being used within exp10(::Type{FD{T,f}}) which is a generated function and only be called once per distinct FD{T,f} type.

TotalVerb · 2017-05-23T23:06:29Z

@omus Ah, yeah, that's fine then. We should replace this generated function with a Base.@pure function if possible, as that should also be evaluated at compile time, with less overhead.

When the number of decimal places `f` produces a coefficient `10^f` which is not representable in the container type `T` we can run into overflow issues. Additionally, if the FixedDecimal value is also large we can also run into overflow when working with a representable coefficient. The new function `coefficient` provides us with a representable power of 10 which uses a type which will never overflow when multiplied by the value.

Using an `@eval` in the tests to make it clear what `T` and `f` caused the failure.

Actually, the original code should always work given that `isinteger` doesn't have overflow problems. I updated the code to match the rest of the recent changes but unfortunately it appears impossible to test.

omus · 2017-05-24T20:57:19Z

I ended up using max_exp10 in the FixedDecimal constructor since typemax(IntType) >= 10^f has overflow issues. I ended up improving the performance of max_exp10 which makes the performance comparable to T(10)^f with the exception of Int128 and UInt128 which widen to BigInts.

coveralls · 2017-05-24T21:27:35Z

Coverage decreased (-0.4%) to 96.855% when pulling d927bdb on cv/limits into 4dd5492 on master.

TotalVerb

I think we can remove the BigInt special case. Otherwise, LGTM (after a rebase). Thanks!

TotalVerb · 2017-05-29T20:57:07Z

src/FixedPointDecimals.jl

+
+The highest value of `x` which does not result in an overflow when evaluating `T(10)^x`.
+"""
+function max_exp10{T <: Integer}(::Type{T})


We can do a branch here so that

if !applicable(typemax, T) return typemax(Int) end

which would be a little nicer than the current approach in that it will handle arbitrary-precision integers that are not BigInt properly (e.g. if someone had their own custom implementation).

I think I would rather do:

if !applicable(typemax, T) throw(DomainError()) end

or alternatively

if !applicable(typemax, T) return -1 end

Returning a large exponent which is guaranteed to overflow doesn't seem like the right answer.

But typemax(Int) won't "overflow" for BigInt or other types that don't define typemax. It would just simplify the implementation of the constructor so as not to require special-casing BigInt.

What I mean is that T(10)^x where T is BigInt and x = typemax(Int) does actually overflow:

julia> BigInt(10)^typemax(Int) gmp: overflow in mpz type signal (6): Abort trap: 6 while loading no file, in expression starting on line 0 __pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line) Allocations: 3959986 (Pool: 3958822; Big: 1164); GC: 25 Abort trap: 6

I agree with your point that the FixedDecimal constructor shouldn't have the BigInt special case so I'll try to generalize the code in a different way.

omus · 2017-05-29T22:26:13Z

I think we can remove the BigInt special case. Otherwise, LGTM (after a rebase). Thanks!

I'll push the changes and let you review before I rebase.

Using -1 to indicate that the given type should not overflow.

TotalVerb

lgtm, thanks!

omus · 2017-05-31T14:51:32Z

I'm going to do a merge instead of our typical rebase.

omus requested a review from TotalVerb May 23, 2017 20:29

TotalVerb requested changes May 23, 2017

View reviewed changes

TotalVerb mentioned this pull request May 23, 2017

Fix overflow issues with division #9

Merged

omus added 18 commits May 24, 2017 15:48

Fix show displaying wrong value

f468058

Generalize some print tests

b3a06b8

Word wrap to 92-characters

97b3017

Perform widening outside of coefficient

8e7b4ec

Make sure coefficient doesn't overflow

031543c

Fix convert at the input limits

2633540

Using an `@eval` in the tests to make it clear what `T` and `f` caused the failure.

Fix trunc, floor, and ceil at the input limits

d15e2dc

Fix multiply at the input limits

cc62fc0

Fix division at the input limits

2aec095

Additional division changes

13f5612

Improve trunc, floor, ceil limit tests

3690d95

Fix trunc of FD at the input limits

18be9a7

Fix convert from Integer at the input limits

4856c57

Fix floor(::FD) at the input limits

19f0633

Fix ceil(::FD) at the input limits

21843e8

Fix round(::FD) at the input limits

dab3874

Revise rounding tests

fa45e36

omus added 11 commits May 24, 2017 15:49

Revise limit testsets

05137bc

Convert from Rational to FD must be exact

3d9332b

Fix convert(::Type{FD}, ::FD) at the input limits

83e0b42

Fix convert to Float at the input limits

f5a8aa3

Fix isinteger at the input limits

9a5acef

Fix convert to Integer at the input limits

8c3e1d9

Actually, the original code should always work given that `isinteger` doesn't have overflow problems. I updated the code to match the rest of the recent changes but unfortunately it appears impossible to test.

Fix convert to Rational at the input limits

373e002

Improve performance of max_exp10

2bf80f2

Fix type parameter spacing

e61df92

Restricting precision based upon storage type

f1ad20e

Remove unneeded function

d927bdb

omus force-pushed the cv/limits branch from 197cca0 to d927bdb Compare May 24, 2017 20:50

Test constructor

007523b

TotalVerb reviewed May 29, 2017

View reviewed changes

max_exp10 now returns -1 for BigInt

859f054

Using -1 to indicate that the given type should not overflow.

TotalVerb approved these changes May 30, 2017

View reviewed changes

omus merged commit 859f054 into master May 31, 2017

omus added a commit that referenced this pull request May 31, 2017

Merge branch 'cv/limits' (#8)

61fcdb6

omus deleted the cv/limits branch May 31, 2017 14:58

omus mentioned this pull request May 31, 2017

Fix float conversion overflow problems #11

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid overflow when working with `powt` #8

Avoid overflow when working with `powt` #8

omus commented May 23, 2017

TotalVerb left a comment

TotalVerb May 23, 2017

TotalVerb May 23, 2017

omus May 23, 2017

TotalVerb May 23, 2017

TotalVerb May 23, 2017

omus May 23, 2017

TotalVerb May 23, 2017 •

edited

Loading

TotalVerb May 23, 2017

TotalVerb May 23, 2017

omus May 23, 2017

TotalVerb May 23, 2017

omus May 23, 2017

codecov-io commented May 23, 2017 •

edited

Loading

coveralls commented May 23, 2017 •

edited

Loading

coveralls commented May 23, 2017

coveralls commented May 23, 2017

omus commented May 23, 2017

TotalVerb commented May 23, 2017

TotalVerb commented May 23, 2017

omus commented May 23, 2017

TotalVerb commented May 23, 2017

omus commented May 24, 2017

coveralls commented May 24, 2017 •

edited

Loading

TotalVerb left a comment

TotalVerb May 29, 2017

omus May 29, 2017

TotalVerb May 29, 2017

omus May 29, 2017

omus commented May 29, 2017

TotalVerb left a comment •

edited

Loading

omus commented May 31, 2017

Avoid overflow when working with powt #8

Avoid overflow when working with powt #8

Conversation

omus commented May 23, 2017

TotalVerb left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

TotalVerb May 23, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-io commented May 23, 2017 • edited Loading

Codecov Report

coveralls commented May 23, 2017 • edited Loading

coveralls commented May 23, 2017

coveralls commented May 23, 2017

omus commented May 23, 2017

TotalVerb commented May 23, 2017

TotalVerb commented May 23, 2017

omus commented May 23, 2017

TotalVerb commented May 23, 2017

omus commented May 24, 2017

coveralls commented May 24, 2017 • edited Loading

TotalVerb left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

omus commented May 29, 2017

TotalVerb left a comment • edited Loading

Choose a reason for hiding this comment

omus commented May 31, 2017

Avoid overflow when working with `powt` #8

Avoid overflow when working with `powt` #8

TotalVerb May 23, 2017 •

edited

Loading

codecov-io commented May 23, 2017 •

edited

Loading

coveralls commented May 23, 2017 •

edited

Loading

coveralls commented May 24, 2017 •

edited

Loading

TotalVerb left a comment •

edited

Loading