Draft: MAINT: Improve performance of common_type and use result_type internally #24656
Conversation
```diff
-        return array_type[1][precision]
-    else:
-        return array_type[0][precision]
+    common_type = _multiarray_umath.result_type(*arrays).type
```
OK, this will possibly make it a bit slower, but I think I would like to:

- Use `*(a.dtype for a in arrays)`, because unfortunately `result_type` supports both array and dtype inputs, which would broaden the signature here.
- We can add `0.0` as an additional argument to `result_type()` to enforce the `integer -> float64` transition.

Unfortunately (in terms of compatibility), this will improve the situation when mixing integers and floats: `float32 + int16` will actually promote to stay `float32`, while here any integer forces `float64` precision.
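The effect of prepending `0.0` can be checked directly with the public `np.result_type` (a minimal sketch; the PR itself calls the private `_multiarray_umath` binding):

```python
import numpy as np

# Prepending 0.0 still forces pure-integer inputs up to a floating type,
# while an existing float32 dtype can survive the promotion; the exact
# result types depend on the active promotion rules (legacy value-based
# vs. NEP 50 weak) and the NumPy version, so none are asserted here.
pure_int = np.result_type(0.0, np.dtype(np.int16))
mixed = np.result_type(0.0, np.dtype(np.int16), np.dtype(np.float32))
print(pure_int, mixed)
```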
- This is to make sure people do not pass a `dtype` directly to `common_type`? It does make the code a bit slower, but the overhead is still OK.
- We could use `common_type = _multiarray_umath.result_type(0., *arrays).type` and skip the check on integer types. A pure int then goes to `float64` (as on current main), but a disadvantage is that `float32 + float32` now goes to `float64` (where before it would be `float32`).
With

```python
def common_type_with_zero(*arrays):
    common_type = _multiarray_umath.result_type(0., *[a.dtype for a in arrays]).type
    if not issubclass(common_type, _nx.inexact):
        raise TypeError("can't get common type for non-numeric array")
    return common_type
```

and

```python
i16 = np.array(2, dtype=np.int16)
i32 = np.array(2, dtype=np.int32)
f32 = np.array(2, dtype=np.float32)

for c in [common_type_original, common_type_pr24467, common_type_pr, common_type_with_zero]:
    print(f'\n# {c}')
    print(f'c(i16, i32): {c(i16, i32)}')
    print(f'c(i16, f32): {c(i16, f32)}')
    print(f'c(f32, f32): {c(f32, f32)}')
```
I get

```
# <function common_type_original at 0x000001EE32CE5F80>
c(i16, i32): <class 'numpy.float64'>
c(i16, f32): <class 'numpy.float64'>
c(f32, f32): <class 'numpy.float32'>

# <function common_type_pr24467 at 0x000001EE32CE5800>
c(i16, i32): <class 'numpy.float64'>
c(i16, f32): <class 'numpy.float64'>
c(f32, f32): <class 'numpy.float32'>

# <function common_type_pr at 0x000001EE32CE51C0>
c(i16, i32): <class 'numpy.float64'>
c(i16, f32): <class 'numpy.float32'>
c(f32, f32): <class 'numpy.float32'>

# <function common_type_with_zero at 0x000001EE32CE5DA0>
c(i16, i32): <class 'numpy.float64'>
c(i16, f32): <class 'numpy.float64'>
c(f32, f32): <class 'numpy.float64'>
```
Performance for main, PR24467, this PR, and the variation with zero:

```
1.24 µs ± 79.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
740 ns ± 55.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
333 ns ± 16.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
998 ns ± 20 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```
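The numbers above look like `%timeit` output; a stand-alone equivalent can be sketched with `timeit` (the benchmarked call and the input arrays here are assumptions, not the exact script used):

```python
import timeit

import numpy as np

def common_type_with_zero(*arrays):
    # The "with zero" variant under discussion: prepend 0. and pass dtypes.
    common = np.result_type(0., *[a.dtype for a in arrays]).type
    if not issubclass(common, np.inexact):
        raise TypeError("can't get common type for non-numeric array")
    return common

f32 = np.array(2, dtype=np.float32)
n = 100_000
t = timeit.timeit(lambda: common_type_with_zero(f32, f32), number=n)
print(f"{t / n * 1e9:.0f} ns per call")  # machine-dependent
```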
At this point I have a slight preference for PR 24467, but it is a bit of an apples-and-oranges comparison.
To me, those results look like you ran it with `result_type(0, *arrays)` and not `result_type(0., *[a.dtype for a in arrays])`. But yes, you still get the change for integers, which is maybe for the better, but it is a change.
This really was with `result_type(0., *[a.dtype for a in arrays])`. For `result_type(0, *[a.dtype for a in arrays])` the case `common_type(np.array([0]), np.array([0]))` results in a `TypeError`.
@eendebakpt not sure it matters given the other problems, but you wrote `0` rather than `0.` there, which is very different. I just tried it to cover my ground triply: `common_type_with_zero` will give `f32` for `f32, f32`. (The problem is `i16, f32`.)
@eendebakpt I am still confused: at the least, `np.result_type(0.0, np.int16, np.float32)` gives `float32`, which is a change in behavior, no (or maybe I am mixing up which version is which)?
I am almost tempted to live with that change of behavior, since we have a lot of other promotion-related changes, but I have to think about it a bit.
(I suppose my ask is just to confirm that it is a change in behavior. In a sense, the old behavior is more like `result_type(*(result_type(0.0, a.dtype) for a in arrays))`...)
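The distinction can be made concrete; a sketch contrasting the flat call with the per-array variant described above (no expected outputs shown, since the results depend on the active promotion rules and NumPy version):

```python
import numpy as np

arrays = [np.array(2, dtype=np.int16), np.array(2, dtype=np.float32)]

# Flat: 0.0 takes part in one combined promotion over all dtypes at once.
flat = np.result_type(0.0, *(a.dtype for a in arrays))

# Per-array: each dtype is first promoted together with 0.0 (so int16 on its
# own already becomes a floating type), and only then are the intermediate
# results combined -- closer to the old table-based common_type behavior.
per_array = np.result_type(*(np.result_type(0.0, a.dtype) for a in arrays))

print(flat, per_array)
```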
- `np.result_type(0.0, np.int16, np.float32)` gives `float32` for promotion state `legacy` and `float64` for promotion state `weak`. But that change is unrelated to this or any of the other PRs (the PRs do not change `result_type` itself).
- For `common_type`, the output of `common_type(i16, f32)` is `float64` on main (for both weak and legacy promotion rules). For this PR the output of `common_type(i16, f32)` is `float32` (a behavior change). For MAINT: Improve performance of common_type and use result_type internally #24668 the output is `float64` (so no behaviour change).

Summary:

- MAINT: Optimize performance of np.common_type #24467: pure optimization, no behaviour changes at all.
- Draft: MAINT: Improve performance of common_type and use result_type internally #24656: fastest option, uses `result_type` internally; behaviour change for `i16, f32`, and it does broaden the signature.
- MAINT: Improve performance of common_type and use result_type internally #24668: uses `result_type` internally, which simplifies the code. The only behaviour change observed is for `datetime64`.
> np.result_type(0.0, np.int16, np.float32) gives float32 for promotion state legacy and float64 for promotion state weak.

What the heck! That looks wrong; that result shouldn't change with weak promotion! Some code path is mistriggering and thinking this cannot have any weak promotion at all, or so.
@seberg Not sure whether this helps, but the output depends on how the `i16` and `f32` values are passed (as numpy scalar, numpy array or dtype):

```python
import numpy as np
print(np, np.__version__)

i16scalar = np.int16(2)
f32scalar = np.float32(3.)
i16array = np.int16([2, 2])
f32array = np.float32([3., 3.])

for s in ['weak', 'legacy']:
    print(s)
    np._set_promotion_state(s)
    print('np.result_type(0.0, i16scalar, f32scalar)', np.result_type(0.0, i16scalar, f32scalar))
    print('np.result_type(0.0, i16array, f32array)', np.result_type(0.0, i16array, f32array))
    print('np.result_type(0.0, i16.dtype, f32.dtype)', np.result_type(0.0, i16scalar.dtype, f32scalar.dtype))
    print('np.result_type(0.0, f32)', np.result_type(0.0, f32scalar))
```
@seberg I am going to close this PR in favor of #24668 (which includes your two suggestions). The performance of this PR is a bit better, but maybe not worth the downsides. The only usage of `np.common_type` is in `np.polynomial.polyutils`, and there we could just inline `np.result_type` if needed.

I wrote earlier that `np.result_type(0.0, np.int16, np.float32)` gives `float32` for promotion state `legacy` and `float64` for promotion state `weak`, but actually it is the other way around. For posterity: on current main these are the promotion rules:
```python
import numpy as np
print(np, np.__version__)

i16scalar = np.int16(2)
f32scalar = np.float32(3.)
i16array = np.int16([2, 2])
f32array = np.float32([3., 3.])

for s in ['weak', 'legacy']:
    print(f'\n\n### promotion {s}')
    np._set_promotion_state(s)
    print('np.result_type(i16scalar, f32scalar)', np.result_type(i16scalar, f32scalar))
    print('np.result_type(i16array, f32array)', np.result_type(i16array, f32array))
    print('np.result_type(i16.dtype, f32.dtype)', np.result_type(i16scalar.dtype, f32scalar.dtype))
    print('np.result_type(f32)', np.result_type(f32scalar))
    print('np.result_type(0.0, i16scalar, f32scalar)', np.result_type(0.0, i16scalar, f32scalar))
    print('np.result_type(0.0, i16array, f32array)', np.result_type(0.0, i16array, f32array))
    print('np.result_type(0.0, i16.dtype, f32.dtype)', np.result_type(0.0, i16scalar.dtype, f32scalar.dtype))
    print('np.result_type(0.0, f32)', np.result_type(0.0, f32scalar))
    print('np.result_type(0.0, np.int32, np.float32)', np.result_type(0.0, np.int32, np.float32))
    print()
    print('np.result_type(0., i16scalar, f32scalar)', np.result_type(0., i16scalar, f32scalar))
    print('np.result_type(0., i16scalar.dtype, f32scalar.dtype)', np.result_type(0., i16scalar.dtype, f32scalar.dtype))
    print('np.result_type(0.,i16array, f32array)', np.result_type(0., i16array, f32array))
    print('np.result_type(0.,i16array.dtype, f32array.dtype)', np.result_type(0., i16array.dtype, f32array.dtype))
```
Results in:

```
### promotion weak
np.result_type(i16scalar, f32scalar) float32
np.result_type(i16array, f32array) float32
np.result_type(i16.dtype, f32.dtype) float32
np.result_type(f32) float32
np.result_type(0.0, i16scalar, f32scalar) float32
np.result_type(0.0, i16array, f32array) float32
np.result_type(0.0, i16.dtype, f32.dtype) float32
np.result_type(0.0, f32) float32
np.result_type(0.0, np.int32, np.float32) float64

np.result_type(0., i16scalar, f32scalar) float32
np.result_type(0., i16scalar.dtype, f32scalar.dtype) float32
np.result_type(0.,i16array, f32array) float32
np.result_type(0.,i16array.dtype, f32array.dtype) float32

### promotion legacy
np.result_type(i16scalar, f32scalar) float32
np.result_type(i16array, f32array) float32
np.result_type(i16.dtype, f32.dtype) float32
np.result_type(f32) float32
np.result_type(0.0, i16scalar, f32scalar) float64
np.result_type(0.0, i16array, f32array) float32
np.result_type(0.0, i16.dtype, f32.dtype) float32
np.result_type(0.0, f32) float64
np.result_type(0.0, np.int32, np.float32) float64

np.result_type(0., i16scalar, f32scalar) float64
np.result_type(0., i16scalar.dtype, f32scalar.dtype) float32
np.result_type(0.,i16array, f32array) float32
np.result_type(0.,i16array.dtype, f32array.dtype) float32
```
Closing in favor of #24668.
Refactor `common_type` to use `result_type` internally. The performance is better than in the alternative PR #24467.

Results (not very stable, measured on Windows): with script
The refactoring of `np.common_type` is not fully backwards compatible. In particular, the polynomial classes break with coefficients of type `np.timedelta64` (but that might be acceptable).
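The `np.timedelta64` breakage has a plausible explanation in NumPy's scalar type hierarchy: `np.timedelta64` is a subclass of `np.signedinteger`, so the old table-based `common_type` treated it as an integer (returning `float64`), whereas a `result_type`-based version runs into an invalid promotion. A small check of both facts:

```python
import numpy as np

# timedelta64 sits under signedinteger in the scalar type hierarchy, which is
# why the integer branch of the old common_type caught it.
print(issubclass(np.timedelta64, np.signedinteger))  # True

# result_type refuses to promote timedelta64 with a float dtype.
try:
    np.result_type(np.dtype('timedelta64[s]'), np.dtype(np.float64))
except TypeError as exc:
    print("promotion failed:", exc)
```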