Tidy up memmove/memcpy APIs #7737

gmarkall · 2022-01-13T21:35:16Z

This cleans up the memcpy / memmove APIs and implementation.

There is now only one memcpy implementation, instead of one using the intrinsic and one using a loop - the implementation always uses the intrinsic now.
The align parameter is removed, which was probably a vestige from an earlier LLVM version and only served to confuse as it was not used.
The APIs permit using a form where either the length is supplied, or a count and item size are supplied - if the count and item size are supplied, an appropriate multiplication is generated.
Updated docstrings for these functions.
Replaced uses of raw_memcpy / raw_memmove with memcpy / memmove.
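
As a rough illustration of the two calling forms described above, the sketch below resolves a byte length either directly or as count times item size, then performs the copy with `ctypes.memmove`. The name `copy_bytes` and its keyword arguments are hypothetical; they are not the signatures in numba/core/cgutils.py, which emit LLVM IR rather than copying memory from Python.

```python
import ctypes


def copy_bytes(dst, src, length=None, count=None, itemsize=None):
    """Copy bytes from src to dst (hypothetical helper, not Numba's API).

    Exactly one of the two forms must be used: either `length` in bytes,
    or `count` together with `itemsize`.
    """
    if length is not None:
        if count is not None or itemsize is not None:
            raise ValueError("pass either length, or count and itemsize")
        nbytes = length
    elif count is not None and itemsize is not None:
        # The count/itemsize form needs a multiplication to get bytes.
        nbytes = count * itemsize
    else:
        raise ValueError("pass either length, or count and itemsize")
    ctypes.memmove(dst, src, nbytes)
    return nbytes


src = (ctypes.c_int32 * 4)(1, 2, 3, 4)
dst_a = (ctypes.c_int32 * 4)()
dst_b = (ctypes.c_int32 * 4)()

# Form 1: the length in bytes is supplied directly.
copy_bytes(dst_a, src, length=ctypes.sizeof(src))
# Form 2: a count and an item size are supplied and multiplied.
copy_bytes(dst_b, src, count=4, itemsize=ctypes.sizeof(ctypes.c_int32))
```

Both calls copy the same 16 bytes; the second form only differs in who performs the multiplication.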

gmarkall · 2022-01-14T09:46:47Z

gpuci run tests

gmarkall · 2022-01-14T10:06:03Z

The gpuCI failure is unrelated to this PR: it is an unexpected success caused by the latest CUDA Python bindings release fixing an upstream issue (profiling APIs are now available). I'll be making a separate PR to fix this. Otherwise, all tests pass, so this is ready for review now.

gmarkall · 2022-01-25T23:40:00Z

gpuci run tests

stuartarchibald

Thanks for undertaking this refactoring. The interface is a lot more consistent with the signatures and also with the functions themselves now! I've left a few small comments to address inline but otherwise looks good. Thanks again!

stuartarchibald · 2022-02-02T15:11:11Z

numba/core/cgutils.py

-        in_ptr = builder.gep(src, [loop.index])
-        builder.store(builder.load(in_ptr), out_ptr)
+    if dst.type != src.type:
+        msg = f'memcpy requires the same types; got {dst.type} and {src.type}'


Suggested change

msg = f'memcpy requires the same types; got {dst.type} and {src.type}'

msg = f'memcpy requires the same types; got source type {src.type} and destination type {dst.type}'

would it be useful to specify which way around these are?

Yes, I think so. I'll keep it destination-first though as that's the first argument to memcpy.

stuartarchibald · 2022-02-02T15:12:31Z

numba/core/cgutils.py

-        builder.store(builder.load(in_ptr), out_ptr)
+    if dst.type != src.type:
+        msg = f'memcpy requires the same types; got {dst.type} and {src.type}'
+        raise TypingError(msg)


Is this a typing problem? I think this should be a lowering error, since the caller has passed in invalid LLVM types?

Also, should this be tested?

Now I've looked into it, there's no requirement in LLVM for the types to be the same (https://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic) - this was only needed by the old hand-rolled implementation. In this PR I added the check to enforce the description in the docstring, but now that I see it's not necessary, I'll remove it from this PR.
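
The point that a byte-wise copy does not need matching element types can be seen by analogy in plain Python. This is a sketch using `ctypes.memmove`, not the LLVM intrinsic itself, and the buffer names are made up for the example.

```python
import ctypes

# Source: two 32-bit integers. Destination: eight raw bytes.
src = (ctypes.c_uint32 * 2)(0x01020304, 0x05060708)
dst = (ctypes.c_uint8 * 8)()

# The copy is defined purely in terms of bytes, so differing element
# types are not a problem as long as the destination is large enough.
assert ctypes.sizeof(dst) >= ctypes.sizeof(src)
ctypes.memmove(dst, src, ctypes.sizeof(src))

# The destination now holds the same bytes as the source.
same_bytes = bytes(dst) == bytes(src)
```

The old loop-based implementation loaded and stored whole elements, which is why it needed the two pointer types to agree.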

stuartarchibald · 2022-02-02T15:14:11Z

numba/core/cgutils.py

+    Source and destination types can be any type, but the types must be the
+    same.


Should this contain a similar check for source type matching dest type as present in the memcpy impl above?

No, this implementation was OK (although the docstring is wrong, and will be updated).

stuartarchibald · 2022-02-02T15:16:40Z

numba/core/cgutils.py

+    if isinstance(itemsize, ir.Constant) and itemsize.constant == 1:
+        length = count
+    else:
+        length = builder.mul(count, itemsize)


I think LLVM would just optimise away a multiplication by a constant one, but I think it's fine to leave this as-is too.

Now you mention it I do think it would be better to have simpler code here, so I'll make it just always generate the mul instead of trying to be clever.

stuartarchibald · 2022-02-02T15:19:55Z

numba/core/cgutils.py

+    else:
+        length = builder.mul(count, itemsize)
+
+    memcpy = builder.module.declare_intrinsic(func_name,


Suggested change

memcpy = builder.module.declare_intrinsic(func_name,

func_impl = builder.module.declare_intrinsic(func_name,

(and refactor, this interface is intended to be generic for memcpy and memmove)?

Will refactor to make this generic.
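
One way to picture a helper that is generic over memcpy and memmove is to treat the intrinsic name as a parameter. The sketch below only builds the overloaded intrinsic name as it appears in typed-pointer LLVM IR; `mem_intrinsic_name` is a hypothetical function, and it assumes both pointers are `i8*` in address space 0. In Numba itself the name mangling is left to `declare_intrinsic`.

```python
def mem_intrinsic_name(func_name, size_bits=64):
    """Build the overloaded intrinsic name for a typed-pointer LLVM.

    Hypothetical helper: `func_name` is 'llvm.memcpy' or 'llvm.memmove',
    and both pointer arguments are assumed to be i8* in address space 0.
    """
    if func_name not in ("llvm.memcpy", "llvm.memmove"):
        raise ValueError(f"unsupported intrinsic: {func_name}")
    # The suffix encodes the destination, source and length types.
    return f"{func_name}.p0i8.p0i8.i{size_bits}"


names = [mem_intrinsic_name(n) for n in ("llvm.memcpy", "llvm.memmove")]
```

Because only the name differs between the two intrinsics, a variable called `memcpy` would be misleading in code that also serves memmove, which is the reason for the suggested rename to `func_impl`.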

gmarkall · 2022-03-09T14:51:00Z

gpuci run tests

This issue occurred after the merge from main due to logically-conflicting (but not conflicting in the sense of the merge) changes in main and the issue-7734 branch.

gmarkall · 2022-03-09T15:39:20Z

gpuci run tests

gmarkall · 2022-03-09T16:24:35Z

gpuci run tests

gmarkall · 2022-03-09T16:46:13Z

gpuci run tests

gmarkall · 2022-03-09T17:21:17Z

@stuartarchibald Many thanks for the review and comments. Everything is fixed up from your review, but as can be seen from gpuCI there's a new issue from things that were added to main since I made this PR that I've not yet resolved. Annoyingly, it passes on my machines and only fails on gpuCI, so it's taking a little longer than I hoped to resolve this.

gmarkall · 2022-03-10T10:09:43Z

gpuci run tests

gmarkall · 2022-03-11T17:30:41Z

OK, I got to the bottom of this issue. Factors involved in the problem / fix are:

This PR replaces all uses of memcpy in Numba with calls to the @llvm.memcpy intrinsic.
The @llvm.memcpy intrinsic seems to assume its arguments are aligned to the alignment of an i8, which varies across architectures. (The docs are quite vague about this; they do not state what is assumed when the alignment is not specified - https://llvm.org/docs/LangRef.html#llvm-memcpy-intrinsic)
We need to specify alignment of the source and destination parameters to prevent this assumption.
In order to specify alignment, we need to add the align parameter attribute to the source and destination parameters.
llvmlite does not support the align parameter attribute.
The form of memcpy and memcmp differs between LLVM 3.4 and 7.0, so for CUDA Toolkits less than 11.2, an additional transformation needs adding to the NVVM IR to translate the parameter alignment attributes into an additional parameter (it looks like no code has ever generated @llvm.memcpy in conjunction with the CUDA target before).
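
One situation where an alignment assumption can break is a packed record, where a field does not start at a multiple of its own size. The sketch below shows this with the standard `struct` module; it is an illustration in plain Python with made-up field values, not the failing Numba test.

```python
import struct

PACKED = "=BI"   # standard sizes, no padding between fields
NATIVE = "@BI"   # native sizes and alignment, padding allowed

# In the packed layout the 4-byte field starts right after the 1-byte field.
packed_offset = struct.calcsize("=B")
record = struct.pack(PACKED, 7, 0xDEADBEEF)
(value,) = struct.unpack_from("=I", record, packed_offset)

# An address at this offset is only guaranteed to be 1-byte aligned, so a
# copy of the field must not assume the natural 4-byte alignment.
field_size = struct.calcsize("=I")
is_naturally_aligned = packed_offset % field_size == 0

# The native layout may add padding, so it is never smaller than the packed one.
padding = struct.calcsize(NATIVE) - struct.calcsize(PACKED)
```

Stating an alignment of 1 on both pointers is the conservative choice for data laid out like this, at the cost of giving the backend less to work with for naturally aligned data.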

So, the next steps to get this finished are:

Add support for the align parameter attribute in llvmlite.
Modify the implementation of _raw_mem_intrinsic to add alignment parameter attributes to the source and destination parameters.
Add a transform to rewrite the @llvm.memcpy intrinsic as-needed for NVVM 3.4.
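
The third step could look roughly like the text rewrite sketched below. It assumes the newer call form carries `align` parameter attributes on the two pointers, and that the older form instead takes a single `i32` alignment argument before the `i1` volatile flag. The function `to_legacy_form` and the regular expression are hypothetical, cover only simple `i8*` calls with `%`-named operands, and are not code from Numba; the intrinsic declaration would need an equivalent rewrite.

```python
import re

# Matches a memcpy/memmove call whose pointer arguments carry align attributes.
_CALL = re.compile(
    r"(?P<head>call void @llvm\.mem(?:cpy|move)\.p0i8\.p0i8\.i(?:32|64)\()"
    r"i8\* align (?P<dalign>\d+) (?P<dst>%[\w.]+), "
    r"i8\* align (?P<salign>\d+) (?P<src>%[\w.]+), "
    r"(?P<len>i(?:32|64) [%\w.]+), "
    r"(?P<vol>i1 \w+)\)"
)


def to_legacy_form(ir_line):
    """Move alignment from parameter attributes into an i32 argument."""
    def repl(m):
        # A single alignment argument must hold for both pointers,
        # so take the smaller of the two.
        align = min(int(m["dalign"]), int(m["salign"]))
        return (f"{m['head']}i8* {m['dst']}, i8* {m['src']}, "
                f"{m['len']}, i32 {align}, {m['vol']})")
    return _CALL.sub(repl, ir_line)


new_form = ("  call void @llvm.memcpy.p0i8.p0i8.i64("
            "i8* align 1 %dst, i8* align 4 %src, i64 %n, i1 false)")
old_form = to_legacy_form(new_form)
```

Lines that do not match the pattern are returned unchanged, so the rewrite can be applied to every line of a module.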

gmarkall · 2022-06-13T09:59:47Z

Rather than attempt to fix support for NVVM 3.4, instead I'll wait to revisit this when NVVM 3.4 support is dropped.

Remove unused align parameter from raw memcpy/memmove

81bbe30

gmarkall added the 2 - In Progress label Jan 13, 2022

gmarkall added this to the Numba 0.56 RC milestone Jan 13, 2022

gmarkall force-pushed the issue-7734 branch from 405eef0 to 7f15490 Compare January 13, 2022 22:46

gmarkall added 2 commits January 13, 2022 22:58

Use raw_memcpy for memcpy

d704ac2

Refactor raw_memmove and rename to memmove

f266be5

gmarkall force-pushed the issue-7734 branch from 7f15490 to d704ac2 Compare January 13, 2022 22:58

gmarkall mentioned this pull request Jan 13, 2022

memcpy_region looks suspiciously wrong #7734

Open

Add to mem{cpy,move} docs and only generate mul when needed

e37d8a3

gmarkall mentioned this pull request Jan 14, 2022

Fixing issue 7693 #7712

Merged

gmarkall marked this pull request as ready for review January 14, 2022 10:06

gmarkall requested review from esc, sklam and stuartarchibald as code owners January 14, 2022 10:06

gmarkall changed the title ~~[WIP] Tidy up memmove/memcpy APIs~~ Tidy up memmove/memcpy APIs Jan 14, 2022

gmarkall added the 3 - Ready for Review and Effort - medium (Medium size effort needed) labels and removed the 2 - In Progress label Jan 17, 2022

stuartarchibald reviewed Feb 2, 2022

stuartarchibald added the 4 - Waiting on author (Waiting for author to respond to review) label and removed the 3 - Ready for Review label Feb 2, 2022

gmarkall added 5 commits March 9, 2022 14:21

Merge remote-tracking branch 'numba/main' into issue-7734

b22954e

Clarify source and destination times in memcpy error message

966e050

Remove unnecessary constraint on memcpy types

e858073

Simplify generation of length in _raw_mem_intrinsic

f2cb17f

Refactor / simplify _raw_mem_intrinsic

bcee6c3

Fix use of memcpy in record_setattr

3882c0a

This issue occurred after the merge from main due to logically-conflicting (but not conflicting in the sense of the merge) changes in main and the issue-7734 branch.

Align recordwith2darray in test_record_dtype

ba54615

Try aligning memcpy args to 1 byte

91aaa4f

sklam modified the milestones: Numba 0.56 RC, PR Backlog Jun 1, 2022

gmarkall added the Blocked awaiting long term feature (For PRs/Issues that require the implementation of a long term plan feature) label and removed the 4 - Waiting on author (Waiting for author to respond to review) label Jun 13, 2022
