Identity comparison of boxed values #1726

Praetonus · 2017-03-19T20:47:05Z

Previously, the identity comparison for boxed values worked by comparing the addresses of the boxes regardless of their contents. This was leading to confusing behaviour, for example

  let a: U32 = 0
  let b: U32 = 0
  let boxa: Any = a
  let boxb: Any = b
  a is b // True.
  boxa is boxb // False.

Now, the identity of a boxed value (either a numeric primitive or a tuple) depends on its content rather than its address. This affects the is, isnt and digestof operators. Several general changes were made to the code generator in order to make the identity comparison of a boxed value as efficient as possible.

When comparing a possibly boxed value to an unboxed value, the type of the box is checked first by looking at its type descriptor. If the types of the values are the same, the boxed value is unboxed and the comparison is done on the raw objects.

When comparing two possibly boxed values, the check depends on the underlying types.

First, the addresses are compared. If they are the same, the objects have the same identity regardless of whether they are boxed values or not
If the addresses are different, we check whether the objects are two boxed values of the same type. This is done by checking the type IDs with a method described afterwards
- If the values are two boxed numbers, the identity check is done with memcmp without unboxing
- If the values are two boxed tuples, the identity check is done by calling a type-specific __is function through the vtable. This function unboxes the tuples and does the identity check
- If the values aren't boxed or if they don't have the same type, they don't have the same identity.

When computing the digest of a possibly boxed value, we first check whether the value is boxed by checking the type ID with the same method as for the identity comparison. If the value is boxed, a type-specific __digestof function is called through the vtable. This function unboxes the value and computes the digest.

For both identity comparison and digest computing, the compiler generates the checks depending on the possible subtypes of the base types that it is looking at. This means that checking for the identity of an object that cannot be boxed (because the interface type has no numeric primitive/tuple as subtypes) doesn't involve checking whether it is boxed.

This commit changes how type IDs are assigned. Previously, they were assigned incrementally as types were reached without considering boxed types. Now, type IDs are assigned as

Object type IDs: 1, 3, 5, 7, 9, ...
Numeric type IDs: 0, 4, 8, 12, 16, ...
Tuple type IDs: 2, 6, 10, 14, 18, ...

There are two reasons for this

Speed. Checking whether a value is boxed or not consists in checking if its type ID is a multiple of 2, and further discriminating between a boxed number and a boxed tuple consists in checking if it is a multiple of 4. In machine code, this boils down to checking whether the first or second bit of the type ID are set, respectively
Forward compatibility. We won't have to change this method when dynamic code loading is added to Pony

The __is and __digestof functions introduce a change to the reachability algorithm. These functions are automatically added to numeric primitives and tuples, are generated by the compiler and are marked as "internal". An internal method isn't added to subtypes and since it doesn't exist before reachability analysis, it is added to supertypes in order to be visible when generating identity checks.

Closes #1567.

Praetonus · 2017-03-21T18:32:41Z

The Windows tests are crashing in CodegenTraceTest.TraceTupleDynamic. Would anyone with a Windows machine be able to run the test locally and post a the error location and a backtrace here?

chalcolith · 2017-03-21T21:57:30Z

Strange; when I build your branch on my windows 10 machine with VS2015 all tests pass for both release and debug.

Praetonus · 2017-03-21T22:05:51Z

This is very strange indeed... Do you know if it would be possible to print the segfault reason on AppVeyor? There is catchsegv on *nix but I'm not aware of any equivalent on Windows.

chalcolith · 2017-03-21T22:38:05Z

I'm not aware of any, and a quick google doesn't turn anything up.

Praetonus · 2017-03-22T03:38:46Z

I've been able to trace the source of the segfault with the remote desktop supplied by AppVeyor. It turns out it is an out of memory error. My patch on recycling memory through multiple runs of the runtime looks like it wasn't sufficient. I have a few ideas to fix it, I'll detail on the upcoming PR.

Praetonus · 2017-03-22T20:33:01Z

The tests should hopefully pass with the changes to the pool allocator in.

Praetonus · 2017-03-22T20:34:13Z

Well, not if I don't rebase...

Previously, the identity comparison for boxed values worked by comparing the addresses of the boxes regardless of their contents. This was leading to confusing behaviour, for example let a: U32 = 0 let b: U32 = 0 let boxa: Any = a let boxb: Any = b a is b // True. boxa is boxb // False. Now, the identity of a boxed value (either a numeric primitive or a tuple) depends on its content rather than its address. This affects the `is`, `isnt` and `digestof` operators. Several general changes were made to the code generator in order to make the identity comparison of a boxed value as efficient as possible. When comparing a possibly boxed value to an unboxed value, the type of the box is checked first by looking at its type descriptor. If the types of the values are the same, the boxed value is unboxed and the comparison is done on the raw objects. When comparing two possibly boxed values, the check depends on the underlying types. - First, the addresses are compared. If they are the same, the objects have the same identity regardless of whether they are boxed values or not - If the addresses are different, we check whether the objects are two boxed values of the same type. This is done by checking the type IDs with a method described afterwards - If the values are two boxed numbers, the identity check is done with `memcmp` without unboxing - If the values are two boxed tuples, the identity check is done by calling a type-specific `__is` function through the vtable. This function unboxes the tuples and does the identity check - If the values aren't boxed or if they don't have the same type, they don't have the same identity. When computing the digest of a possibly boxed value, we first check whether the value is boxed by checking the type ID with the same method as for the identity comparison. If the value is boxed, a type-specific `__digestof` function is called through the vtable. This function unboxes the value and computes the digest. For both identity comparison and digest computing, the compiler generates the checks depending on the possible subtypes of the base types that it is looking at. This means that checking for the identity of an object that cannot be boxed (because the interface type has no numeric primitive/tuple as subtypes) doesn't involve checking whether it is boxed. This commit changes how type IDs are assigned. Previously, they were assigned incrementally as types were reached without considering boxed types. Now, type IDs are assigned as - Object type IDs: 1, 3, 5, 7, 9, ... - Numeric type IDs: 0, 4, 8, 12, 16, ... - Tuple type IDs: 2, 6, 10, 14, 18, ... There are two reasons for this - Speed. Checking whether a value is boxed or not consists in checking if its type ID is a multiple of 2, and further discriminating between a boxed number and a boxed tuple consists in checking if it is a multiple of 4. In machine code, this boils down to checking whether the first or second bit of the type ID are set, respectively - Forward compatibility. We won't have to change this method when dynamic code loading is added to Pony The `__is` and `__digestof` functions introduce a change to the reachability algorithm. These functions are automatically added to numeric primitives and tuples, are generated by the compiler and are marked as "internal". An internal method isn't added to subtypes and since it doesn't exist before reachability analysis, it is added to supertypes in order to be visible when generating identity checks. Closes ponylang#1567.

Praetonus added the changelog - fixed Automatically add "Fixed" CHANGELOG entry on merge label Mar 19, 2017

Praetonus force-pushed the boxed-identity branch 3 times, most recently from ee2da97 to 680f389 Compare March 21, 2017 16:20

Praetonus added the do not merge This PR should not be merged at this time label Mar 21, 2017

Praetonus force-pushed the boxed-identity branch from a46fbe3 to c1f74e0 Compare March 21, 2017 18:11

Praetonus force-pushed the boxed-identity branch from 2efa0e3 to a6f7c74 Compare March 22, 2017 02:57

Praetonus mentioned this pull request Mar 22, 2017

Change to dynamic memory allocation/commit on windows #1647

Closed

Praetonus force-pushed the boxed-identity branch from a6f7c74 to c1f74e0 Compare March 22, 2017 20:31

Praetonus force-pushed the boxed-identity branch from c1f74e0 to 6a86a39 Compare March 22, 2017 20:34

Praetonus removed the do not merge This PR should not be merged at this time label Mar 22, 2017

jemc merged commit b99d5b6 into ponylang:master Mar 22, 2017

ponylang-main added a commit that referenced this pull request Mar 22, 2017

Update CHANGELOG for PR #1726 [skip ci]

02c2f5e

Praetonus deleted the boxed-identity branch March 22, 2017 23:27

SeanTAllen mentioned this pull request Mar 27, 2017

32-bit linux compilation error #1760

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Identity comparison of boxed values #1726

Identity comparison of boxed values #1726

Praetonus commented Mar 19, 2017

Praetonus commented Mar 21, 2017

chalcolith commented Mar 21, 2017

Praetonus commented Mar 21, 2017

chalcolith commented Mar 21, 2017

Praetonus commented Mar 22, 2017

Praetonus commented Mar 22, 2017

Praetonus commented Mar 22, 2017

Identity comparison of boxed values #1726

Identity comparison of boxed values #1726

Conversation

Praetonus commented Mar 19, 2017

Praetonus commented Mar 21, 2017

chalcolith commented Mar 21, 2017

Praetonus commented Mar 21, 2017

chalcolith commented Mar 21, 2017

Praetonus commented Mar 22, 2017

Praetonus commented Mar 22, 2017

Praetonus commented Mar 22, 2017