Skip to content

Commit

Permalink
[SYCL][Doc] Remove masked_sub_group from proposal (intel#8308)
Browse files Browse the repository at this point in the history
During implementation, it became clear that the differences between the
behaviors of masked_sub_group and other SYCL groups would require
developers to special-case too many functions. This is at odds with the
goal to support a generic group interface.

The functionality originally provided by masked_sub_group will be
provided via a different mechanism.

Signed-off-by: John Pennycook <john.pennycook@intel.com>
  • Loading branch information
Pennycook authored Feb 10, 2023
1 parent 68214a7 commit a5d62c8
Showing 1 changed file with 5 additions and 219 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,6 @@ SYCL specification refer to that revision.

This extension also depends on the following other SYCL extensions:

* link:https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_oneapi_sub_group_mask.asciidoc[sycl_ext_oneapi_sub_group_mask]
* link:https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_root_group.asciidoc[sycl_ext_oneapi_root_group]


Expand Down Expand Up @@ -143,7 +142,6 @@ following user-constructed groups:
- `cluster_group`
- `tangle_group`
- `opportunistic_group`
- `masked_sub_group`

The `is_fixed_topology_group` and `is_user_constructed_group` traits can be
used to detect whether a group type represents a fixed topology or
Expand Down Expand Up @@ -175,12 +173,11 @@ namespace sycl::ext::oneapi::experimental {
`root_group`, `group` or `sub_group`.

`is_user_constructed_group<T>::value` is `std::true_type` if `T` is one of:
`ballot_group`, `cluster_group`, `tangle_group`, `opportunisic_group` or
`masked_sub_group`.
`ballot_group`, `cluster_group`, `tangle_group`, or `opportunisic_group`.

Additionally, the `is_group<T>::value` trait from SYCL 2020 is `std::true_type`
if `T` is one of: `ballot_group`, `cluster_group`, `tangle_group`,
`opportunistic_group` or `masked_sub_group`.
if `T` is one of: `ballot_group`, `cluster_group`, `tangle_group`, or
`opportunistic_group`.


=== Group Functions and Algorithms
Expand All @@ -199,13 +196,13 @@ make assumptions regarding work-item scheduling and forward progress
guarantees.

The following group functions support the `ballot_group`, `cluster_group`,
`tangle_group`, `opportunistic_group` and `masked_sub_group` group types:
`tangle_group`, and `opportunistic_group` group types:

* `group_barrier`
* `group_broadcast`

The following group algorithms support `ballot_group`, `cluster_group`,
`tangle_group`, `opportunistic_group` and `masked_sub_group` group types:
`tangle_group`, and `opportunistic_group` group types:

* `joint_any_of` and `any_of_group`
* `joint_all_of` and `all_of_group`
Expand Down Expand Up @@ -898,217 +895,6 @@ int atomic_aggregate_inc(sycl::sub_group sg, sycl::atomic_ref<int, Order, Scope,
----


=== Masked Sub-groups

A masked sub-group is a non-contiguous subset of a sub-group, representing an
arbitrary user-defined subset of work-items. The members of a masked sub-group
are described by a bitmask, where a 1 denotes membership of the group.

The work-items within a masked sub-group retain information about the original
sub-group, and many member functions of the `masked_sub_group` class reflect
this. Developers are strongly recommended to use other user-constructed groups
that match their use-case, both for improved performance and a simplified
mental model.

NOTE: Masked sub-groups exist primarily to support experimentation with
arbitrary subsets of work-items within a sub-group, and to support the
migration of algorithms already expressed via masks.


==== Creation

Masked sub-groups are created by calls to the `get_masked_sub_group()`
function, which applies a bitmask to an existing sub-group.

NOTE: Creating a masked sub-group does not require a barrier across all
work-items in the parent sub-group or introduce any sychronization, since
work-items can independently identify members directly from the specified
membership mask.

[source, c++]
----
namespace ext::oneapi::experimental {
masked_sub_group get_masked_sub_group(sub_group sg, sub_group_mask mask);
} // namespace ext::oneapi::experimental
----

_Preconditions_: All work-items in `sg` with a corresponding bit set in `mask`
must encounter this function in converged control flow.

_Returns_: A `masked_sub_group` consisting of the work-items in `sg` with a
corresponding bit set in `mask`.


==== `masked_sub_group` Class

The `masked_sub_group` class contains an additional `get_mask()` function,
returning the membership mask. Since the other member functions of
`masked_sub_group` reflect the original sub-group, developers must use this
mask to reason about the local numbering of work-items within the group.

[source, c++]
----
namespace sycl::ext::oneapi::experimental {
class masked_sub_group {
public:
using id_type = id<1>;
using range_type = range<1>;
using linear_id_type = uint32_t;
static constexpr int dimensions = 1;
static constexpr sycl::memory_scope fence_scope =
sycl::memory_scope::sub_group;
id_type get_group_id() const;
id_type get_local_id() const;
range_type get_group_range() const;
range_type get_local_range() const;
linear_id_type get_group_linear_id() const;
linear_id_type get_local_linear_id() const;
linear_id_type get_group_linear_range() const;
linear_id_type get_local_linear_range() const;
bool leader() const;
sub_group_mask get_mask() const;
};
}
----

[source,c++]
----
id_type get_group_id() const;
----
_Returns_: An `id` representing the index of the sub-group within the
parent work-group.

[source,c++]
----
id_type get_local_id() const;
----
_Returns_: An `id` representing the calling work-item's position within
the sub-group.

[source,c++]
----
range_type get_group_range() const;
----
_Returns_: A `range` representing the number of sub-groups within the parent
work-group.

[source,c++]
----
range_type get_local_range() const;
----
_Returns_: A `range` representing the number of work-items in the sub-group.

[source,c++]
----
id_type get_group_linear_id() const;
----
_Returns_: A linearized version of the `id` returned by `get_group_id()`.

[source,c++]
----
id_type get_local_linear_id() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_linear_id()`.

[source,c++]
----
range_type get_group_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_group_range()`.

[source,c++]
----
range_type get_local_linear_range() const;
----
_Returns_: A linearized version of the `id` returned by `get_local_range()`.

[source,c++]
----
bool leader() const;
----
_Returns_: `true` for exactly one work-item in the masked sub-group, if the
calling work-item is the leader of the masked sub-group, and `false` for all
other work-items in the masked sub-group. The leader of the masked sub-group
is guaranteed to be the work-item corresponding to the least-significant bit in
the mask.

[source,c++]
----
sub_group_mask get_mask() const;
----
_Returns_: A `sub_group_mask` representing which work-items from the sub-group
are considered a member of this `masked_sub_group`.


==== Usage Example

A `masked_sub_group` can be used to implement algorithms where a membership
mask is already present or easily computed:

[source, c++]
----
// set initial mask to full sub-group
auto sg = it.get_sub_group();
auto active = std::pow(2, sg.get_max_local_range()) - 1;
float sum = x;
for (int shift = sg.get_max_local_range() / 2; shift > 0; shift /= 2)
{
// create representation of work-items still active in this phase
auto masked_sg = sycl::ext::oneapi::experimental::get_masked_sub_group(sg, active);
// call shift only for work-items that are still active
// using the parent sub_group would have been unsafe due to divergence
sum += sycl::shift_group_left(masked_sg, x, shift);
// remove half of the work-items from the group
active >>= shift;
}
----

Note that in many cases these algorithms can be translated (manually) to use
one of the alternative group types:

[source, c++]
----
// set initial mask to full sub-group
auto sg = it.get_sub_group();
float sum = x;
for (int phase = 1; phase < sg.get_max_local_range() / 2; phase *= 2)
{
// create representation of work-items still active in this phase
auto active_group = sycl::ext::oneapi::experimental::get_tangle_group(sg);
// call shift only for work-items that are still active
// note that the shift is now 1, because of how tangle-group local IDs are defined
sum += sycl::shift_group_left(active_group, x, 1);
}
----

Or, even more simply, one of the SYCL group algorithms:

[source, c++]
----
auto sg = it.get_sub_group();
sum = sycl::reduce_over_group(sg, x, sycl::plus<>());
----


== Implementation notes

This non-normative section provides information about one possible
Expand Down

0 comments on commit a5d62c8

Please sign in to comment.