Description
S05 defines <:Foo> as follows:
Unicode properties are indicated by use of pair notation in place of a normal rule name:
<:Letter>    # a letter
<:!Letter>   # a non-letter
Properties with arguments are passed as the argument to the pair:
<:East_Asian_Width<Narrow>>
<:!Blk<ASCII>>
The second form is unambiguous. The first, less so. Here's a quote from the Unicode database (in PropertyValueAliases.txt):
NOTE: Property value names are NOT unique across properties. For example:
AL means Arabic Letter for the Bidi_Class property, and
AL means Above_Left for the Canonical_Combining_Class property, and
AL means Alphabetic for the Line_Break property.
In addition, some property names may be the same as some property value names. For example:
sc means the Script property, and
Sc means the General_Category property value Currency_Symbol (Sc)
The combination of property value and property name is, however, unique.
Which raises the question of what <:AL> would mean, or <:Sc>. The one that actually tripped me up is <:space>, which can be either an alias for the White_Space (WSpace) property, per PropertyAliases.txt:
WSpace ; White_Space ; space
Or a property value name from the Line_Break (lb) property, per PropertyValueAliases.txt:
lb ; SP ; Space
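To make the overlap concrete, here is a minimal Python sketch that scans a few alias lines and reports every name with more than one meaning. It is not how ucd2c.pl works. The sample data is limited to the two lines quoted above plus the sc/Sc pair from the Unicode note, written out in the same field layout. The `normalize` helper is a simplified stand-in for Unicode loose matching, and all function names are made up for illustration.

```python
from collections import defaultdict

# Sample lines only. The first line of each block is quoted above; the sc/Sc
# lines are written out from the Unicode note and assumed to follow the same
# field layout as the real files.
PROPERTY_ALIASES = """\
WSpace ; White_Space ; space
sc ; Script
"""

PROPERTY_VALUE_ALIASES = """\
lb ; SP ; Space
gc ; Sc ; Currency_Symbol
"""


def normalize(name):
    # Simplified loose matching (an assumption): ignore case, '_', '-', spaces.
    return name.replace("_", "").replace("-", "").replace(" ", "").lower()


def parse(text):
    # Yield the ';'-separated fields of each non-empty, non-comment line.
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            yield [field.strip() for field in line.split(";")]


def find_collisions(prop_text, value_text):
    meanings = defaultdict(set)
    for fields in parse(prop_text):
        # Second field is the long property name when present.
        long_name = fields[1] if len(fields) > 1 else fields[0]
        for alias in fields:
            meanings[normalize(alias)].add(f"property {long_name}")
    for fields in parse(value_text):
        prop, *aliases = fields
        for alias in aliases:
            meanings[normalize(alias)].add(
                f"value {aliases[-1]} of property {prop}"
            )
    # Keep only the names that resolve to more than one thing.
    return {name: sorted(m) for name, m in meanings.items() if len(m) > 1}


if __name__ == "__main__":
    for name, options in sorted(find_collisions(
            PROPERTY_ALIASES, PROPERTY_VALUE_ALIASES).items()):
        print(name, "->", options)
```

Run against the full database files, a scan like this would list every bare name for which <:name> has more than one candidate meaning.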
The ambiguity is currently resolved by the order in which we insert entries into the lookup hash, which is determined by the order in which ucd2c.pl generates the C code, which in turn is randomized by Perl 5 hash order randomization. So you can get spectest failures, regenerate from the exact same Unicode database version and ucd2c.pl, and "get lucky" the next time around. I came upon this by getting "unlucky" when doing the Unicode 9 database version bump, but it's been a problem all along.
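The order dependence can be sketched in a few lines of Python. This is an illustration, not the actual ucd2c.pl or MoarVM logic. It assumes the lookup table lets a later entry overwrite an earlier one, and the "properties before property values" ranking is just one possible tie-break, chosen for the example. All names are hypothetical.

```python
import random

# The two competing meanings of "space" described above.
ENTRIES = [
    ("space", "property White_Space"),
    ("space", "value Space of Line_Break"),
]


def build_table(entries):
    # Assumed behaviour: a later entry silently overwrites an earlier one,
    # so the result depends on the order the entries arrive in.
    table = {}
    for name, meaning in entries:
        table[name] = meaning
    return table


def properties_first(meaning):
    # Hypothetical precedence rule: a property alias outranks a value alias.
    return 0 if meaning.startswith("property ") else 1


def build_table_deterministic(entries, rank):
    # Sort by (name, rank, meaning) and keep the first entry per name, so
    # the outcome no longer depends on the incoming order.
    table = {}
    for name, meaning in sorted(entries, key=lambda e: (e[0], rank(e[1]), e[1])):
        table.setdefault(name, meaning)
    return table


if __name__ == "__main__":
    shuffled = ENTRIES[:]
    random.shuffle(shuffled)
    print("order-dependent:", build_table(shuffled)["space"])
    print("deterministic:  ",
          build_table_deterministic(shuffled, properties_first)["space"])
```

Whatever rule is chosen, the point is that the winner should come from an explicit decision rather than from iteration order.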