proposal: arbitrary-radix integer literals #28256

Closed
griesemer opened this issue Oct 17, 2018 · 42 comments
Labels
FrozenDueToAge Proposal v2 An incompatible library change
Milestone

Comments

@griesemer
Contributor

I've brought up this idea several times before informally. I'm filing this issue now for the formal documentation trail.

Currently, Go permits octal, decimal, and hexadecimal integer literals. There's a pending proposal for binary integer literals (#19308) which has wide support.

Proposal:

This is a fully backward-compatible proposal for arbitrary-radix integer literals. We change the integer literal syntax to the following:

int_lit = decimal_lit | octal_lit | radix_lit .
decimal_lit = ( "1" … "9" ) { decimal_digit } .
octal_lit = "0" { octal_digit } .
radix_lit = radix ( "x" | "X" ) radix_digit { radix_digit } .
radix = decimal_lit .

with

radix_digit = "0" … "9" | "A" … "Z" | "a" … "z" .

representing the digit values 0 to 35 (for a maximum radix of 36). The radix must be a decimal literal between 0 and 36, expressing the radix; with the radix value 0 having the same meaning as 16, and the value 1 being invalid.

Examples:

0x10   // same as 16x10 or 16
2x1001 // binary integer literal, same as 9
3x010  // ternary integer literal, same as 3
8x066  // octal integer literal, same as octal 066 or 54
36xz   // integer literal in base 36, value is 35
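The meaning of the examples above can be checked with today's standard library. The following is a minimal sketch, assuming a hypothetical helper named parseRadixLit (it is not part of any Go package) that accepts only the radix_lit form and applies the rules stated in the proposal: radix 0 means 16, radix 1 is invalid, and the maximum radix is 36.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseRadixLit is a hypothetical helper, not part of any Go package. It
// evaluates the radix_lit form of the proposal: radix, then 'x' or 'X',
// then one or more digits.
func parseRadixLit(lit string) (int64, error) {
	// The radix prefix contains only decimal digits, so the first 'x' or 'X'
	// is always the separator, even when 'x' is also a digit (radix >= 34).
	i := strings.IndexAny(lit, "xX")
	if i <= 0 || i == len(lit)-1 {
		return 0, fmt.Errorf("%q is not of the form radix 'x' digits", lit)
	}
	prefix, digits := lit[:i], lit[i+1:]
	for _, c := range prefix {
		if c < '0' || c > '9' {
			return 0, fmt.Errorf("%q: radix must be written in decimal", lit)
		}
	}
	radix, err := strconv.Atoi(prefix)
	if err != nil {
		return 0, err
	}
	switch {
	case radix == 0:
		radix = 16 // the proposal keeps 0x as a synonym for 16x
	case radix == 1 || radix > 36:
		return 0, fmt.Errorf("%q: invalid radix %d", lit, radix)
	}
	if digits[0] == '+' || digits[0] == '-' {
		return 0, fmt.Errorf("%q: a sign is not part of a literal", lit)
	}
	// strconv.ParseInt uses the same digit values: 0-9, then a-z or A-Z for 10-35.
	return strconv.ParseInt(digits, radix, 64)
}

func main() {
	for _, lit := range []string{"0x10", "16x10", "2x1001", "3x010", "8x066", "36xz"} {
		v, err := parseRadixLit(lit)
		fmt.Println(lit, v, err)
	}
}
```

The helper deliberately ignores plain decimal and leading-zero octal literals, which the proposal leaves unchanged.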

Discussion:

The beauty of this approach is that it permits arbitrary radix notation, thus removing any future need to expand this again; it removes the need for the extra notation for hexadecimal numbers because they are just part of this notation; and at the same time it's fully backward-compatible. The commonly accepted notation for binary integer literals and the respective notation here have the same length, and the proposed notation here seems just as intuitive (e.g., 0b1001100 == 2x1001100).

We could go a step further and remove octal literals from the language since they are also easily expressed with this notation, but that's a step that would not be backward-compatible. One way to make that happen w/o introducing bugs would be to disallow non-zero decimal numbers that start with a 0; octal numbers in existing code would then lead to a compiler error and could be fixed. It would also be trivial to have them fixed automatically with a simple tool. Finally, removing octals would eliminate another (albeit mostly academic) issue with them; see #28253. If octals were not supported anymore, one could condense the integer literal syntax to:

int_lit = decimal_digit { decimal_digit } [ ( "x" | "X" ) radix_digit { radix_digit } ] .

Implementation:

The implementation is straightforward. It would likely slightly simplify some of the scanning code for numeric literals because with this proposal all such literals always start with a decimal_lit. If that value is zero, or between 2 and 36, a subsequent 'x' indicates the actual literal value in that radix. The respective number conversion routines are trivial and would need minimal adjustments.

Impact:

Hard to say. It may be sufficient to just add another notation for binary integer literals per #19308. Or we could do this and lay the issue to rest for good.

@cespare
Contributor

cespare commented Oct 17, 2018

In Go, I have never wanted to write an integer literal with radix other than 2, 8, 10, or 16. I have also never read code that would have used such literals, had they existed. Therefore, the benefit seems extremely low.

The fact that the existing hexadecimal syntax doesn't fit directly into the proposed syntax but requires a special case of 0 ≡ 16 significantly detracts from the appeal.

@dr2chase
Contributor

I like the idea of removing the leading-zero octal notation.
That's a source of annoying errors, and removing it would simplify explaining the language to new users ("don't do this, you'll be surprised" vs not mentioning alternate base notation till it is needed).

@griesemer
Contributor Author

griesemer commented Oct 17, 2018

@cespare I would have formulated your 2nd paragraph slightly differently:

The fact that the existing hexadecimal syntax neatly fits directly into the proposed syntax significantly adds to the appeal.

:-)

@beoran

beoran commented Oct 18, 2018

While I see the appeal of having a consistent syntax, I fear this would become a very obscure feature. I never felt the need for anything else but binary, octal, decimal and hexadecimal integer constants. Binary integer literals are useful in many cases involving bit twiddling, octal is useful for file permissions, hexadecimal is useful for compact notation of bytes. But trinary or twenty-one-ary seems to be useful for obfuscation only.

I do like the idea of changing the notation for octals; right now it's still the confusing C notation. And I do like the uniform notation you propose. I would just disallow anything other than base 2, 8, 10 and 16 to avoid such obfuscation.

Otherwise, could you please show us a few production open source code bases where the use of such arbitrary radix integer constants would have been beneficial?

@griesemer
Contributor Author

I'd be ok with the restriction to 2, 8, 10, and 16, but why? It would make things (a tiny bit) more complicated; the only reason I'd see is that it might perhaps eliminate errors (somebody might write 9x066 rather than 8x066 for a file permission).

I agree that most programmers may not care much about the flexibility here, they'll be just fine that they can write down numbers in all the commonly used radixes (2, 8, 10, 16) w/o extra cost (one extra char for octal) and use a single, uniform notation.

Personally, I think that not having arbitrary radix notation is what prevents us from thinking it might be useful. Now usefulness alone is not a criterion for adding something to the language, but in this case it would address the desire for a binary notation, simplify what we already have, and remove restrictions. Seems like a win-win to me. Keep in mind that there's really strong support for adding binary integer literals, so no matter what, we'd have to make changes in all the same places. The difference is just whether we add one more special case, or whether we simplify all the code in favor of a uniform notation.

Finally, there's also the educational aspect of Go: Having a simple, uniform mechanism here rather than an agglomeration of historical notations seems like a nice cleanup.

Btw., Smalltalk supports arbitrary radix notation, too, using the same syntax but with an 'r' instead of an 'x'. Using the 'x' permits the most common other base notation to fit neatly into the system.

@randall77
Contributor

I'd be ok with the restriction to 2, 8, 10, and 16, but why?

Because that's 32 = 36-4 fewer bases you need to understand when reading code.

23xag56m? It gets very confusing very quickly. I think I'd rather see (((10*23 + 16)*23 + 5)*23 + 6)*23 + 22 or something (an exponent operator would help here).
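For reference, the manual expansion of the base-23 digits a, g, 5, 6, m can be checked against strconv.ParseInt, which already accepts bases 2 through 36. This is only a sketch; the horner helper is a name made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// horner evaluates digit values, most significant first, in the given radix.
// It is a hypothetical helper written for this example.
func horner(radix int64, digits []int64) int64 {
	var v int64
	for _, d := range digits {
		v = v*radix + d
	}
	return v
}

func main() {
	// Digit values of "ag56m" in base 23: a=10, g=16, 5, 6, m=22.
	manual := horner(23, []int64{10, 16, 5, 6, 22})

	parsed, err := strconv.ParseInt("ag56m", 23, 64)
	if err != nil {
		panic(err)
	}
	fmt.Println(manual, parsed) // 2995887 2995887
}
```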

Hexadecimal is certainly useful. Binary and octal seem marginally useful. Other bases just don't seem useful at all. Certainly their value isn't worth burdening the reader with them.

@beoran

beoran commented Oct 18, 2018 via email

@cespare
Contributor

cespare commented Oct 18, 2018

I'd be ok with the restriction to 2, 8, 10, and 16, but why?

I don't think we should use this proposed syntax with such a restriction. I think that, if anything, we should just add the 0b syntax for binary literals and be done with it (then Go will have all of base 2, 8, 10, and 16 literals).

a single, uniform notation

I don't agree that this proposal is uniform; it introduces more ways of writing the same integer literals:

  • As you mention in your proposal, the existing octal syntax doesn't match, so there will be two different ways of writing octal integers unless we take the further, backward-incompatible step of removing the current octal syntax.
  • The current way of writing hex integers doesn't exactly fit into the scheme, so the proposal includes a special case for 0 to have the same meaning as 16. There will forever be two ways of writing hex integers: 0x2a and 16x2a.

@griesemer
Contributor Author

griesemer commented Oct 19, 2018

@beoran I don't know of a Smalltalk playground offhand (which doesn't require installation), but there is of course Squeak (https://en.wikipedia.org/wiki/Squeak). For documentation see the famous "Blue Book", http://stephane.ducasse.free.fr/FreeBooks/BlueBook/Bluebook.pdf, literals with radixes are described on page 19. And the examples there are limited to radix 8 and 16.

Again, I have no strong feelings regarding restricting a radix to 2, 8, 10, 16, but I also don't think it matters much - people won't use crazy radixes for no good reason. (I suspect it's the small radixes that are interesting. For instance, I can see how I'd use a small-n (3, 5, etc.) radix to encode multiple values of n states in a single int, e.g. for some state on a game board.)
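The game-board idea can be sketched with ordinary arithmetic. The example below assumes a hypothetical three-state cell (empty, X, O) and packs a row of cells into one int as base-3 digits; the names encode and decode are illustrative only.

```go
package main

import "fmt"

// Cell states for a hypothetical three-state board.
const (
	empty   = 0
	playerX = 1
	playerO = 2
)

// encode packs cells, first cell most significant, into a single base-3 integer.
func encode(cells []int) int {
	v := 0
	for _, c := range cells {
		v = v*3 + c
	}
	return v
}

// decode unpacks n cells from v, reversing encode.
func decode(v, n int) []int {
	cells := make([]int, n)
	for i := n - 1; i >= 0; i-- {
		cells[i] = v % 3
		v /= 3
	}
	return cells
}

func main() {
	row := []int{playerX, empty, playerO}
	v := encode(row) // would read 3x102 in the proposed notation
	fmt.Println(v, decode(v, 3)) // 11 [1 0 2]
}
```

With a base-3 literal the digits of the constant would map one-to-one onto cells, which is the readability argument being made.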

In summary, it really doesn't matter all that much; what people seem to want is binary integer literals, and there's a specific proposal for that. It happens to do what all other languages do (which is good) but it also happens to introduce yet another notation. I've submitted this proposal because I think it's a viable alternative. Especially if we're considering removing/improving the octal notation (which would be a Go 2 item) we'd have to have some replacement. This proposal would resolve all those issues in one fell swoop. Personally, I think this is a more elegant approach for the whole problem of different radix integers, but I'm biased, of course.

I think the decisions that need to be made are:

  1. Do we want a binary integer literal notation? If no, both proposals are moot.
  2. If we have a yes for 1): Do we want to just add the 0b... notation, or alternatively do this proposal (with restrictions to 2, 8, 10, 16; or even just 2, 8, 16).

I think the decision for 2) should take into account:

  1. Do we want to do anything about octals? If no, both proposals are roughly equivalent. If yes, I believe this proposal is stronger as it will address octals uniformly.

@griesemer
Contributor Author

@cespare Not to be facetious, but with the 0b notation there will also forever be two ways of writing a "hex" number: 0x2a and 0b00101010 . I'd see that as much bigger problem - there will be plenty of people arguing that one is better than the other. Realistically, with the radix notation, people will stick to the shorter 0x notation rather than 16x (but either way, the actual hex number looks the same).

What you are saying really was one of the reasons for not including 0b from day one: There's already a suitable notation, namely 0x.

@josharian
Contributor

For instance, I can see how I'd use a small-n (3, 5, etc.) radix to encode multiple values of n states in a single int, e.g. for some state on a game board.)

There is also the suggestion to support intN for all N from @jimmyfrasche:

another way to handle this would be to create a new class of parameterized integer types. This is bad syntax, but, for discussion, let's say it's I%N where I is an integer type and N is an integer constant. All arithmetic with a value of this type is implicitly mod N.

And several real world uses immediately occurred to me:

When working on a RISC-V port, I wanted a uint12 type, since my instruction encoding components are 12 bits; that could have been uint % (1<<12). Lots of bit-manipulation, particularly protocols, could benefit from this.

I can see game states similarly benefitting from intN.

In contrast, I can't think of any real world use cases for arbitrary radix constants. Just another data point.

@beoran

beoran commented Oct 19, 2018

To answer your questions, I think, 1. yes we need binary constants because they are useful for bit masks and other bit twiddling. And 3. Dropping C style octals and replacing them is a good idea, because C style octals are a source of beginner bugs. Though I would probably go for 0o765 notation, although seeing the Smalltalk precedent 08x765 would also be ok.
As for 2. Actually I don't care too much either way about the notation, as long as we limit it to bases 2, 8, 16 and maybe 10.

@RalphCorderoy

The choice of x is poor because it reads as multiply in 4x23 and I have to correct that, interrupting flow. We've trained ourselves to know 0x is hex, partially by knowing multiplying by zero is pointless. Using x feels like a wheeze just to remove 0x as one of the special cases. gri points out Smalltalk uses r; Ada uses #, as in radix#digits#, but it seems a shame to waste that unused character for this.

The referenced issue calling for binary literals has quite a lot of voices saying they're not needed. I won't paste in arguments from there to here, but it's not clear to me that the case for needing them has been made.

More literal bases is a step towards Perl's There's more than one way to do it and away from Go's gofmt. It widens an argument over how some literals should be written and we'll see calls for gofmt, vet, and lint to stand in judgement. Some programmers have never mastered hex, see that other issue, and will want 2x11111111110 because 0xffe is unreadable.

Yes, that was a deliberate mistake to show few readers will want to count a run of the same digit so then there will be calls for underscores, a la Ada, as separators, with arguments over where to separate. It doesn't matter those programmers won't be coding on your project; you and I will still have to read their project.

It's a shame octal nabbed 0755 instead of 0o755, no capital O allowed, but other than that things seem fine as they are. And deprecation of 0755 for a new octal format can be done, as gri outlined, without adding base 2 or base 2-36.

@as
Contributor

as commented Oct 21, 2018

Some programmers have never mastered hex, see that other issue, and will want 2x11111111110 because 0xffe is unreadable.

It's hard to believe there are programmers that can mentally reason about 64 digit binary literals but not hex. The digits are not zero padded so to even determine what bit is set, you need to determine the number of digits in the number. Easy with base16, but are there really any examples of binary integer literals serving a useful purpose other than tables of constants rendered in a monospaced font that are rigorously whitespace-aligned or zero-padded? gofmt is not going to move these numbers to the right either. Small values will be difficult to see clearly. I suppose that could be solved by using 2b01 and 2b10 though.

@creker

creker commented Oct 21, 2018

To make binary literals more readable some languages also allow the use of separators. For example, in C# you can write something like this
0b0010_0110_0000_0011

In fact, C# allows underscore to be used in any numeric literal, not only in binary ones.

In my opinion, even for 64-bit literals binary representation would be much more readable if you need very specific bits to be set. Hex values always require a bit more thinking and conversion in your head even if you know hex perfectly well. It's simple for one byte values but gets harder as you go further (the argument about counting digits applies here even for hex) and add values with multiple bits set where simple pattern of 1, 2, 4, 8, 10... no longer holds and you have to convert to binary in your head or just use a calculator.

@RalphCorderoy

Sorry, I didn't mean to drag this back to a rerun of #19308, but to point out that widening the choice of ways to do something, write 0xffe, ripples out into formatting and tools.

What demand there is for base 2-36 could be lessened by two things touched on in earlier comments. Keith gave an example for it being easier to read the manual multiplication and addition for a base 23 number. Syntax for array multiplication, AKA Hadamard product, perhaps introduced for vector instructions, would give an alternative. As he said, an exponent operator would help.

base23 = [279841, 12167, 529, 23, 1]
ag56m = base23 * [10, 16, 5, 6, 22]

That would also allow for mixed-radix numbers; units of time being a common example.

Josh referred to intN for all N, e.g. int12. That might be too general, and uintN for 1 ≤ N ≤ 32 good enough for most cases. Verilog has something similar and combined with a bit-catenate operator allows x<<18 | y<<5 | z to be x :: y :: z, with :: picked at random. This removes the overlap error where x should have been shifted by 19 to avoid y, and means typed constants with many bits overlaying fields of varying widths can be written more easily as their parts: uint3(2) :: uint13(0x1f0a) :: uint5(0x0f). This is handy when the fields don't fall on nibble boundaries; compare 0xbe14f.
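The bit-catenate idea can be approximated in today's Go with a small helper. The sketch below is hypothetical: field and concat are invented names that stand in for the uintN(v) values and the :: operator from the comment, and they check at run time what a typed uintN would check at compile time.

```go
package main

import "fmt"

// field is a hypothetical value/width pair standing in for uintN(v).
type field struct {
	value uint32
	width uint
}

// concat joins fields, most significant first, like the "::" operator
// described above. It rejects values that do not fit their declared width,
// which is the overlap error the comment wants to rule out.
func concat(fields ...field) (uint32, error) {
	var out uint32
	var total uint
	for _, f := range fields {
		if f.width == 0 || f.width > 32 || uint64(f.value)>>f.width != 0 {
			return 0, fmt.Errorf("value %#x does not fit in %d bits", f.value, f.width)
		}
		total += f.width
		if total > 32 {
			return 0, fmt.Errorf("total width %d exceeds 32 bits", total)
		}
		out = out<<f.width | f.value
	}
	return out, nil
}

func main() {
	v, err := concat(field{2, 3}, field{0x1f0a, 13}, field{0x0f, 5})
	if err != nil {
		panic(err)
	}
	// The same value written with explicit shifts.
	manual := uint32(2<<18 | 0x1f0a<<5 | 0x0f)
	fmt.Printf("%#x %#x\n", v, manual) // 0xbe14f 0xbe14f
}
```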

I'm not strongly arguing for either of these, just pointing out that if there is any movement towards them then they overlap with the need for a base 2-36 notation.

@griesemer griesemer added the v2 An incompatible library change label Oct 24, 2018
@dgryski
Contributor

dgryski commented Oct 24, 2018

In fact, C# allows underscore to be used in any numeric literal, not only in binary ones.

Also Perl.

$ perl -E 'say 123_456_789;'
123456789

@themeeman

themeeman commented Oct 26, 2018

In fact, C# allows underscore to be used in any numeric literal, not only in binary ones.

Also Perl.

$ perl -E 'say 123_456_789;'
123456789

Also python.

>>> print(123_456_789)
123456789

@hooluupog

Java,

jshell> System.out.println(123_456_789)
123456789

@RalphCorderoy

I expect there's quite a few languages that permit underscore in some numeric literals. Ada was just the first I encountered. Like ditching 0751 as the octal syntax, these underscores would seem to be orthogonal to whether base 2-36 is required. They can be an aid to readability on long literals, but also allow more formatting choice by the author, and disagreement with everyone else. (Perl accepts 3._1__4_ for 3.14, though warns if warnings are explicitly requested.)

It's tempting to dictate the allowable formats, e.g. integers must either have no underscores, or have them every three digits from the right: 2_718_281. Hex could be split on nibble boundaries, etc. But that rules out splitting based on the field boundaries underlying the literal, e.g. a 12-bit nibble-aligned field.

@alanfo

alanfo commented Oct 27, 2018

Although I'd normally welcome improvements to the numeric aspects of the language, I'm finding it very hard to get enthused about this proposal.

The demand just doesn't seem to be there for bases other than 2, 8, 10 and 16 and, even if it were, I don't think the change could be made in isolation.

People would then be asking for a simple way to print these numbers out. Currently the formatted print functions in the standard library support only the standard bases with their %b, %o, %d and %x verbs so new verbs would need to be added to print out values for arbitrary bases. In other words what are already very complicated functions would become even more so.

Nor do I like the proposed syntax. The use of the letter 'x' as a divider seems inappropriate as the other radixes have nothing to do with hex and for the highest radixes it's even a digit itself. I also dislike the discontinuity for hex itself when 16 suddenly becomes 0.

It's worth remembering that we already have support for radixes from 2 to 36 in the strconv package with the FormatInt and ParseInt functions. Although string based and hence relatively inefficient, I'd have thought this should be enough for anyone who wants to play around with different radixes for educational purposes.
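A short illustration of the existing library support mentioned here: the fmt verbs cover the standard bases, while strconv.FormatInt and strconv.ParseInt accept any base from 2 to 36. The roundTrip helper is just a name chosen for this sketch.

```go
package main

import (
	"fmt"
	"strconv"
)

// roundTrip formats n in the given base (2 to 36) and parses the result back.
func roundTrip(n int64, base int) (string, int64, error) {
	s := strconv.FormatInt(n, base)
	back, err := strconv.ParseInt(s, base, 64)
	return s, back, err
}

func main() {
	const n = 54

	// The fmt verbs cover only the standard bases.
	fmt.Printf("%b %o %d %x\n", n, n, n, n) // 110110 66 54 36

	// strconv covers every base from 2 to 36.
	for _, base := range []int{2, 3, 8, 16, 36} {
		s, back, err := roundTrip(n, base)
		fmt.Println(base, s, back, err)
	}
}
```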

Although on balance I'd support it, I'm not even sure that adding binary literals (with a 0b prefix) is such a great idea unless a digit separator (such as _) is introduced at the same time. The reality is that once you get past one or two bytes, binary literals become unreadable.

As for octal, if one surveys the current state of C family languages, the traditional ones (C, C++, Java) all use the leading zero notation and the newer ones (Swift, Rust) use an 0o prefix.

It seems to me that compatibility with the former is much more important for Go and that the leading zero notation should therefore be retained. As no one appears to be seriously complaining about this, it's just not worth the hassle of changing it.

Having said that, if binary literals are introduced, then for the sake of consistency I wouldn't necessarily be against adding an alternative 0o prefix for octal with people being advised to prefer that unless they were using cgo.

@creker

creker commented Oct 27, 2018

I don't think Go needs compatibility with any language, especially C/C++. Go is already so different from the C family of languages that there's no point in clinging to them. If we're going to look elsewhere we should really look at what modern languages are doing, not the ancient ones that are riddled with questionable design decisions and years of backwards compatibility. If we were to add a 0b prefix it would be really preferable to also change 0 to 0o just for the sake of consistency. gofixing it would be really easy.

@alanfo

alanfo commented Oct 27, 2018

I'm not denying that the leading zero syntax for octal was a questionable design decision for C in the first place. It's more a question of what people expect and anyone coming to Go from the traditional languages is going to expect it to deal with octal literals in the same way.

Also it's not just a matter of go fix changing 0 to 0o. You'd also need to change the language to prevent non-zero integer literals from beginning with 0 at all, otherwise the change could potentially be very confusing.

@griesemer
Contributor Author

Given the feedback so far, I am going to narrow the proposal as follows:

  1. We change integer literals to

int_lit = decimal_digit { decimal_digit } [ ( "x" | "X" ) radix_digit { radix_digit } ] .

but only permit 0, 2, and 8 as radix prefix; i.e., integer literals are either decimal, hexadecimal (0x), binary (2x), or octal (8x), and each radix_digit must be within the range 0...radix-1. (If we wanted to add radix 10 and 16 for regularity, that would be fine, too.)

  2. We disallow octal integer literals in the current form (that is, we disallow non-zero integer literals starting with 0) but have gofmt accept them and automatically rewrite them into the 8x form. For instance, an octal 0677 might be rewritten to 8x677 (or perhaps 8x0677).

This fixes potential confusion with octals, and the stream-lined notation can be extended trivially (and in the obvious manner) should there ever be a need for another radix.
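The automatic rewrite of legacy octals could look roughly like the following. This is a string-level sketch only, with a made-up function name; an actual gofmt change would operate on scanned tokens rather than raw strings.

```go
package main

import "fmt"

// rewriteLegacyOctal is a hypothetical sketch of the rewrite described above.
// It turns a leading-zero octal literal such as 0677 into the 8x form and
// reports whether a rewrite was applied.
func rewriteLegacyOctal(lit string) (string, bool) {
	if len(lit) < 2 || lit[0] != '0' {
		return lit, false // "0" itself and decimal literals stay as they are
	}
	for i := 1; i < len(lit); i++ {
		if lit[i] < '0' || lit[i] > '7' {
			return lit, false // e.g. 0x1f, or an invalid literal such as 089
		}
	}
	return "8x" + lit[1:], true
}

func main() {
	for _, lit := range []string{"0677", "0", "677", "0x1f"} {
		out, changed := rewriteLegacyOctal(lit)
		fmt.Println(lit, out, changed)
	}
}
```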

The analogous alternative to this reduced proposal would be #19308 (binary integer literals) modified/extended such that we remove octals in the current form and add the "0o" prefix for octals instead. This alternative would be less regular in notation and extending it would require inventing a new prefix, but otherwise would be about the same.

Independently of this, one might consider #28493 for improving readability of long literals.

@alanfo

alanfo commented Oct 31, 2018

I'm still not keen on the 2x and 8x prefixes and would much prefer your alternative notation of 0b and (if we must change octal) 0o.

That would be consistent with the verbs in the formatted print statements and also with what Swift and Rust do.

If we are to have binary literals then, in the interests of readability, I think #28493 is a necessity and it would also help with other long numbers.

@RalphCorderoy

Hi @griesemer, I realise from your opening Discussion that reusing the x is to lessen the change, but 8x32 just looks wrong because the x unconsciously reads as times. If hex's syntax had been 0h instead of 0x then I guess the proposal would be 8h32. That reads no better and it's because x and h are both mnemonic for hex and trying to contort them to another purpose goes against that long-learned language-agnostic connection.

0b and 0o have fans because they continue this mnemonic use of the letter. If you want a syntax open to future radixes then adopting a new letter avoids thwarting what's already learnt, e.g. r for radix in 8r32. Rewriting octal is already being considered, partially to avoid the beginner error of leading zeroes on base10. If that just leaves 0x as an oddity, given a new 0r syntax, then, 16r0fc0 is at least consistent, but it's noisy compared to the leaner 0x0fc0 that we all love, and parse without thinking. :-)

@alanfo

alanfo commented Oct 31, 2018

That's a good point about x reading like a multiplication symbol.

I find it difficult to imagine any base outside 2, 8, 10 or 16 becoming popular in the future but, if one did, then other languages would also be under pressure to support it. Perhaps a consensus might then emerge on the best notation to use which Go could follow rather than coming up with its own.

@haiitch

haiitch commented Oct 31, 2018

Bases other than 2, 8, 10, and 16 are extensively used for example in the handling of Bitcoin, Ethereum, and IPFS (all of which have existing implementations in Go). Whilst it's true that all these projects exist and thrive without having base 32 and base 58 literals available, there is no good reason why programmers who frequently use that base should make their code less readable or less expressive.

I think Robert's proposal is perfect, he doesn't seem to have overlooked anything.

If I were forced to complain about anything, it would be that I'd like to see this feature support up to base 58 for the reasons stated above, but I reckon that may be a little too much to ask, because there are various different base encodings for bases above 36 (for example, the alphabet for Bitcoin's base 58 encoding is crafted to remove ambiguity in numbers as read by humans; that's the reason there are no Bitcoin base 58 addresses containing the character l (lowercase L), to avoid confusion with the number 1). So that's what makes me think the base 36 upper bound is good enough: it corrects the glaring omission of base 2, it's a consistent syntax for any integer literal, it promotes readability, and it makes it easy to learn one single rule for all bases.

It's as perfect a solution as you can get. Good work, @griesemer.

@alanfo

alanfo commented Oct 31, 2018

@htrob ISTM that you're really arguing here for a base32 encoding to be added.

base58 would be out of the question because, unless we distinguish between upper and lower case letters (which wouldn't fit in with hex), we simply don't have enough potential digits and, even if we did, some of them are omitted by base58 as you've pointed out yourself.

With the exception of base32hex, base32 also suffers from having several different alphabets which are not consistent with the original proposal.

My view is that it's best to process them as strings or byte slices as we do now.

@dr2chase
Contributor

dr2chase commented Oct 31, 2018

Or perhaps, for the more exotic bases, better compile-time evaluation of pure functions applied to constant strings. E.g.

x := EthB58("WQERDSFDEdjhjjdk11234567")

That by itself wouldn't make the result eligible for use as a constant, however.

@haiitch

haiitch commented Oct 31, 2018

@alanfo This is the part where I said "but I reckon that may be a little too much to ask", which you may have missed.
I am happy already if base2 to base36 are supported as per @griesemer 's design.
I also said I think @griesemer's design is perfect, I'm not sure what is your concern about that, maybe you can illustrate how you think it can be made better.

@alanfo

alanfo commented Oct 31, 2018

@htrob Well, I detailed my concerns about the original proposal at some length in my first post to this thread. But, as @griesemer has since narrowed it to only allow 0, 2 and 8 as radix prefixes, there's not much point in going over the same ground again.

The question now is whether 2x or 8x should be preferred to the more familiar 0b and 0o which he offered as an alternative. As I don't like the use of x for various reasons, I'm firmly in the latter camp.

@haiitch

haiitch commented Nov 3, 2018

It's always been clear what's your position, and I still honestly believe it's wrong. I already explained exactly why I think it's more practical, readable, and useful to the new Go programmer going forward to accept @griesemer's proposal, which I believe made a far more solid argument than "I don't like".
No amount of "I don't like" is likely to convince me that I shouldn't state what I believe is a better way. It's agree to disagree territory I guess, so... yes, I agree there's no point in going over matters of your personal preference. Cheers.

@cznic
Contributor

cznic commented Nov 3, 2018

The status quo in integer literals is IMO more than sufficient for what's needed. Even plain decimal alone would be perfectly adequate; just use a comment.

const LaunchMask = 141836999991328 // 1000 0001 0000 0000 0000 0000 0000 0000 0010 0000 0010 0000

But I don't want to see such monstrosities, as the comment is, in source code. As a comment it's just fine.
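As a sanity check on the example above, the decimal constant and the bit pattern in its comment can be compared mechanically. The sketch below lists the set bits of the value; setBits is a helper name invented for this illustration.

```go
package main

import "fmt"

const LaunchMask = 141836999991328

// setBits returns the positions of the 1 bits in v, highest first.
func setBits(v uint64) []int {
	var pos []int
	for i := 63; i >= 0; i-- {
		if v&(1<<uint(i)) != 0 {
			pos = append(pos, i)
		}
	}
	return pos
}

func main() {
	fmt.Println(setBits(LaunchMask)) // [47 40 13 5]

	// The same mask assembled from the bit positions shown in the comment.
	fmt.Println(uint64(LaunchMask) == (1<<47 | 1<<40 | 1<<13 | 1<<5)) // true
}
```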

@creker

creker commented Nov 3, 2018

@cznic with comment you introduced an even bigger problem that's common to comments in general - they could be out of date or plain wrong. Single bit error is enough to throw people off that will inevitably rely on these comments. And no amount of testing would catch that. Even code review may not always catch when at some point someone decides to format it like so

const LaunchMask = 141836999991328 // 10000001 00000000 00000000 00000000 0010000 00100000

Good luck catching an error. At least with binary literals you can write tests.

@cznic
Contributor

cznic commented Nov 3, 2018

At least with binary literals you can write tests.

Tests can be written without them as well.

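import (
        "strconv"
        "strings"
        "testing"
)

func TestFoo(t *testing.T) {
        // Strip the spaces used for readability, then parse the digits as base 2.
        n, err := strconv.ParseUint(strings.Replace("1000 0001 0000 0000 0000 0000 0000 0000 0010 0000 0010 0000", " ", "", -1), 2, 64)
        if err != nil {
                t.Fatal(err)
        }

        // The expected value must be a uint64 to be comparable with n.
        if g, e := n, uint64(141836999991328); g != e {
                t.Fatal(g, e)
        }
}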

Also, I have yet to see a test that tests for the equality of a constant against a literal value. I think vet would not be happy about that.

Can we estimate the share of programs that would ever use, per this proposal, something like an int literal 2x100000010000000000000000000000000010000000100000?

My guess is it might be well under one per mille, and that's another reason why I'm not in favor of the proposal.

@creker

creker commented Nov 3, 2018

@cznic I fail to see the relevance of this test to my argument. My point is, your comment might be wrong. People will see it, rely on it and report bugs or simply waste time until discovering that the comment was wrong and they need to manually check the bits in the calculator. One can argue that no comment at all would be better.

I'm not talking about testing the exact value of a constant. Its value could mean some feature flags that you could pass to your function during tests. Very common for libraries to have constants with default flags set. With wrong comment tests would be green. With an error in a binary literal tests could immediately catch it.

Can we estimate the share of programs that would ever use, per this proposal, something like an int literal

Binary literals are useful even for much smaller literals. Share of programs would be meaningless as it very much depends on the nature of a program. Binary network protocols, stuff that deals with hardware, emulators - they all could benefit from this proposal. But if we take some REST API service - it doesn't need binary or even hex literals.

@cznic
Contributor

cznic commented Nov 3, 2018

Binary literals are useful even for much smaller literals.

But those are IMO far more readable when written in hex.

@griesemer
Contributor Author

I am going to retract and close this proposal. With the reduction to 3 radixes at best (0x, 2x, 8x), it doesn't really bring enough "bang for the buck"; especially so if we keep the existing octal notation. Thanks to the initial supporters, but there doesn't seem to be enough community support for this idea at this stage of Go. If we are going to introduce binary integer literals, we should follow established practice in other languages and go with proposal #19308. If we want to introduce another octal notation, we may want to go with the 0o prefix (another more established convention).

Closing.

@beoran

beoran commented Nov 7, 2018 via email

@nathany
Contributor

nathany commented Nov 29, 2018

Will #19308 be re-opened? Not much more to say, but it seems odd to request feedback in a blog post and then link to a locked issue.

we may want to go with the 0o prefix

That may be more clear than present. It would also be a simple feature to sort out breaking language changes:

  • introduce the new syntax in 1.13
  • eventually remove the existing, slightly more confusing, syntax -- at least for new code

@ianlancetaylor
Member

I unlocked #19308.

@golang golang locked and limited conversation to collaborators Nov 30, 2019