-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
go-concourse: SaveConfig endpoint: improve error message given to the end user #797
go-concourse: SaveConfig endpoint: improve error message given to the end user #797
Conversation
@Pix4D/integration fishing for comments... |
I understand what this PR is about because we discussed about it, but for somebody else, the PR description is not enough. You could add an example of what the user sees before and after the PR, for example. |
It was already here but as a PR comment not the description.. I now updated also the PR comment. |
return false, false, []ConfigWarning{}, internal.UnexpectedResponseError{ | ||
StatusCode: response.StatusCode, | ||
Status: response.Status, | ||
Body: string(body), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this means it will try to convert the body to a string and as such assume the bytes in body to be a valid UTF-8 string. What if it's not the case? After all, this is not some application/json
so there is no reason it be valid UTF-8 either.
Or does this generate a string representation of any byte sequence using some kind of escaping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://go.dev/blog/strings#what-is-a-string
In Go, a string is in effect a read-only slice of bytes.
It’s important to state right up front that a string holds arbitrary bytes. It is not required to hold Unicode text, UTF-8 text, or any other predefined format. As far as the content of a string is concerned, it is exactly equivalent to a slice of bytes.
Here is a string literal (more about those soon) that uses the \xNN notation to define a string constant holding some peculiar byte values. (Of course, bytes range from hexadecimal values 00 through FF, inclusive.)
const sample = "\xbd\xb2\x3d\xbc\x20\xe2\x8c\x98"
Printing strings
Because some of the bytes in our sample string are not valid ASCII, not even valid UTF-8, printing the string directly will produce ugly output. The simple print statement
fmt.Println(sample)
produces this mess (whose exact appearance varies with the environment):
��=� ⌘
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no reason to be concerned, nor to discover what is the encoding. If is not UTF-8, no harm is done, and is by definition a bug on the other end of the communication link.
For the history, the inventor of UTF-8 is the same person who invented Go and that wrote the blog post cited above, Rob Pike.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is no chance this can panic string([]byte(30, 210, 0, 115, 171, 50, 85, 149, 235, 145))
? and that is because the input is always a slice of bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is impossible to panic due to conversion of a byte slice to a string.
if we had the possibility for i=1; string(i), that would be an issue?
that would be string([]byte{1})
, a string like another.
I strongly suggest to read the blog post above.
There’s more. The %q (quoted) verb will escape any non-printable byte sequences in a string so the output is unambiguous.
fmt.Printf("%q\n", sample)
This technique is handy when much of the string is intelligible as text but there are peculiarities to root out; it produces:
"\xbd\xb2=\xbc ⌘"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Thanks.
Just curious, does that mean that go has no native way to manipulate text? Or does the standard lib has at least a bunch of function to properly handle de/en-coding and glyphs count / manipulation operations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From that blog post again:
To summarize, here are the salient points:
Go source code is always UTF-8.
A string holds arbitrary bytes.
A string literal, absent byte-level escapes, always holds valid UTF-8 sequences.
Those sequences represent Unicode code points, called runes.
No guarantee is made in Go that characters in strings are normalized.
...
there’s really only one way that Go treats UTF-8 specially, and that is when using a for range loop on a string.
...
A for range loop, by contrast, decodes one UTF-8-encoded rune on each iteration.
const nihongo = "日本語"
for index, runeValue := range nihongo {
fmt.Printf("%#U starts at byte position %d\n", runeValue, index)
}
The output shows how each code point occupies multiple bytes:
U+65E5 '日' starts at byte position 0
U+672C '本' starts at byte position 3
U+8A9E '語' starts at byte position 6
[Exercise: Put an invalid UTF-8 byte sequence into the string. (How?) What happens to the iterations of the loop?]
See the Exercise! :-)
Go’s standard library provides strong support for interpreting UTF-8 text. If a for range loop isn’t sufficient for your purposes, chances are the facility you need is provided by a package in the library.
The most important such package is unicode/utf8, which contains helper routines to validate, disassemble, and reassemble UTF-8 strings.
Here is a program equivalent to the for range example above, but using the DecodeRuneInString function
const nihongo = "日本語"
for i, w := 0, 0; i < len(nihongo); i += w {
runeValue, width := utf8.DecodeRuneInString(nihongo[i:])
fmt.Printf("%#U starts at byte position %d\n", runeValue, i)
w = width
}
To answer the question posed at the beginning: Strings are built from bytes so indexing them yields bytes, not characters. A string might not even hold characters. In fact, the definition of “character” is ambiguous and it would be a mistake to try to resolve the ambiguity by defining that strings are made of characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary I remember by heart:
Strings are built from bytes so indexing them yields bytes, not characters. A string might not even hold characters. In fact, the definition of “character” is ambiguous and it would be a mistake to try to resolve the ambiguity by defining that strings are made of characters.
c707e28
to
424e986
Compare
424e986
to
d4f93a5
Compare
@marco-m-pix4d @odormond I had to add 1 more commit: 305ad9c because tests were failing... I think we should extend also the quick tests here: https://github.com/Pix4D/ci-for-concourse/blob/32e10b73505e24c830ac3bc2338256323dee0502/ci/tasks/scripts/quick-tests.sh#L12 to include them... |
Could you please explain? If a test was failing, it means it has been ran, so why do we need to include it ? |
Ups sorry, my bad, I should have been clearer... Its failing in: So my question is: should I extend our |
I am still confused. This means that we will run the same tests twice... |
Yes, but we already do that... Compare: There will be an overlap between these two jobs... one runs almost all of them (upstream-tests) and one only due to the code changes we did... |
Now I understand. This is a tough one 🤔 |
@@ -204,6 +205,10 @@ var _ = Describe("ATC Handler Configs", func() { | |||
|
|||
Expect(receivedConfig).To(Equal(expectedConfig)) | |||
|
|||
if contentType != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-m-pix4d In one comment you basically said to remove this check and just do:
w.Header().Set("Content-Type", "application/json")
this would require adding new separate tests for the case where content type is not set or set to something else...
I do understand your point and I agree it is clearer that way and more in line with how we test with go test
, but this would lead to a lot of duplication and would go against how the tests are written now... I mean just look how they overwrite the returnBody
or returnHeader
in each test... The approach I took here is more in line what they currently have.. by overwriting a single parameter in each test.
I just wanted to confirm before committing (I already did the changes) that this is what we want:
-
all existing test will test the path of the code when content type header is set to
application/json
-
remove 3 current tests with context
when response contains bad JSON
because we are not testing the JSON parser -
create 3 new tests to test the path when content type header is left empty or set to some other value different than
application/json
-
price to pay is mostly duplication and going against the current flow of how the these tests are written...
-
we gain more clarity by not putting to much stuff in a single test
I agree with you that this is the direction we should take, I just wanted to highlight the duplication part and going against the current flow...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your summary of my suggestion is correct.
As usual, it is a trade-off and also at least in part a question of taste.
I have only a high-level view of that file, I did not analyze in detail the amount of duplication. As we already discussed in the past, DRY for the sake of DRY can make things worse.
You have seen both approaches, you are in the best position to take a decision. As usual the criteria are:
- if a test case fails, will the next person understand better what is happening with the current approach, or with separate tests?
- if tomorrow somebody has to add another test case, will the addition be less error-prone, and in general easier, with the current approach or with separate tests?
- is the duplication causing or risking to cause any problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marco-m-pix4d done... please have a final pass before I merge...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. The length of the existing tests, and the nested Context
s, make my head spin :-/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aliculPix4D please try to squash some commits if possible.
c2486ba
to
d5a5639
Compare
…he end user Signed-off-by: aliculPix4D <aleksandar.licul@pix4d.com>
d5a5639
to
ac16027
Compare
go-concourse: configs: JSON parsing: improve error message given to the end user
Example:
BEFORE:
NOW: