-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: shutdown schema change on two permanent errors #14025
Conversation
Can this go in along with the change that consumes it? In general, I'm hesitant about plumbing changes which go in without their consumer because that makes it difficult (impossible) to comment on the high-level direction. Review status: 0 of 5 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending. pkg/sql/pgwire/pgerror/errors.proto, line 17 at r1 (raw file):
everything here is non-nullable, perhaps use proto3? then you can remove all the options. Comments from Reviewable |
Review status: 0 of 5 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending. pkg/sql/pgwire/pgerror/errors.go, line 36 at r1 (raw file):
why are we keeping both pkg/sql/pgwire/pgerror/errors.go, line 37 at r1 (raw file):
what does it mean for a struct to embed an interface?
Comments from Reviewable |
Yes i can do that. I'll send out an update Review status: 0 of 5 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. Comments from Reviewable |
Reviewed 1 of 5 files at r1. pkg/sql/pgwire/pgerror/errors.go, line 36 at r1 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I think the idea here is to continue to play well with pkg/sql/pgwire/pgerror/errors.go, line 37 at r1 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Kind of, but only because it follows the normal embedding rules. Embedding an interface is the same thing as adding an interface field to the struct and also promoting that interface's methods on the struct. pkg/sql/pgwire/pgerror/errors.go, line 81 at r1 (raw file):
I don't think this will work properly in all cases. Imagine a The original idea was to append new information (code, detail, hint, etc.) to the "error cause list" on each wrap/withXYZ method call, just like Passing around a single pkg/sql/pgwire/pgerror/errors_test.go, line 123 at r1 (raw file):
What's prompting the changes to these tests? Comments from Reviewable |
Review status: 1 of 5 files reviewed at latest revision, 5 unresolved discussions, all commit checks successful. pkg/sql/pgwire/pgerror/errors.go, line 81 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
When I first tried introducing this change I was very puzzled by what you had done and it took me sometime to figure out how this all worked. After I made my change and saw that very few tests broke, I figured that sticking with playing well with github.com/pkg/errors while nice is a bit of over-engineering. It makes a lot of sense if you need to stack a bunch of pgError on top of each other but that doesn't seem to be a requirement. I'd prefer that we reduce the complexity of this as done in this change and introduce the compatibility with github.com/pkg/errors if we need it later. This change needs more work. I'll be sending out an update later today Comments from Reviewable |
This PR has a different title because it fixes two bugs. The change to pgerror were only needed to fix one of the bugs. Review status: 0 of 13 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending. Comments from Reviewable |
I'm still not sure I see where the changes to Review status: 0 of 13 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed. pkg/sql/pgwire/pgerror/errors.go, line 81 at r1 (raw file): Previously, vivekmenezes wrote…
We could maintain the full compatibility at a slight increase to runtime complexity if we either:
I'd be very concerned with leaving this PR as is, because the addition of an intermediate Comments from Reviewable |
Review status: 0 of 13 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed. pkg/sql/pgwire/pgerror/errors.go, line 81 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Good point. Let me redo this change to preserve that compatibility. Sorry I didn't get back to this until now. Comments from Reviewable |
cdf3502
to
7b7f737
Compare
This change makes pgerror a lot simpler. It's now a Error protobuf that implements error Review status: 0 of 16 files reviewed at latest revision, 5 unresolved discussions. pkg/sql/pgwire/pgerror/errors.go, line 36 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This now is a pgerror.Error that implements the error interface pkg/sql/pgwire/pgerror/errors.proto, line 17 at r1 (raw file): Previously, tamird (Tamir Duberstein) wrote…
Done. pkg/sql/pgwire/pgerror/errors_test.go, line 123 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
fixed by completely overhauling the tests. Comments from Reviewable |
ignore the change to ConsumerSignal thanks to |
956ed25
to
5693c13
Compare
What's the status of this? Looks like CI failed. |
Looks like it's ready for another round of reviews. I'll give it another look by the end of the day. |
Do take a look at it although it is failing once in a while so I still need to take a look at it later tonight. |
Hey this needs more work, I'm still on it. I had to rebase which messed it up and I need to figure out what to do. |
937d93b
to
04bb74d
Compare
Folks this is ready for review. Thanks! |
4edebab
to
6408488
Compare
@andreimatei ready for another look. |
@andreimatei rebased and ready for review |
Review status: 0 of 24 files reviewed at latest revision, 34 unresolved discussions, all commit checks successful. pkg/sql/schema_changer_test.go, line 774 at r16 (raw file):
consider using Radu's new pkg/sql/schema_changer_test.go, line 775 at r16 (raw file):
s/This/this pkg/sql/schema_changer_test.go, line 807 at r16 (raw file):
also check the pg error code. so hott. pkg/sql/schema_changer_test.go, line 811 at r16 (raw file):
I'm not completely sure what we're testing here and it seems a bit fragile. Can we instead test that the new index doesn't exist (with a pkg/sql/schema_changer_test.go, line 408 at r17 (raw file):
explain that this also includes indexes As I was saying in the previous commit, perhaps there's better ways to test some of the things that this is used for. pkg/sql/schema_changer_test.go, line 815 at r18 (raw file):
I'm a bit confused about what we're doing here. I think we probably don't want to list every "permanent" schema change here, do we? I think the logic tests are a better place for it. pkg/sql/distsqlrun/base.go, line 262 at r11 (raw file): Previously, vivekmenezes wrote…
leave a TODO please pkg/sql/distsqlrun/base.go, line 631 at r11 (raw file): Previously, vivekmenezes wrote…
Well You need it here because you haven't gone all the way with this refactoring and didn't convert pkg/sql/distsqlrun/base.go, line 619 at r14 (raw file):
this should take the NodeId for OriginNode, no? pkg/sql/distsqlrun/base.go, line 636 at r14 (raw file):
add a pkg/sql/distsqlrun/data.proto, line 36 at r11 (raw file): Previously, vivekmenezes wrote…
as we discussed, it is related. Add a TODO please. pkg/sql/distsqlrun/data.proto, line 42 at r11 (raw file): Previously, vivekmenezes wrote…
I understand that the package qualifies it, but my preference would still be for the type names themselves to have more information (same for pkg/sql/distsqlrun/data.proto, line 37 at r14 (raw file):
"zero if does not apply" - how could it not apply? pkg/sql/distsqlrun/outbox_test.go, line 178 at r14 (raw file):
can we test the pkg/sql/distsqlrun/server.go, line 181 at r14 (raw file):
use Using pkg/sql/distsqlrun/server.go, line 183 at r14 (raw file):
don't we want to turn all errors into pg.Error with the "internal error" code? pkg/sql/pgwire/pgerror/errors.go, line 26 at r9 (raw file): Previously, vivekmenezes wrote…
I meant the proto message should be called Comments from Reviewable |
d3ce1e2
to
3843838
Compare
I tried adding a distsql test for this PR but unfortunately it is hanging. I'm keeping the schema change test in this PR as proof that it works until we can get #14770 working Review status: 0 of 26 files reviewed at latest revision, 34 unresolved discussions, some commit checks pending. pkg/sql/schema_changer_test.go, line 775 at r16 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/schema_changer_test.go, line 811 at r16 (raw file): Previously, andreimatei (Andrei Matei) wrote…
We're testing that the number of keys remains the same. pkg/sql/schema_changer_test.go, line 815 at r18 (raw file): Previously, andreimatei (Andrei Matei) wrote…
So I'm leaving this test in here because the distsql test I've created in a separate PR is not working. I'll update this PR with a reference to that PR. pkg/sql/distsqlrun/base.go, line 262 at r11 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/base.go, line 631 at r11 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I've rename GoError() to ErrorDetail() . For now I think that's sufficient. pkg/sql/distsqlrun/base.go, line 619 at r14 (raw file): Previously, andreimatei (Andrei Matei) wrote…
We decided to not add Origin Node id and other characteristics like processor id in this PR. pkg/sql/distsqlrun/base.go, line 636 at r14 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/data.proto, line 36 at r11 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/data.proto, line 42 at r11 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I'm going to ignore this suggestion. Thanks pkg/sql/distsqlrun/data.proto, line 37 at r14 (raw file): Previously, andreimatei (Andrei Matei) wrote…
removed. that doesn't make any sense pkg/sql/distsqlrun/server.go, line 181 at r14 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/server.go, line 183 at r14 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/pgwire/pgerror/errors.go, line 26 at r9 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I don't think that's worth it because externally it will look like pgerror.PGError Comments from Reviewable |
This PR has been lingering around for a while. I'd appreciate if we can merge it before I leave. Thanks! |
Review status: 0 of 26 files reviewed at latest revision, 21 unresolved discussions, some commit checks pending. pkg/sql/distsqlrun/base.go, line 619 at r14 (raw file): Previously, vivekmenezes wrote…
but then remove origin_node and authoritative from the proto pkg/sql/distsqlrun/base.go, line 622 at r23 (raw file):
is this correct? If we wrapped a pkg/sql/distsqlrun/data.proto, line 52 at r11 (raw file): Previously, vivekmenezes wrote…
let's remove this field if we're not using it. Comments from Reviewable |
Review status: 0 of 26 files reviewed at latest revision, 21 unresolved discussions, some commit checks failed. pkg/sql/distsqlrun/base.go, line 619 at r14 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. pkg/sql/distsqlrun/base.go, line 622 at r23 (raw file): Previously, andreimatei (Andrei Matei) wrote…
i think if a pgwire.Error is published on a remote node and its wrapped, it should be unwrapped before being sent over the wire, because on the other end pkg/sql/distsqlrun/data.proto, line 52 at r11 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Done. Comments from Reviewable |
Review status: 0 of 26 files reviewed at latest revision, 21 unresolved discussions, all commit checks successful. pkg/sql/distsqlrun/base.go, line 622 at r23 (raw file): Previously, vivekmenezes wrote…
If Distsql (or anybody else) wraps the error, then it's probably doing it on purpose to lose the structured character of the cause. E.g. if we get a pgerror as part of executing some internal sql - getting a lease or writing a progress report - and we wrap that, we don't want this code to report that code for the original query. Generally no code in distsql will wrap pgerros encountered during the execution of the user's query, will it? Comments from Reviewable |
Review status: 0 of 26 files reviewed at latest revision, 21 unresolved discussions, all commit checks successful. pkg/sql/distsqlrun/base.go, line 622 at r23 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I don't know which way to go here. While you do have a point about internal sql creating a pgerror, although I don't believe it does that. isn't pgerror by definition an "external" error that some point in the code wants to report to the outside world? In the particular case of schema changes, the backfill mechanism wants to report the pgerror to the coordinator which reports it back to the user. If distsql starts wrapping this error and we go with what you're suggesting, the schema change mechanism will start seeing the distsql error and that's not what it wants. I think the rationale here is that an "external" error is being reported and that's what should go across the wire. Comments from Reviewable |
In the original SQL code when we hit upon a SQL error on the gateway node we would create a pgError on the gateway node. In distsql, an error can happen on a remote node during SQL execution which can translate into creating a pgError on a remote node. This change changes pgError to a protobuf so that it can be sent along to the gateway node.
Add a pgerror.Error as an ErrorDetail to Error. fixes cockroachdb#13646
test all the schema change backfill errors that can occur on a remote node. Ensure that they stop the schema change from moving forward and the database is return to its original state.
The test breaks when the backfill runs using distsql because each distsql node separately computes the default value to use its for backfill. Converted to a constant to make the test predictable.
@andreimatei I'm happy to continue the discussion and incorporate in another PR. Thanks for the review |
Review status: 0 of 26 files reviewed at latest revision, 21 unresolved discussions, all commit checks successful. pkg/sql/distsqlrun/base.go, line 622 at r23 (raw file): Previously, vivekmenezes wrote…
This is the way I see it: Comments from Reviewable |
Review status: 0 of 26 files reviewed at latest revision, 21 unresolved discussions, all commit checks successful. pkg/sql/distsqlrun/base.go, line 622 at r23 (raw file): Previously, andreimatei (Andrei Matei) wrote…
Another thing specific to pgerrors: for errors pg codes, we commonly care about the message of the error too, not just the code - cause some client drivers don't give you access to the code so people may need to look at the message. And this can also be broken by wrapping. Comments from Reviewable |
Review status: 0 of 26 files reviewed at latest revision, 21 unresolved discussions, all commit checks successful. pkg/sql/distsqlrun/base.go, line 622 at r23 (raw file): Previously, andreimatei (Andrei Matei) wrote…
I'm very cool with removing the need to unwrap this error, in particular we have tests at the application layer (schema changes) that will detect any bad wrapping. I'll make this change when I return. Comments from Reviewable |
sql: shutdown schema change on two permanent errors
fixes #13872
fixes #13646
In the original SQL code we would hit upon a SQL error
on the gateway node and publish pgError on the gateway
node. In distsql, an error can happen on a remote node
during SQL execution which can translate into creating
a pgError on a remote node. This change adds a protobuf
to pgError so that it can be sent along to the gateway
node.
This change is