-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql, opt: avoid full scans in mutation queries #137781
Conversation
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
1e19a05
to
cff59af
Compare
Informs cockroachdb#79683 Release note (sql change): Added support for a new index hint, AVOID_FULL_SCAN, which will prevent the optimizer from planning a full scan for the specified table if any other plan is possible. The hint can be used in the same way as other existing index hints. For example, SELECT * FROM table_name@{AVOID_FULL_SCAN};. Note that a full scan of a partial index may still be preferred by the optimizer, unless AVOID_FULL_SCAN is used in combination with a specific partial index via FORCE_INDEX=index_name. This hint is similar to NO_FULL_SCAN, but will not error if a full scan cannot be avoided.
07c79dc
to
e599fcd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/opt/exec/execbuilder/testdata/update
line 404 at r3 (raw file):
└── • render │ └── • limit
Need to investigate why we lost the top k here
pkg/sql/opt/exec/execbuilder/testdata/unique
line 5707 at r3 (raw file):
vectorized: true · • root
This seems concerning that we're no longer getting the insert fast path here.
Fixes cockroachdb#79683 Release note (sql change): Added a new session setting avoid_full_table_scans_in_mutations, which when set to true, causes the optimizer to avoid planning full table scans for mutation queries if any other plan is possible. It now defaults to true.
Removed a comment that references a function parameter that no longer exists. Release note: None
e599fcd
to
4721599
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 20 of 20 files at r1, 87 of 87 files at r4, 87 of 87 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mw5h and @rytaft)
pkg/sql/opt/exec/execbuilder/testdata/unique
line 5707 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
This seems concerning that we're no longer getting the insert fast path here.
For some reason the optimizer no longer picks lookup joins into the partial indexes...
pkg/sql/opt/exec/execbuilder/testdata/update
line 404 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Need to investigate why we lost the top k here
I think it's similar to the distinct change—the hugeCost/2 has made the cost of everything else irrelevant.
-- commits
line 13 at r1:
nit: I find the language about this partial index behavior in the comment in coster.go
easier to understand, maybe consider using that for the release note? Or add another sentence to make it ultra clear that a full scan of a partial index on only avoided if the partial index is forced.
pkg/sql/opt/optbuilder/mutation_builder_arbiter.go
line 687 at r5 (raw file):
if h.mb.b.evalCtx.SessionData().AvoidFullTableScansInMutations { indexFlags = &tree.IndexFlags{AvoidFullScan: true} }
The expression built by tableScope()
is never part of a query plan and is never optimized with exploration rules—it is only used to build scalar expressions (partial index predicates) into the metadata. We really shouldn't be building a whole scan expression just to do that, but it was the simplest way to build all the necessary columns into a scope that the partial index predicate may reference (think of virtual computed columns which are a projection over a scan—to support partial index predicates that reference those, we can't simply use the standard table columns). #61298 tracks making this better.
Anyway, the TLDR is that there is no need to use index flags here.
pkg/sql/opt/exec/execbuilder/testdata/cascade
line 928 at r4 (raw file):
│ │ estimated row count: 10 │ │ distinct on: a │ │
This change is odd. I guess the hugeCost/2 has made the plan without the disctint more-or-less the same cost as the plan with the distinct?
pkg/sql/opt/exec/execbuilder/testdata/fk
line 671 at r5 (raw file):
EXPLAIN (OPT) UPDATE child SET p = 4 ---- distribute
Do you know why this distribute operator was added in some of these tests?
pkg/sql/opt/optbuilder/testdata/delete
line 195 at r5 (raw file):
├── columns: x:6!null y:7 z:8 crdb_internal_mvcc_timestamp:9 tableoid:10 column11:11 ├── internal-ordering: -11 ├── sort
This looks like a regression too.
pkg/sql/opt/xform/testdata/rules/limit
line 2775 at r5 (raw file):
│ │ ├── [/1000000001 - /1999999999] │ │ └── [/2000000001 - ] │ ├── flags: avoid-full-scan
I'm not really sure the purpose of this test, but this scan over all partitions + index join is a regression.
pkg/sql/opt/xform/testdata/rules/limit
line 2848 at r5 (raw file):
│ │ ├── [/2 - /4] │ │ ├── [/6 - /4999999] │ │ └── [/5000001 - ]
This is a regression, too, same as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! I'm going to close this PR in favor of https://reviewable.io/reviews/cockroachdb/cockroach/137781. All the comments should be addressed in the new PR.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @mw5h)
Previously, mgartner (Marcus Gartner) wrote…
nit: I find the language about this partial index behavior in the comment in
coster.go
easier to understand, maybe consider using that for the release note? Or add another sentence to make it ultra clear that a full scan of a partial index on only avoided if the partial index is forced.
Fixed in https://reviewable.io/reviews/cockroachdb/cockroach/137781
pkg/sql/opt/optbuilder/mutation_builder_arbiter.go
line 687 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
The expression built by
tableScope()
is never part of a query plan and is never optimized with exploration rules—it is only used to build scalar expressions (partial index predicates) into the metadata. We really shouldn't be building a whole scan expression just to do that, but it was the simplest way to build all the necessary columns into a scope that the partial index predicate may reference (think of virtual computed columns which are a projection over a scan—to support partial index predicates that reference those, we can't simply use the standard table columns). #61298 tracks making this better.Anyway, the TLDR is that there is no need to use index flags here.
Thanks for the info! Fixed in https://reviewable.io/reviews/cockroachdb/cockroach/137781
pkg/sql/opt/exec/execbuilder/testdata/cascade
line 928 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This change is odd. I guess the hugeCost/2 has made the plan without the disctint more-or-less the same cost as the plan with the distinct?
Fixed in https://reviewable.io/reviews/cockroachdb/cockroach/137781
pkg/sql/opt/exec/execbuilder/testdata/fk
line 671 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Do you know why this distribute operator was added in some of these tests?
Fixed in https://reviewable.io/reviews/cockroachdb/cockroach/137781
pkg/sql/opt/exec/execbuilder/testdata/unique
line 5707 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
For some reason the optimizer no longer picks lookup joins into the partial indexes...
Fixed in https://reviewable.io/reviews/cockroachdb/cockroach/137781
pkg/sql/opt/exec/execbuilder/testdata/update
line 404 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I think it's similar to the distinct change—the hugeCost/2 has made the cost of everything else irrelevant.
Fixed in https://reviewable.io/reviews/cockroachdb/cockroach/137781
pkg/sql/opt/optbuilder/testdata/delete
line 195 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This looks like a regression too.
Fixed in https://reviewable.io/reviews/cockroachdb/cockroach/137781
pkg/sql/opt/xform/testdata/rules/limit
line 2775 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I'm not really sure the purpose of this test, but this scan over all partitions + index join is a regression.
This is an unrelated bug, see #137994
pkg/sql/opt/xform/testdata/rules/limit
line 2848 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
This is a regression, too, same as above.
See #137994
Closing in favor of #137984 |
opt,sql: support hint to avoid full scan
Informs #79683
Release note (sql change): Added support for a new index hint,
AVOID_FULL_SCAN
, which will prevent the optimizer from planning afull scan for the specified table if any other plan is possible. The
hint can be used in the same way as other existing index hints. For
example,
SELECT * FROM table_name@{AVOID_FULL_SCAN};
. Note that afull scan of a partial index may still be preferred by the optimizer,
unless
AVOID_FULL_SCAN
is used in combination with a specific partialindex via
FORCE_INDEX=index_name
. This hint is similar toNO_FULL_SCAN
,but will not error if a full scan cannot be avoided.
sql,opt: add setting avoid_full_table_scans_in_mutations
Fixes #79683
Release note (sql change): Added a new session setting
avoid_full_table_scans_in_mutations
, which when set to true, causesthe optimizer to avoid planning full table scans for mutation queries
if any other plan is possible. It now defaults to true.
opt: remove a stale comment above optbuilder.buildScan
Removed a comment that references a function parameter that no longer
exists.
Release note: None