Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: tables with non-covering partitions can cause incorrect results when avoiding full scans #137994

Open
rytaft opened this issue Dec 26, 2024 · 1 comment
Labels
branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-sql-queries SQL Queries Team

Comments

@rytaft
Copy link
Collaborator

rytaft commented Dec 26, 2024

Describe the problem

If the partitions of a table do not cover all valid values for the partition columns, the optimizer may create a query plan that produces incorrect results. I noticed this issue while making a recent change, but I think the bug has existed for years.

To Reproduce

With CockroachDB v24.3 or a prior version, run the following in cockroach demo:

CREATE TABLE tab (
  col1_0 NAME,
  col1_1 INT8,
  col1_3 INT8,
  col1_5 VARCHAR,
  PRIMARY KEY (col1_0 ASC),
  UNIQUE (col1_1 ASC, col1_3 ASC)
    PARTITION BY LIST (col1_1,col1_3) (
      PARTITION table1_part_0 VALUES IN (
                                (
                                  1,
                                  NULL
                                )
                              ),
      PARTITION table1_part_1 VALUES IN (
                                (
                                  1000000000,
                                  NULL
                                )
                              ),
      PARTITION table1_part_2 VALUES IN (
                                (
                                  2000000000,
                                  NULL
                                )
                              )
      )
);

INSERT
INTO
	tab
VALUES
	('a', 1, NULL, 'foo1'),
	('b', 1000000000, NULL, 'foo2'),
	('c', 2000000000, NULL, 'foo3'),
	('d', 0, NULL, 'foo4'),
	('aa', 1, 1, 'foo1'),
	('bb', 1000000000, 1, 'foo2'),
	('cc', 2000000000, 1, 'foo3'),
	('dd', 0, 1, 'foo4');

UPDATE tab
     SET col1_5 = 'bar'
   WHERE col1_0 ILIKE col1_0
ORDER BY col1_3
   LIMIT 84;

This should update every row to have col1_5 = 'bar'.

Now re-create the table and insert the rows, but change the UPDATE statement to avoid a full scan:

UPDATE tab@{NO_FULL_SCAN}
     SET col1_5 = 'bar'
   WHERE col1_0 ILIKE col1_0
ORDER BY col1_3
   LIMIT 84;

This will only update 2 rows.

Expected behavior

Both UPDATE statements should update all 8 rows in the table.

Jira issue: CRDB-45844

@rytaft rytaft added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 labels Dec 26, 2024
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Dec 26, 2024
@mgartner
Copy link
Collaborator

Let's spend some time root-causing this so we know how edge-casey this is.

@mgartner mgartner added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. GA-blocker and removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Dec 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-22.1 Used to mark GA and release blockers, technical advisories, and bugs for 22.1 branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-sql-queries SQL Queries Team
Projects
Status: Triage
Development

No branches or pull requests

2 participants