Update documents for cross-partition scan and import feature #1301
docs/schema-loader-import.md
Outdated
You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and the ScalarDB metadata tables. There would also be several differences between your database and ScalarDB and limitations.
{% endcapture %}
After adding the holistic migration guide, I will add a reference for it around here.
Thank you for the updates!
I left some suggestions and questions.
Please take a look when you have time!
docs/api-guide.md
Outdated
{% capture notice--info %}
**Note**

In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).
In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).
In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).
Fixed in 6726530.
docs/schema-loader-import.md
Outdated
1. The value range of `BIGINT` in ScalarDB is from -2^53 to 2^53, regardless of the size of `bigint` in the underlying database. Thus, if the data out of this range exists in the imported table, ScalarDB cannot read it.
2. For certain data types noted above, ScalarDB may map a data type larger than that of the underlying database. In that case, You will see errors when putting a value with a size larger than the size specified in the underlying database.
3. The maximum size of `BLOB` in ScalarDB is about 2GB (precisely 2^31-1 bytes). In contrast, Oracle `blob` can have (4GB-1)*(number of blocks). Thus, if the data larger than 2GB exists in the imported table, ScalarDB cannot read it.
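As a rough illustration of limitation 1 in the list above, a pre-import check on existing `bigint` values might look like the sketch below. The class and method names are hypothetical and are not part of the ScalarDB API, and the sketch assumes both bounds of the quoted range (-2^53 to 2^53) are inclusive.

```java
// Hypothetical helper for illustration only; not part of the ScalarDB API.
final class BigIntRangeCheck {
  // Bounds quoted in the doc: -2^53 to 2^53 (assumed inclusive in this sketch).
  static final long MIN = -(1L << 53);
  static final long MAX = 1L << 53;

  // Returns true if a value stored in the underlying database falls inside
  // the range that the doc says ScalarDB can read as BIGINT.
  static boolean isWithinScalarDbBigIntRange(long value) {
    return value >= MIN && value <= MAX;
  }

  public static void main(String[] args) {
    System.out.println(isWithinScalarDbBigIntRange(1L << 53));       // boundary value
    System.out.println(isWithinScalarDbBigIntRange((1L << 53) + 1)); // one past the boundary
    System.out.println(isWithinScalarDbBigIntRange(Long.MIN_VALUE)); // far outside the range
  }
}
```

Running a check like this against the source table before importing could surface rows that would become unreadable afterwards.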
Ditto. Do we observe `null`, some value, or an error?
Good question again. The 2GB limit is due to Java's byte array limits. I haven't tested it, but I guess the Oracle JDBC driver throws an `SQLException`. Or, we might see an OOM error if the heap size is not correctly configured. Apart from that, we might be able to handle such large objects better by using JDBC `Blob getBlob(...)` and offset-based access instead of `byte[] getBytes(...)`.
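To make the offset-based idea concrete, here is a minimal sketch using only standard JDBC types. It is not how ScalarDB reads BLOBs; `BlobChunkReader` and `copyInChunks` are hypothetical names, and `SerialBlob` stands in for a `Blob` that would normally come from `ResultSet.getBlob(...)`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.sql.Blob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialBlob;

// Hypothetical helper for illustration only; not part of ScalarDB.
final class BlobChunkReader {
  // Copies a Blob to the sink in fixed-size chunks using Blob.getBytes(pos, length),
  // so no single byte[] has to hold the whole object. JDBC Blob positions are 1-based.
  static long copyInChunks(Blob blob, int chunkSize, OutputStream sink)
      throws SQLException, IOException {
    long total = blob.length();
    long pos = 1;
    while (pos <= total) {
      int len = (int) Math.min(chunkSize, total - pos + 1);
      byte[] chunk = blob.getBytes(pos, len);
      if (chunk.length == 0) {
        break; // defensive: avoid looping forever if the driver returns nothing
      }
      sink.write(chunk, 0, chunk.length);
      pos += chunk.length;
    }
    return pos - 1; // number of bytes actually copied
  }

  public static void main(String[] args) throws Exception {
    byte[] data = new byte[10];
    for (int i = 0; i < data.length; i++) {
      data[i] = (byte) i;
    }
    // SerialBlob is an in-memory stand-in for a Blob obtained from a ResultSet.
    Blob blob = new SerialBlob(data);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    long copied = copyInChunks(blob, 4, out);
    System.out.println("copied " + copied + " bytes");
  }
}
```

In a real setting the sink would be a file or a streaming consumer rather than an in-memory buffer; otherwise the 2GB array limit simply reappears on the receiving side.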
Ah, I see. I understand now that it depends on the Java side.
Thank you!
@kota2and3kan Thanks for the feedback! Fixed based on the feedback (though the exception-related part is left as is for now), so PTAL when you get a chance.
LGTM! Thank you!
Overall, looking good! Thank you!
Left some comments and suggestions. PTAL!
docs/api-guide.md
Outdated
{% capture notice--warning %}
**Attention**

Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
We do not recommend enabling cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
The expression `it could make the isolation level lower` sounds a bit unclear to me.
BTW, sorry, I don't fully remember the discussion we had before.
So, we decided to only warn in case users use the cross-partition scan with serializable isolation for backward compatibility instead of throwing a runtime exception, don't we?
Thanks for the comment. Fixed in e3dd00d.
> BTW, sorry, I don't fully remember the discussion we had before.
> So, we decided to only warn in case users use the cross-partition scan with serializable isolation for backward compatibility instead of throwing a runtime exception, don't we?
At least, we didn't choose to completely disable the cross-partition scan with serializable isolation in v4.0. Disabling it is one idea, but I think it might be useful in some cases regardless of backward compatibility; e.g., users want to basically run transactions in a serializable manner but sometimes run read-only cross-partition scans without changing the setting.
@brfrn169 Do you have any idea?
OK, thank you! So, for now, we need to enable it in 3.x for backward-compatibility, and we haven't decided to do so in 4.x (we need to think what we should do for 4.x). Is my understanding correct?
Almost, yes. My understanding is that we will keep warning about it, the same as in v3.x, unless we explicitly decide to stop the feature in v4.x.
docs/api-guide.md
Outdated
##### Execute `Scan` without specifying a partition key to retrieve all the records of a table
##### Execute cross-partition `Scan` without specifying a partition key to retrieve all the records of a table

You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
You can execute a `Scan` operation across all partitions, which we call cross-partition scan, without specifying a partition key by enabling the following property in the ScalarDB configuration.
Fixed in e3dd00d.
docs/configurations.md
Outdated
{% capture notice--warning %}
**Attention**

Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
Ditto.
Fixed in e3dd00d.
Overall, LGTM. Left a couple of comments. Please take a look when you have time!
docs/api-guide.md
Outdated
```java
// Import the table "ns.tbl". If the table is already managed by ScalarDB, the target table does not
// exist, or the table does not meet the requirement of ScalarDB table, an exception will be thrown.
admin.importTable("ns", "tbl");
```
We have added `Map<String, String> options` as the 3rd argument to this method:
admin.importTable("ns", "tbl");
admin.importTable("ns", "tbl", options);
And we didn't add the argument for the `3` branch. To avoid diverging this doc between `master` and `3`, we should probably apply the same change for the `3` branch.
Good catch, thank you! Fixed in e3dd00d.
> And we didn't add the argument for the `3` branch. To avoid diverging this doc between `master` and `3`, we should probably apply the same change for the `3` branch.

I'm OK with using `admin.importTable("ns", "tbl", options)` even in v3.x, but it might be confusing for users. If we can just apply it without any other concerns, I can handle it since it's not a big divergence. Should I or not?
Actually, the `importTable()` method was already introduced in 3.10, but it was treated as an experimental feature. Therefore, I think we can also add the `options` argument to the `3` branch.
Ahh, right. Got it. Thank you!
docs/schema-loader.md
Outdated
// Import tables.
// You can also use a Properties object instead of configFilePath and a serialized-schema JSON
// string instead of schemaFilePath.
SchemaLoader.load(configFilePath, schemaFilePath, tableCreationOptions, createCoordinatorTables);
`importTables`, right?
SchemaLoader.load(configFilePath, schemaFilePath, tableCreationOptions, createCoordinatorTables);
SchemaLoader.importTables(configFilePath, schemaFilePath, tableCreationOptions);
Thank you! Fixed in e3dd00d.
One more thing, it looks like the `importTables` method doesn't receive the `createCoordinatorTables` argument.
And maybe `tableCreationOptions` should be renamed to something like `tableImportOptions`?
Oops, sorry about that. I fixed it throughout the sample in 148d9e2. PTAL!
[skip ci]
LGTM! Thank you!
LGTM! Thank you!
docs/api-guide.md
Outdated
{% capture notice--warning %}
**Attention**

We do not recommend enabling the cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
We do not recommend enabling the cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
We do not recommend enabling the cross-partition scan with `SERIALIZABLE` isolation level for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
If you meant `scalar.db.consensus_commit.isolation_level`, I think it should be capitalized: https://scalardb.scalar-labs.com/docs/latest/configurations/#basic-configurations
Fixed in de406a2. Thank you for the feedback!
LGTM! Thank you!
[skip ci]
I've added some comments and suggestions. PTAL!
core/src/main/java/com/scalar/db/transaction/consensuscommit/ConsensusCommitConfig.java
Outdated
Co-authored-by: Josh Wong <joshua.wong@scalar-labs.com>
LGTM! Thank you! 🙇‍♂️
Co-authored-by: Josh Wong <joshua.wong@scalar-labs.com>
Description
This PR adds documents related to cross-partition scan (previously called the relational scan) and the table import feature. It depends on #1294, which is still under review, but PTAL in parallel since reviewing and revising the docs might take a long time.
Related issues and/or PRs
Changes made
Checklist
Additional notes (optional)
Release notes
Added documents for cross-partition scan and table import.