titan doc update for release 7.6.0 #15986

Merged
merged 34 commits into from
Jan 25, 2024
64ecfce
titan doc update for release 7.6.0
tonyxuqqi Jan 5, 2024
9e58e74
lint issue
Jan 5, 2024
19eb7ef
Apply suggestions from code review
hfxsd Jan 8, 2024
9546630
Apply suggestions from code review
hfxsd Jan 8, 2024
b6dacd0
Update tikv-configuration-file.md
hfxsd Jan 8, 2024
481b007
Apply suggestions from code review
hfxsd Jan 9, 2024
c43a35c
change the default value of blob-file-compression to zstd
hfxsd Jan 9, 2024
3ca3d45
Update tikv-configuration-file.md
hfxsd Jan 9, 2024
a47c5bf
Update tikv-configuration-file.md
hfxsd Jan 9, 2024
02d78bb
Apply suggestions from code review
hfxsd Jan 16, 2024
331bfe1
polish titan doc
tonyxuqqi Jan 17, 2024
8bb32cb
Merge branch 'titan_7.6' of https://github.com/tonyxuqqi/docs into ti…
tonyxuqqi Jan 17, 2024
b86de77
address comments
tonyxuqqi Jan 17, 2024
adb3363
update gc thread count
tonyxuqqi Jan 22, 2024
d8b48fd
update num-threads
tonyxuqqi Jan 22, 2024
cc5fecf
titan: update titan doc for v7.6.0 (enable titan by default)
benmaoer Jan 23, 2024
17c7f43
Merge pull request #1 from benmaoer/15986-titan-doc-updates
tonyxuqqi Jan 23, 2024
3937b19
Merge remote-tracking branch 'upstream/master' into pr/15986
hfxsd Jan 24, 2024
8b8a477
synced cn changes
hfxsd Jan 24, 2024
8bb38a4
Update tikv-configuration-file.md
hfxsd Jan 24, 2024
b6554f7
Update titan-configuration.md
hfxsd Jan 24, 2024
c93019e
Update titan-configuration.md
hfxsd Jan 24, 2024
4b9baf6
Update storage-engine/titan-overview.md
hfxsd Jan 24, 2024
a1bbf0a
Apply suggestions from code review
hfxsd Jan 24, 2024
51e07da
Update storage-engine/titan-configuration.md
hfxsd Jan 24, 2024
4c89679
Update storage-engine/titan-configuration.md
hfxsd Jan 24, 2024
5664bc7
add min blob size link
hfxsd Jan 24, 2024
44f95a4
Apply suggestions from code review
hfxsd Jan 24, 2024
5c6ae2c
Update tikv-configuration-file.md
hfxsd Jan 24, 2024
92ff46a
Apply suggestions from code review
hfxsd Jan 24, 2024
a6ba25f
Update storage-engine/titan-configuration.md
hfxsd Jan 24, 2024
204afcd
Update storage-engine/titan-configuration.md
hfxsd Jan 24, 2024
aa0a9a4
Update tikv-configuration-file.md
hfxsd Jan 24, 2024
665c9f9
Update tikv-configuration-file.md
hfxsd Jan 24, 2024
Apply suggestions from code review
Co-authored-by: Aolin <aolinz@outlook.com>
hfxsd and Oreoxmt authored Jan 24, 2024
commit 44f95a42482b98e1f565875ba88f320762292367
2 changes: 1 addition & 1 deletion storage-engine/titan-configuration.md
Expand Up @@ -97,7 +97,7 @@ It is recommended to set the value of `storage.block-cache.capacity` to the stor

The [`discardable-ratio`](/tikv-configuration-file.md#discardable-ratio) parameter and [`max-background-gc`](/tikv-configuration-file.md#max-background-gc) parameter significantly impact Titan's read performance and garbage collection process.

When the ratio of useless data (the corresponding key has been updated or deleted) in a blob file exceeds the threshold set by [`discardable-ratio`](/tikv-configuration-file.md#discardable-ratio), Titan GC is triggered. Reducing this threshold can reduce space amplification but can cause more frequent Titan GC. Increasing this value can reduce Titan GC, I/O bandwidth, and CPU consumption, but increase disk space usage.
When the ratio of obsolete data (the corresponding key has been updated or deleted) in a blob file exceeds the threshold set by [`discardable-ratio`](/tikv-configuration-file.md#discardable-ratio), Titan GC is triggered. Reducing this threshold can reduce space amplification but can cause more frequent Titan GC. Increasing this value can reduce Titan GC, I/O bandwidth, and CPU consumption, but increase disk space usage.

If you observe from **TiKV Details** - **Thread CPU** - **RocksDB CPU** that the Titan GC thread is at full load for a long time, consider adjusting [`max-background-gc`](/tikv-configuration-file.md#max-background-gc) to increase the Titan GC thread pool size.

28 changes: 14 additions & 14 deletions tikv-configuration-file.md
Expand Up @@ -1037,7 +1037,7 @@ Configuration items related to Raftstore.
>
> Periodic full compaction is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.

+ Set the specific times that TiKV initiates periodic full compaction. You can specify multiple time schedules in an array. For example,
+ Set the specific times that TiKV initiates periodic full compaction. You can specify multiple time schedules in an array. For example:
+ `periodic-full-compact-start-times = ["03:00", "23:00"]` indicates that TiKV performs full compaction daily at 03:00 AM and 11:00 PM, based on the local time zone of the TiKV node.
+ `periodic-full-compact-start-times = ["03:00 +0000", "23:00 +0000"]` indicates that TiKV performs full compaction daily at 03:00 AM and 11:00 PM in UTC time.
+ Default value: `[]`, which means periodic full compaction is disabled by default.
Expand Down Expand Up @@ -1230,7 +1230,7 @@ Configuration items related to RocksDB

### `rate-bytes-per-sec`

+ When Titan is disabled, this option limits the I/O rate of RocksDB compaction to reduce the impact of RocksDB compaction on the foreground read and write performance during traffic peaks. When Titan is enabled, this option limits the summed I/O rates of RocksDB compaction and Titan GC. If you find that the I/O or CPU consumption of RocksDB compaction and Titan GC is too large, set this option to a suitable value according the disk I/O bandwidth and the actual write traffic.
+ When Titan is disabled, this configuration item limits the I/O rate of RocksDB compaction to reduce the impact of RocksDB compaction on the foreground read and write performance during traffic peaks. When Titan is enabled, this configuration item limits the summed I/O rates of RocksDB compaction and Titan GC. If you find that the I/O or CPU consumption of RocksDB compaction and Titan GC is too large, set this configuration item to an appropriate value according to the disk I/O bandwidth and the actual write traffic.
+ Default value: `10GB`
+ Minimum value: `0`
+ Unit: B|KB|MB|GB
Expand Down Expand Up @@ -1333,7 +1333,7 @@ Configuration items related to Titan.

> **Note:**
>
> - Starting from TiDB v7.6.0, Titan is enabled by default to enhance the performance of writing wide tables and JSON data.
> - To enhance the performance of wide table and JSON data writing and point query, starting from TiDB v7.6.0, the default value changes from `false` to `true`, which means that Titan is enabled by default.
> - Existing clusters upgraded to v7.6.0 or later versions retain the original configuration, which means that if Titan is not explicitly enabled, it still uses RocksDB.
> - If the cluster has enabled Titan before upgrading to TiDB v7.6.0 or later versions, Titan will be retained after the upgrade, and the [`min-blob-size`](/tikv-configuration-file.md#min-blob-size) configuration before the upgrade will be retained. If you do not explicitly configure the value before the upgrade, the default value of the old version `1KB` will be retained to ensure the stability of the cluster configuration after the upgrade.

Expand All @@ -1352,7 +1352,7 @@ Configuration items related to Titan.

### `max-background-gc`

+ The maximum number of GC threads in Titan. From the **TiKV Details** -> **Thread CPU** -> **RocksDB CPU** panel, if you observe that the Titan GC threads are at full capacity for a long time, consider increasing the size of the Titan GC thread pool.
+ The maximum number of GC threads in Titan. From the **TiKV Details** > **Thread CPU** > **RocksDB CPU** panel, if you observe that the Titan GC threads are at full capacity for a long time, consider increasing the size of the Titan GC thread pool.
+ Default value: `4`
+ Minimum value: `1`

Expand Down Expand Up @@ -1619,16 +1619,16 @@ Configuration items related to `rocksdb.defaultcf`, `rocksdb.writecf`, and `rock

> **Note:**
>
> Enabling Titan in `rocksdb.defaultcf` is supported, but enabling Titan in `rocksdb.writecf` is not supported.
> Titan can only be enabled in `rocksdb.defaultcf`. It is not supported to enable Titan in `rocksdb.writecf`.

Configuration items related to `rocksdb.defaultcf.titan`.

### `min-blob-size`

> **Note:**
>
> - Starting from TiDB v7.6.0, Titan is enabled by default for new clusters to enhance the performance of writing wide tables and JSON data. The default value of the [`min-blob-size`](/tikv-configuration-file.md#min-blob-size) threshold is changed from `1KB` to `32KB`. Values exceeding `32KB` will now be stored in Titan, while other data continues to be stored in RocksDB.
> - For existing clusters upgrading to TiDB v7.6.0 or later versions, if you do not explicitly set `min-blob-size` before the upgrade, it will retain the old default value of `1KB` to ensure stability in the configuration after the upgrade.
> - Starting from TiDB v7.6.0, Titan is enabled by default to enhance the performance of wide table and JSON data writing and point query. The default value of `min-blob-size` changes from `1KB` to `32KB`. This means that values exceeding `32KB` are stored in Titan, while other data continues to be stored in RocksDB.
> - To ensure configuration consistency, for existing clusters upgrading to TiDB v7.6.0 or later versions, if you do not explicitly set `min-blob-size` before the upgrade, TiDB retains the previous default value of `1KB`.
> - A value smaller than `32KB` might affect the performance of range scans. However, if the workload primarily involves heavy writes and point queries, you can consider decreasing the value of `min-blob-size` for better performance.

+ The smallest value stored in a Blob file. Values smaller than the specified size are stored in the LSM-Tree.
Expand All @@ -1641,15 +1641,15 @@ Configuration items related to `rocksdb.defaultcf.titan`.
> **Note:**
>
> - Snappy compressed files must be in the [official Snappy format](https://github.com/google/snappy). Other variants of Snappy compression are not supported.
> - Starting from TiDB v7.6.0, the default value for the variable `blob-file-compression` has changed from `lz4` to `zstd`.
> - Starting from TiDB v7.6.0, the default value of `blob-file-compression` changes from `"lz4"` to `"zstd"`.

+ The compression algorithm used in a Blob file
+ Optional values: `"no"`, `"snappy"`, `"zlib"`, `"bzip2"`, `"lz4"`, `"lz4hc"`, `"zstd"`
+ Default value: `"zstd"`

### `zstd-dict-size`

+ The zstd dictionary compression size. The default value is `"0KB"`, which means to disable the zstd dictionary compression. In this case, Titan's compression is based on single values, but RocksDB compression is based on blocks (`32KB` by default). When the average size of Titan values is less than `32KB`, Titan's compression ratio is smaller than RocksdDB. Taking JSON as an example, Titan store size can be 30% to 50% bigger than RocksDB. The actual compression ratio depends on the value content and the similiarity among different values. You can set `zstd-dict-size` (for example, set it to `16KB`) to enable the zstd dictionary compression to increase the compression ratio. The actual store size can be lower than RocksDB. But the zstd dictionary compression can lead to about 10% throughput regression in a typical read-write workload.
+ The zstd dictionary compression size. The default value is `"0KB"`, which means that zstd dictionary compression is disabled. In this case, Titan compresses data based on single values, whereas RocksDB compresses data based on blocks (`32KB` by default). When the average size of Titan values is less than `32KB`, Titan's compression ratio is lower than that of RocksDB. Taking JSON as an example, the store size in Titan can be 30% to 50% larger than that of RocksDB. The actual compression ratio depends on whether the value content is suitable for compression and on the similarity among different values. You can enable zstd dictionary compression to increase the compression ratio by configuring `zstd-dict-size` (for example, set it to `16KB`). The actual store size can then be lower than that of RocksDB. However, zstd dictionary compression might lead to about 10% performance regression in specific workloads.
+
+ Default value: `"0KB"`
+ Unit: KB|MB|GB
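
The difference between per-value and block-based compression described for `zstd-dict-size` can be illustrated with a small sketch. This is not Titan's or RocksDB's implementation: it swaps in Python's standard `zlib` module (DEFLATE with a preset dictionary) in place of zstd, and the JSON sample data and the helper `compress_with_dict` are hypothetical. It only shows why small, similar values compress poorly on their own and better with a shared dictionary.

```python
import json
import zlib

# Hypothetical sample data: 200 small JSON values with a similar structure.
values = [
    json.dumps({"user_id": i, "status": "active", "region": "us-west-2"}).encode()
    for i in range(200)
]


def compress_with_dict(value: bytes, zdict: bytes) -> bytes:
    """Compress one value using a preset dictionary (zlib stand-in, not zstd)."""
    compressor = zlib.compressobj(zdict=zdict)
    return compressor.compress(value) + compressor.flush()


# Per-value compression: each value is compressed on its own.
per_value_total = sum(len(zlib.compress(v)) for v in values)

# Block-style compression: many values are compressed together.
block_total = len(zlib.compress(b"".join(values)))

# Dictionary compression: each value is still compressed on its own,
# but with a shared dictionary built from a sample of the values.
sample_dict = b"".join(values[:20])
dict_total = sum(len(compress_with_dict(v, sample_dict)) for v in values)

print("per-value:", per_value_total)
print("block:", block_total)
print("per-value with dictionary:", dict_total)
```

On data like this, the per-value total is the largest of the three, which matches the direction of the trade-off described above. The actual numbers for zstd in Titan depend on the workload.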
Expand All @@ -1659,7 +1659,7 @@ Configuration items related to `rocksdb.defaultcf.titan`.
+ The cache size of a Blob file
+ Default value: `"0GB"`
+ Minimum value: `0`
+ Recommended value: It is recommended that after the database has stabilized, set the RocksDB block cache (`storage.block-cache.capacity`) to just above 95% of the Block Cache hit rate based on monitoring, and `blob-cache-size` to `(total memory size) * 50% - (size of block cache)`. This is to ensure that the block cache is large enough to cache the entire RocksDB, while keeping the blob cache as large as possible. However, do not set the value of the blob cache too large. Otherwise the block cache hit rate will drop significantly.
+ Recommended value: After database stabilization, it is recommended to set the RocksDB block cache (`storage.block-cache.capacity`) based on monitoring to maintain a block cache hit rate of at least 95%, and set `blob-cache-size` to `(total memory size) * 50% - (size of block cache)`. This is to ensure that the block cache is sufficiently large to cache the entire RocksDB, while maximizing the blob cache size. However, to prevent a significant drop in the block cache hit rate, do not set the blob cache size too large.
+ Unit: KB|MB|GB
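
As a rough illustration of the recommended-value formula for `blob-cache-size`, the following sketch computes `(total memory size) * 50% - (size of block cache)`. The function name and the example numbers are hypothetical, and clamping a negative result to zero is an assumption of this sketch, not something the documentation states.

```python
def recommended_blob_cache_gb(total_memory_gb: float, block_cache_gb: float) -> float:
    """Return (total memory size) * 50% - (size of block cache), in GB.

    A negative result is clamped to 0 (an assumption of this sketch).
    """
    blob_cache_gb = total_memory_gb * 0.5 - block_cache_gb
    return max(blob_cache_gb, 0.0)


# Hypothetical node: 64 GB of memory, block cache sized at 20 GB after
# tuning for a block cache hit rate of at least 95%.
print(recommended_blob_cache_gb(64, 20))
```

Treat the result as a starting point only: as noted above, a blob cache that is too large can significantly reduce the block cache hit rate.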

### `min-gc-batch-size`
Expand All @@ -1678,13 +1678,13 @@ Configuration items related to `rocksdb.defaultcf.titan`.

### `discardable-ratio`

+ When the ratio of useless data (the corresponding key has been updated or deleted) in a blob file exceeds the following threshold, Titan GC is triggered. When Titan writes the useful data of this blob file to another file, you can use the `discardable-ratio` value to estimate the upper limits of write amplification and space amplification (assuming the compression is disabled).
+ When the ratio of obsolete data (the corresponding key has been updated or deleted) in a Blob file exceeds the following threshold, Titan GC is triggered. When Titan writes the valid data of this Blob file to another file, you can use the `discardable-ratio` value to estimate the upper limits of write amplification and space amplification (assuming the compression is disabled).

Upper limit of write amplification = 1 / discardable_ratio
Upper limit of write amplification = 1 / discardable-ratio

Upper limit of space amplification = 1 / (1 - discardable_ratio)
Upper limit of space amplification = 1 / (1 - discardable-ratio)

From these two equations, you can see that decreasing the value of `discardable_ratio` can reduce space amplification but causes GC to be more frequent in Titan. Increasing the value reduces Titan GC, the corresponding I/O bandwidth, and CPU consumption but increases disk usage.
From these two equations, you can see that decreasing the value of `discardable-ratio` can reduce space amplification but results in more frequent GC in Titan. Increasing the value reduces the frequency of Titan GC, thereby lowering the corresponding I/O bandwidth and CPU usage, but increases disk usage.

+ Default value: `0.5`
+ Minimum value: `0`
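
The two upper-limit equations for `discardable-ratio` can be evaluated with a short sketch. The function name is hypothetical, and the sketch assumes that compression is disabled, as the equations do. It rejects `0` and `1` because the formulas divide by zero at those values.

```python
def amplification_upper_limits(discardable_ratio: float) -> tuple[float, float]:
    """Return (write amplification limit, space amplification limit)."""
    if not 0 < discardable_ratio < 1:
        raise ValueError("discardable_ratio must be strictly between 0 and 1")
    write_amp = 1 / discardable_ratio
    space_amp = 1 / (1 - discardable_ratio)
    return write_amp, space_amp


# Compare a lower ratio, the default (0.5), and a higher ratio.
for ratio in (0.2, 0.5, 0.8):
    write_amp, space_amp = amplification_upper_limits(ratio)
    print(f"ratio={ratio}: write <= {write_amp:.2f}, space <= {space_amp:.2f}")
```

With the default value of `0.5`, both upper limits are 2. A lower ratio raises the write amplification limit and lowers the space amplification limit, and a higher ratio does the opposite.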