Closed
Description
Feature Request
Teachability, Documentation, Adoption, Migration Strategy:
To reduce CPU usage and write amplification, we have filed a proposal for a new version of PageStorage: Proposal: PageStorage V3 Design
This issue records the progress of PageStorage V3 development.
Development progress
- Some preparations and refactoring
  - Rand IO benchmark on EBS: https://docs.google.com/document/d/1NoHeigyfNyB9Bp8cYzYFv-In0dQzjR7smLltMBXCmgs/edit#
  - Make `PageStorage::Snapshot` an abstract class and make it compatible with V1, V2 and V3
  - Refactor out `PageStorage::getEntry` and make it an internal class inside PageStorageV2/V1
  - Refactor out the default value in virtual methods of `PageStorage`
- The framework and related data structures for PageStorage V3
- BlobStore
  - SpaceMap
  - Read, write logic for BlobStore
  - Data compaction for BlobStore
  - Multi-disk load balancing
  - Read amplification on `PageMap BlobStore::read(FieldReadInfos & to_read, ...)` (https://github.com/pingcap/tics/pull/3908/files#diff-1f8b8179cf4110ca801c7a156f936e6c7689f41e1bb967f2c9f46cdaffd9d1ebR345); fixed in https://github.com/pingcap/tiflash/pull/4181/files
- MVCC PageDirectory
  - Basic apply edits, get PageEntry
  - GC for cleaning up removed Pages
  - GC with BlobStore data movement
  - GCApply needs to scan all living page IDs while holding a read lock, which could block writes for a long time. Find a better way to remove external pages; fixed in PageStorage: Ref page lifetime mechanism #4174
  - Stale snapshot detection
  - Record the thread ID and creation time when creating a snapshot
  - Log warnings when long-lived snapshots exist
  - Optimization when stale snapshot(s) exist (MVCC GC, BlobStore GC)
- WALLog
  - LogFile format
  - Meta format for persisting the PageEntry applied to MVCC PageDirectory
  - Compaction of LogFiles
  - Multi-disk rolling
- Benchmark vs V2
- Compatibility with other features
  - Respect WriteLimiter && ReadLimiter
  - Respect Encryption at rest
- Global PageStorage for one TiFlash instance
  - Clean up data in the global PageStorage instance when dropping a table
- Further improvement/features
  - Tools for debugging PageStorage (`page_ctl`)
  - Transferring data from old storage to V3
  - Check whether using `list` instead of `set` for `FilenameSet` is better: PageStorage: WALStore #3891 (comment)
  - Split the pipeline of persisting a write batch into smaller parts to speed up the throughput of WALStore
  - Compression on serialized entries (in WAL)
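For the stale snapshot detection items in the MVCC PageDirectory list above, here is a minimal sketch of how recording the creating thread and creation time could look. Everything in it is an assumption made for illustration: `SnapshotTracker`, `SnapshotInfo`, `create`, `release` and `findStale` are hypothetical names, not the actual TiFlash API, and the real implementation may track snapshots quite differently.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical sketch; none of these types exist in TiFlash under these names.
using Clock = std::chrono::steady_clock;

struct SnapshotInfo
{
    uint64_t seq;                 // MVCC sequence pinned by the snapshot
    std::thread::id creator;      // thread that created the snapshot
    Clock::time_point created_at; // when the snapshot was created
};

class SnapshotTracker
{
public:
    explicit SnapshotTracker(std::chrono::seconds stale_threshold)
        : threshold(stale_threshold)
    {
        assert(stale_threshold.count() >= 0);
    }

    // Register a snapshot and return an id used to release it later.
    uint64_t create(uint64_t seq, Clock::time_point now = Clock::now())
    {
        std::lock_guard<std::mutex> lock(mu);
        const uint64_t id = next_id++;
        live.emplace(id, SnapshotInfo{seq, std::this_thread::get_id(), now});
        return id;
    }

    void release(uint64_t id)
    {
        std::lock_guard<std::mutex> lock(mu);
        live.erase(id);
    }

    // Return snapshots that have been alive longer than the threshold.
    // A caller (for example a GC routine) could log a warning for each one.
    std::vector<SnapshotInfo> findStale(Clock::time_point now = Clock::now()) const
    {
        std::lock_guard<std::mutex> lock(mu);
        std::vector<SnapshotInfo> stale;
        for (const auto & [id, info] : live)
        {
            if (now - info.created_at > threshold)
                stale.push_back(info);
        }
        return stale;
    }

private:
    const std::chrono::seconds threshold;
    mutable std::mutex mu;
    uint64_t next_id = 0;
    std::unordered_map<uint64_t, SnapshotInfo> live;
};
```

In this sketch the GC path would call `findStale()` periodically and emit one warning per returned entry, including the creator thread ID and the snapshot age. Passing `now` explicitly is only there to make the behaviour easy to test.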
Split PRs
- Some preparations and refactoring
- The framework and related data structures for PageStorage V3
- BlobStore
- MVCC PageDirectory
- WALLog
  - LogFile format: PageStorage: implement LogFile IO #3590
- Benchmark vs V2
- Transferring data from old storage to V3
- Tools for debugging PageStorage (`page_ctl`)
Activity
jiaqizho commented on Dec 3, 2021
The transferring part is similar to #3269.
Just a note: we need to update page_ctl at the end. :)
JaySon-Huang commented on Dec 3, 2021
Added to the list.
`put` and `ref` are in the same WriteBatch #3829
PageEntriesEdit #3879