Summary
Background
For a read, the main flow is:
- Get the source plan: the file names (partition files).
- Read those files by name from object storage (such as AWS S3).
With the query result cache, the flow becomes:
- Step 1. Parse the query and calculate its fingerprint: query_id
- Step 2. Get the source plan (read_plan) and calculate its fingerprint: source_plan_id
- Step 3. Check the cache
  - 3.1 If the cache entry /query_id/source_plan_id/result exists, get it and return the result.
  - 3.2 If the cache entry does not exist, execute the query and put the result into the cache.
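The lookup in Step 3 can be sketched as follows. This is only an illustration of the proposed flow, not an implementation: `cache_key` and `read_with_cache` are hypothetical names, a plain dict stands in for the object store, and `execute` stands in for actually running the query.

```python
from typing import Callable, Dict


def cache_key(query_id: str, source_plan_id: str) -> str:
    # Key layout taken from the proposal: /query_id/source_plan_id/result
    return f"/{query_id}/{source_plan_id}/result"


def read_with_cache(
    cache: Dict[str, bytes],          # stand-in for the object store (assumption)
    query_id: str,
    source_plan_id: str,
    execute: Callable[[], bytes],     # hypothetical: runs the query, returns the result
) -> bytes:
    key = cache_key(query_id, source_plan_id)
    if key in cache:
        # 3.1: cache hit, return the stored result without reading the source files
        return cache[key]
    # 3.2: cache miss, run the query and store the result for later reads
    result = execute()
    cache[key] = result
    return result
```

Because source_plan_id is part of the key, a change in the underlying partition files produces a different key, so a stale entry is simply never looked up again rather than being explicitly invalidated.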
Where the cache is stored
The cache is stored in S3 under the path /<bucket>/<tenant>/result/cache/, and the user can download it.
How to calculate the fingerprint
- query_id: should this be based on the AST? For example, select * from t1 where a>1 and select * from t1 where a>1 and 1=1 would then get the same fingerprint.
- source_plan_id: based on the partition file names and the file offsets.
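A minimal sketch of both fingerprints, under several assumptions that are not part of the proposal: the predicate is a toy tuple-based AST rather than a real parser's output, the only simplification is folding constant comparisons such as 1=1 under AND, SHA-256 is the hash, and source_plan_id ignores the order of the partition list. All function names are hypothetical.

```python
import hashlib
import json

TRUE, FALSE = ("true",), ("false",)


def fold(node):
    """Fold constant comparisons and drop always-true operands of AND."""
    op = node[0]
    if op in ("col", "lit", "true", "false"):
        return node
    left, right = fold(node[1]), fold(node[2])
    if op == "=" and left[0] == "lit" and right[0] == "lit":
        return TRUE if left[1] == right[1] else FALSE
    if op == "and":
        if left == TRUE:
            return right
        if right == TRUE:
            return left
    return (op, left, right)


def _digest(obj) -> str:
    # Canonical JSON so equal structures always hash to the same value
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def query_id(select_list, table, where_ast=None) -> str:
    where = fold(where_ast) if where_ast is not None else None
    return _digest({"select": select_list, "from": table, "where": where})


def source_plan_id(parts) -> str:
    # parts: iterable of (partition_file_name, offset); sorted so order is ignored
    return _digest(sorted(parts))


a_gt_1 = (">", ("col", "a"), ("lit", 1))
one_eq_one = ("=", ("lit", 1), ("lit", 1))
same = query_id(["*"], "t1", a_gt_1) == query_id(["*"], "t1", ("and", a_gt_1, one_eq_one))
```

Here `same` is `True`, matching the example above. A real implementation would need a much richer normalization step (aliases, operand order, whitespace, literal types), which is the open question in the query_id item.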