Streaming load of 60GB CSV will OOM #6910
Yes. This query will OOM on my 64GiB PC.
Yep, we don't support it yet. For more information: apache/opendal#312
Could you try using lower parallelism?
It works after decreasing it :)

```
curl -H "insert_sql:insert into ontime format CSV" -H "skip_header:0" -H "field_delimiter:\t" -H 'max_threads: 16' -F "upload=@t_ontime.csv" -XPUT http://root:@127.0.0.1:8000/v1/streaming_load
{"id":"59f22406-f666-4f54-91df-aa46f993f16e","state":"SUCCESS","stats":{"rows":202687655,"bytes":147120170524},"error":null}
4.99s user 127.40s system 10% cpu 21:51.63 total
```

During the insert, the peak memory recorded was 33.9GiB.
There's a size limit and a timeout in Doris's stream load:
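For illustration, a hypothetical client-side equivalent of such a size cap (not from this thread): split the file along line boundaries and stream each piece as its own request, so no single body has to be buffered whole.

```sh
# Hypothetical workaround sketch: split the CSV into ~1 GiB pieces without
# breaking lines (GNU split -C), then stream each piece separately so every
# request body stays small and bounded.
split -C 1G t_ontime.csv part_
for f in part_*; do
  curl -H "insert_sql:insert into ontime format CSV" \
       -H "skip_header:0" -H "field_delimiter:\t" \
       -F "upload=@$f" \
       -XPUT http://root:@127.0.0.1:8000/v1/streaming_load
done
```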
I noticed that sled supports the io_uring feature; perhaps we could consider enabling it?
before: 9m59s, after: 9m12s
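A minimal sketch of trying that out, assuming the Cargo feature is literally named io_uring on the pinned sled version (verify before relying on it):

```sh
# Hypothetical: enable sled's io_uring Cargo feature via cargo-add
# (cargo 1.62+); the feature name is taken from the comment above and
# should be checked against the pinned sled version.
cargo add sled --features io_uring
```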
I tried to insert into … and the memory did not grow, so I guess the bug is in the insert pipeline. @sundy-li
OK, it's in …
I tested the Null engine; it caused OOM too.
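For anyone reproducing that isolation test, a minimal sketch, assuming the default MySQL endpoint on port 3307 and a hypothetical two-column table rather than the real ontime schema. The Null engine accepts inserts and discards the rows, so if streaming into it still OOMs, storage is ruled out and the insert pipeline is implicated.

```sh
# Minimal sketch of the Null-engine isolation test (hypothetical table
# t_null and file data.csv, not the real ontime schema/file).
mysql -h127.0.0.1 -P3307 -uroot \
  -e "CREATE TABLE t_null (c1 VARCHAR, c2 INT) ENGINE = NULL"
curl -H "insert_sql:insert into t_null format CSV" -H "skip_header:0" \
  -F "upload=@data.csv" -XPUT http://root:@127.0.0.1:8000/v1/streaming_load
```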
This file should use the TSV format to load.
What's the difference? And will this prevent the OOM?
The current implementation assumes:
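As a concrete illustration of the suggestion above, a sketch of the TSV variant of the working command (assumptions: streaming_load accepts TSV as a format name, and TSV's default tab delimiter makes the field_delimiter header unnecessary):

```sh
# Sketch: load the tab-delimited file as TSV instead of CSV with an
# explicit \t delimiter; everything else mirrors the working command above.
curl -H "insert_sql:insert into ontime format TSV" -H "skip_header:0" \
     -H 'max_threads: 16' -F "upload=@t_ontime.csv" \
     -XPUT http://root:@127.0.0.1:8000/v1/streaming_load
```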
Follow https://databend.rs/doc/learn/analyze-ontime-with-databend-on-ec2-and-s3.
Start a release build of databend-query with the local fs backend.
Attached (collapsed): Databend Query log, dmesg output, runtime info.
cc @youngsofun, can you take a look?