Set scheduler log sizes automatically based on available memory #5570
Labels: diagnostics, documentation, stability
There are frequent reports of scheduler memory growing over time:
They often involve memory graphs that look like:
It's very likely that there is a real bug in the scheduler causing memory to accumulate (#3898 (comment)), but often the steep slope on these graphs is caused by various logs on the scheduler accumulating, such as:
- transition_log: distributed.scheduler.transition-log-length
- log: distributed.scheduler.transition-log-length (should maybe be distributed.admin.log-length?)
- events: distributed.scheduler.events-log-length
- computations: distributed.diagnostics.computations.max-history
- Node._deque_handler: distributed.admin.log-length
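
As a rough illustration of how these limits might be lowered by hand today, the sketch below collects the configuration keys listed above into a single dict of overrides. The numeric values are illustrative assumptions, not defaults or recommendations, and `apply_overrides` is a hypothetical helper name. It assumes dask is installed and that the overrides are applied before the scheduler is created.

```python
# Config keys are copied from the list above. The values are illustrative
# assumptions only, not defaults or recommended settings.
log_overrides = {
    "distributed.scheduler.transition-log-length": 10_000,
    "distributed.scheduler.events-log-length": 10_000,
    "distributed.diagnostics.computations.max-history": 20,
    "distributed.admin.log-length": 1_000,
}


def apply_overrides(overrides):
    """Hypothetical helper: apply the overrides with dask.config.set.

    Requires dask to be installed. The import is kept local so the dict
    above can be inspected without dask.
    """
    import dask

    return dask.config.set(overrides)
```

Calling `apply_overrides(log_overrides)` in the process that starts the scheduler, before it is constructed, is the assumed usage. The same keys could presumably also be set in a dask YAML config file.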
I propose two things:
Note that for some/most of these, that may be difficult to do accurately, since the size of the entries is unknown. A rough estimate is probably okay.
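
To make the "rough estimate" idea concrete, here is a minimal sketch of one way a log length could be derived from a memory budget: measure a few sample entries, average their approximate size, and divide the budget by that average. The function names, the clamping bounds, and the shape of the sample entry are all hypothetical; nothing here is taken from the scheduler's actual code.

```python
import sys
from collections import deque


def rough_sizeof(obj, _seen=None):
    """Approximate the deep size in bytes of nested builtin containers."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:  # do not double count objects shared within one entry
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(
            rough_sizeof(k, seen) + rough_sizeof(v, seen) for k, v in obj.items()
        )
    elif isinstance(obj, (list, tuple, set, frozenset, deque)):
        size += sum(rough_sizeof(item, seen) for item in obj)
    return size


def estimate_log_length(
    sample_entries, memory_budget_bytes, min_length=100, max_length=100_000
):
    """Pick a log length so that length * average entry size fits the budget.

    The result is clamped to [min_length, max_length]; both bounds are
    arbitrary assumptions for this sketch.
    """
    if not sample_entries:
        return min_length
    avg = sum(rough_sizeof(e) for e in sample_entries) / len(sample_entries)
    length = int(memory_budget_bytes // max(avg, 1))
    return max(min_length, min(max_length, length))


# Hypothetical entry shape, loosely modelled on a transition record.
sample = [("task-%d" % i, "waiting", "processing", {}, 1234.5) for i in range(10)]
length = estimate_log_length(sample, memory_budget_bytes=50 * 2**20)  # 50 MiB
```

Because each entry is measured independently, objects shared between entries are counted more than once, so the estimate errs toward shorter logs. That seems acceptable for a rough bound.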