[tabular] Add memory check prior to merging out-of-fold predictions / fitting WE. #4350
Open
Description
On large multiclass datasets such as dionis with `C=355` classes, out-of-memory errors can occur in the following cases:
- Fitting stacker models with N base models.
- Fitting weighted ensembles with N base models.
In both cases this results in `X` gaining an extra `N*C` features, i.e. an extra `N*C*4` bytes for each of the `S` samples (4 bytes because `float32` is used). This can lead to cases where AutoGluon works fine at lower time limits, but at higher time limits it fits more base models and then hits an out-of-memory error due to too many features in `X` in the stack layers.
For example, given `S=1,000,000` rows of data and `C=355` classes, each base model would increase the memory usage of `X` by 1.42 GB.
On a system with 32 GB of memory, if 25 base models were used, this would require 35.5 GB of memory, leading to an out-of-memory error.
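To make the arithmetic concrete, here is a minimal sketch of the expected-usage calculation described above (the function name and arguments are illustrative, not existing AutoGluon code):

```python
FLOAT32_BYTES = 4  # out-of-fold predictions are stored as float32

def oof_feature_memory_bytes(n_rows: int, n_classes: int, n_base_models: int = 1) -> int:
    """Extra memory added to X by the OOF predictions of n_base_models (N*S*C*4 bytes)."""
    return n_rows * n_classes * FLOAT32_BYTES * n_base_models

# The dionis-sized example from above: S=1,000,000 rows, C=355 classes
print(oof_feature_memory_bytes(1_000_000, 355) / 1e9)      # 1.42 GB per base model
print(oof_feature_memory_bytes(1_000_000, 355, 25) / 1e9)  # 35.5 GB for 25 base models
```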
- Logic should be added that computes this expected memory usage via `S*C*4` bytes.
- Next, the logic would define `N`, the number of base models allowed for memory safety (the check could be something like `N*S*C*4 + X_og_mem_usage < 25% of total memory`).
- Then, base models can be sorted by validation score, and the top `N` are used, with the rest discarded.
- The logic could mimic the existing logic in `AbstractTrainer` when `infer_limit` is specified (see the sketch after this list).
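A rough sketch of what this check could look like, assuming the 25% budget proposed above. Apart from `psutil.virtual_memory()` (psutil is already used by AutoGluon for resource checks), every name here — `select_memory_safe_base_models`, `val_scores`, `x_og_mem_usage` — is hypothetical rather than an existing AutoGluon internal:

```python
import psutil

FLOAT32_BYTES = 4
MEMORY_BUDGET_FRAC = 0.25  # N*S*C*4 + X_og_mem_usage must stay under 25% of memory

def select_memory_safe_base_models(models, val_scores, n_rows, n_classes, x_og_mem_usage):
    """Keep the top-scoring base models whose combined OOF-prediction
    features fit within the memory budget; discard the rest.

    models: list of model names (hypothetical input)
    val_scores: dict mapping model name -> validation score (hypothetical input)
    x_og_mem_usage: memory footprint of X before OOF features are added, in bytes
    """
    total_memory = psutil.virtual_memory().total
    per_model_bytes = n_rows * n_classes * FLOAT32_BYTES  # S*C*4 per base model

    # Largest N satisfying N*S*C*4 + X_og_mem_usage < 25% of total memory
    budget = MEMORY_BUDGET_FRAC * total_memory - x_og_mem_usage
    n_allowed = max(int(budget // per_model_bytes), 0)

    # Sort by validation score so the strongest base models are retained
    ranked = sorted(models, key=lambda name: val_scores[name], reverse=True)
    return ranked[:n_allowed]
```

Truncating a score-sorted list keeps the best base models when not all of them fit, matching the selection behavior proposed above.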