ChadGueli/bigboost

Welcome to Big Boost

Faster training lets a firm see a return on investment (ROI) from its data and models sooner.

To speed up model development and updates, this repo contains distributed (MapReduce-style) data preprocessing and training with XGBoost and Dask in Python.
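As a rough illustration of the MapReduce pattern behind the preprocessing, here is a minimal standard-library sketch. It is not the notebook's code and does not use Dask. The function names (`map_partial_stats`, `reduce_stats`, `standardize_chunks`) and the choice of standardization as the preprocessing step are assumptions made for this example.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partial_stats(chunk):
    """Map step: summarize one chunk as (count, sum, sum of squares)."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))


def reduce_stats(a, b):
    """Reduce step: combine two partial summaries element-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def standardize_chunks(chunks):
    """Standardize chunked data using global statistics from a map/reduce pass."""
    # Map: each chunk is summarized independently (here in a thread pool).
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_partial_stats, chunks))

    # Reduce: fold the partial summaries into global totals.
    n, total, total_sq = reduce(reduce_stats, partials)
    mean = total / n
    std = math.sqrt(max(total_sq / n - mean * mean, 0.0))
    scale = std or 1.0  # avoid dividing by zero for constant data

    # Second map: rescale each chunk on its own using the global statistics.
    scaled = [[(x - mean) / scale for x in chunk] for chunk in chunks]
    return scaled, mean, std


if __name__ == "__main__":
    scaled, mean, std = standardize_chunks([[1.0, 2.0], [3.0, 4.0, 5.0]])
    print(mean, std)
```

Each chunk is touched only by per-chunk functions, and only small summaries cross chunk boundaries. This is the property that lets a scheduler such as Dask spread the same work across many workers.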

In particular, this is a simple repo with a Jupyter notebook that explains how to use Dask, Zarr, and XGBoost together. It also comes with a Docker image running a very basic app that uses the model. The Docker image is available here, and the Dockerfile and everything else needed to build the image are in this repo.
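For orientation, the three libraries can fit together roughly as in the sketch below. It is not taken from the notebook. The function name `train_from_zarr`, the Zarr layout (components named `X` and `y`), the local two-worker cluster, and the training parameters are all assumptions for illustration. It requires `dask[distributed]`, `zarr`, and `xgboost` to be installed.

```python
def train_from_zarr(zarr_path, n_rounds=50):
    """Train an XGBoost booster on Dask arrays read from a Zarr store.

    Assumes (hypothetically) that the store holds a 2-D feature array "X"
    and a 1-D binary label array "y".
    """
    # Imported lazily so this module still loads where the packages are absent.
    import dask.array as da
    from dask.distributed import Client, LocalCluster
    from xgboost import dask as dxgb

    with LocalCluster(n_workers=2, threads_per_worker=1) as cluster:
        with Client(cluster) as client:
            # Lazily open the arrays; nothing is loaded into memory yet.
            X = da.from_zarr(zarr_path, component="X")
            y = da.from_zarr(zarr_path, component="y")

            # Keep each row block whole across columns, and align the
            # label chunks with the feature row chunks.
            X = X.rechunk({1: -1})
            y = y.rechunk((X.chunks[0],))

            dtrain = dxgb.DaskDMatrix(client, X, y)
            result = dxgb.train(
                client,
                {"objective": "binary:logistic", "tree_method": "hist"},
                dtrain,
                num_boost_round=n_rounds,
            )
    # dxgb.train returns a dict with the trained "booster" and a "history".
    return result["booster"]
```

Calling `train_from_zarr("data.zarr")`, where `data.zarr` is a placeholder path, would return an `xgboost.Booster` that can be saved and loaded by a serving app.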

Check it out: https://github.com/ChadGueli/bigboost/blob/main/bigboost.ipynb

About

Dask+XGBoost=Fast+Big
