GitHub - ChadGueli/bigboost: Dask+XGBoost=Fast+Big

Welcome to Big Boost

Faster training lets a firm see a return on investment (ROI) from its data and models sooner.

To expedite model development and updates, this repo contains distributed (MapReduce-style) data preprocessing and training with XGBoost and Dask in Python.
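To make the MapReduce idea concrete, here is a minimal sketch using only the Python standard library. It is not the notebook's code: the function names (`map_chunk`, `combine`, `chunked_mean_std`, `standardize`) and the toy data are hypothetical, chosen only for illustration. Each chunk is summarized independently (map), the summaries are merged (reduce), and the resulting global statistics are used to standardize every chunk without ever concatenating the data. Dask applies the same pattern to chunked arrays, but schedules the work across workers.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import math


def map_chunk(chunk):
    """Map step: summarize one chunk as (count, sum, sum of squares)."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))


def combine(a, b):
    """Reduce step: merge two partial summaries element-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def chunked_mean_std(chunks):
    """Global mean and population std, computed from per-chunk summaries."""
    # Threads stand in for distributed workers in this toy example.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_chunk, chunks))
    n, total, total_sq = reduce(combine, partials)
    mean = total / n
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)


def standardize(chunks):
    """Rescale each chunk with the global statistics (assumes std > 0)."""
    mean, std = chunked_mean_std(chunks)
    return [[(x - mean) / std for x in chunk] for chunk in chunks]


# Hypothetical data: one feature column split into three uneven chunks.
chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, std = chunked_mean_std(chunks)
scaled = standardize(chunks)
```

Only the three-number summary of each chunk has to travel between workers, which is why this style of preprocessing scales to data that does not fit in memory on one machine.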

In particular, this is a simple repo with a Jupyter notebook explaining how to use Dask, Zarr, and XGBoost together. It also comes with a Docker image running an extremely basic app that makes use of the model. While the Docker image is available here, the Dockerfile and everything else needed to build the image are in this repo.

Check it out: https://github.com/ChadGueli/bigboost/blob/main/bigboost.ipynb

Folders and files (13 commits):

.github/workflows
Dockerfile
LICENSE
README.md
app.py
bigboost.ipynb
chunkdag.png
dockerignore
environment.yml
fake_deploy.py
smallmodel.txt
