- About YALLA
- What's the point
- What's inside
- Should I use these implementations in production
- How to build & test
Yet Another Lame Library of Algorithms* is an ever-growing collection of algorithms and data structures used in machine learning and large-scale data processing, implemented from scratch for fun and profit.
* also see https://www.urbandictionary.com/define.php?term=yalla
As any field progresses, new tools and libraries are created to make our lives easier. They help us turn ideas into working products faster and with better quality. They achieve this thanks to an ever-increasing level of abstraction. Unfortunately, a high level of abstraction comes at a cost: it becomes easy to miss, or lose one's grasp of, the inner workings of particular tools, models and algorithms.
This repository is an attempt to keep myself sharp and my feet firmly on the ground - to consolidate my understanding of the intuition behind these algorithms and encode it in the form of plain source code with minimal dependencies.
To implement is to understand.
Algorithms and data structures implemented in YALLA are divided into packages and submodules aligned with their real-life domains and applications. Each comes with:
- a (hopefully) clear and informative implementation
- short documentation
- a small suite of sanity tests, and
- (optionally) an accompanying Jupyter notebook
Currently the following topics are covered:
- Bloom Filter
- Cuckoo Filter
- HyperLogLog
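To give a flavor of the first topic, here is a minimal sketch of a Bloom filter. It is not taken from YALLA: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the choice of salted SHA-256 as the hash family are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array probed by several hash functions.

    Hypothetical illustration only, not YALLA's implementation.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive one bit position per hash function by salting SHA-256
        # with the hash function's index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every bit the item hashes to.
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


bloom = BloomFilter()
for word in ["bloom", "cuckoo", "hyperloglog"]:
    bloom.add(word)

print("bloom" in bloom)  # True: an added item is never reported as absent
```

A membership test on an item that was never added usually returns `False`, but it can return `True` when other items happen to have set the same bits. That false-positive rate is the price paid for the compact representation.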
Should I use these implementations in production? If you like to see the world burn, then yes. Otherwise, caution is advised.
The source code of algorithms and data structures in this repository is not optimized for efficiency or accuracy. It is intentionally as pure and pseudo-code-like as possible. It may also ignore certain corner cases for the sake of simplicity. The primary purpose of this library is to serve as a repository of reference implementations that aid and maintain understanding.
- (Re)create environment:
source recreate_environment.sh
- Build, install & test:
python setup.py install test