Prof. Marek Gagolewski
My research interests include data science, machine learning, data aggregation and clustering, computational and applied statistics, and mathematical modelling (the science of science, sport, economics, social sciences, psychometrics, bibliometrics, etc.). In my spare time, I write books for my students and develop open-source data analysis software.
- Deep R Programming (HTML) (PDF) (paper copy) (GitHub)
- Minimalist Data Wrangling in Python (HTML) (PDF) (paper copy) (GitHub)
- genieclust – Fast and robust hierarchical clustering with noise point detection (GitHub) (PyPI) (paper)
- clustering-benchmarks – A framework for benchmarking clustering algorithms (GitHub) (PyPI) (paper)
- stringi – Fast and portable character string processing in R (one of the most often downloaded packages for R) (GitHub) (CRAN) (paper)
- genieclust – Fast and robust hierarchical clustering with noise point detection (GitHub) (CRAN) (paper)
- stringx – Drop-in replacements for base R string functions powered by stringi (GitHub) (CRAN)
- realtest – Where expectations meet reality: Realistic unit testing in R (GitHub) (CRAN)
- TurtleGraphics – Learn computer programming in R while having a jolly time! (GitHub) (CRAN)
- Clustering benchmarks (framework, datasets, results)
- Datasets for teaching