From 18662180ab149a392367ff14f316f6a8f573273b Mon Sep 17 00:00:00 2001 From: Mani Sarkar Date: Thu, 15 Jul 2021 22:57:34 +0100 Subject: [PATCH] Adding new links to various categories and sections: NLP, Things to know, Visualisation, competitions, Python programming related and many others, more updates to follow. --- Programming-in-Python.md | 16 +- Python-Performance.md | 2 +- cloud-devops-infra/README.md | 2 + competitions.md | 13 +- data/README.md | 18 +- data/data-exploratory-analysis.md | 13 ++ data/data-generation.md | 3 + data/data-preparation.md | 2 + data/databases.md | 6 + data/datasets.md | 3 + data/feature-engineering.md | 6 +- ...-analysis-interpretation-explainability.md | 29 +++ details/julia-python-and-r.md | 5 + .../reinforcement-learning.md | 5 +- details/maths-stats-probability.md | 27 ++- details/visualisation.md | 20 ++ .../analysis/01_ComparingResults.ipynb | 2 +- .../analysis/02_EnsembleTribuoResults.ipynb | 2 +- .../03_EnsembleTribuoDeepNettsResults.ipynb | 2 +- ...deepnetts-linear-regression-validation.csv | 200 +++++++++--------- ...ribuo-linear-regression-ada-validation.csv | 200 +++++++++--------- ...ibuo-linear-regression-cart-validation.csv | 200 +++++++++--------- ...ribuo-linear-regression-sgd-validation.csv | 200 +++++++++--------- ...ribuo-linear-regression-xgb-validation.csv | 200 +++++++++--------- natural-language-processing/README.md | 5 + things-to-know.md | 1 + 26 files changed, 668 insertions(+), 514 deletions(-) diff --git a/Programming-in-Python.md b/Programming-in-Python.md index 8fe0af2c..78bf86aa 100644 --- a/Programming-in-Python.md +++ b/Programming-in-Python.md @@ -7,7 +7,7 @@ - [Focussed packages](#focussed-packages) - [Python Wrappers](#python-wrappers) - [Cookie cutter: Python project templates](#cookie-cutter-python-project-templates) -- [Frameworks](#frameworks) +- [Libraries and Frameworks](#libraries-and-frameworks) - [Best practices](#best-practices) - [Testing](#testing) - [Refactoring](#refactoring) @@ 
-59,6 +59,7 @@ - [5 free books for learning Python for DS](https://towardsdatascience.com/5-free-books-for-learning-python-for-data-science-87be443c084) - [7 advanced tricks in pandas for data science](https://www.linkedin.com/posts/towards-data-science_7-advanced-tricks-in-pandas-for-data-science-activity-6655303741224423424-SJtU) - [Sqlite saving numpy serialised into the database](https://github.com/ebmdatalab/openprescribing/tree/master/openprescribing/matrixstore) +- [Beyond the Basic Stuff with Python 2020 PDF Course! Free!](https://sites.google.com/view/beyond-the-basic-stuff-with-py/home) | [Python Books](https://theappsblaster.com/?cat=306) ## Courses @@ -72,6 +73,7 @@ See **Python: Best practices** and **Python: Testing** under [Courses](./courses - [Python for Data Science](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/PythonForDataScience.pdf) - [30 seconds of python](https://github.com/30-seconds/30-seconds-of-python) - [Comprehensive Python cheatsheet](https://www.linkedin.com/posts/ashishpatel2604_comprehensive-python-cheatsheet-activity-6685556002110152704-mInG) +- [Regex symbols](https://regex101.com/) ## Database @@ -121,7 +123,7 @@ See **Python: Best practices** and **Python: Testing** under [Courses](./courses - [For Reproducible Data Science projects](https://cookiecutter.readthedocs.io/en/latest/readme.html#reproducible-science) - [For Data Driven Journalism projects](https://cookiecutter.readthedocs.io/en/latest/readme.html#data-driven-journalism) -## Frameworks +## Libraries and Frameworks - [Rich is a Python library for writing rich text with color and style to the terminal and for displaying advanced content such as tables, markdown, and syntax highlighted code!](https://www.linkedin.com/feed/update/urn:li:activity:6695712483468017664/) - [Python for MicroControllers](https://micropython.org/) @@ -156,6 +158,10 @@ with nothing but Python - [🗽 Gradio python library : 
Hassle-Free Sharing and Testing of ML Models in the Wild](https://www.linkedin.com/posts/ashishpatel2604_machinelearning-gui-python-activity-6691757766748504064-sPCX) - [The Python scientific stack, compiled to WebAssembly.](https://alpha.iodide.io/) [GitHub](https://github.com/iodide-project/pyodide) - [A simple video that explains in a very simple way how you can use joblib to speed up almost any function](https://www.youtube.com/watch?v=Ny3O4VpACkc) +- [pyforest: feel the bliss of automated imports](https://pypi.org/project/pyforest/) +- [How to be Pythonic? Design a Query Language in Python](https://dev.to/terminusdb/extending-prolog-terminusdb-discussion-10-3p90) +- [prython](http://www.prython.com) - a novel IDE for Python and R, which can also be combined in one workflow! It allows you to put your code inside panels that you can connect and run. It's like Jupyter Notebook but with the possibility of multiple streams +- [Syntax Trees and Python - Automated Code Transformations - PyCon 2019](https://www.youtube.com/watch?v=viNzD1zD-Fg) ## Best practices @@ -177,6 +183,9 @@ with nothing but Python - [Code Craft : Part III – Unit Tests are an Early Warning System for Programmers](https://codemanship.wordpress.com/2019/10/04/code-craft-part-iii-unit-tests-are-an-early-warning-system-for-programmers/) - ["Stop writing classes"](https://www.youtube.com/watch?v=o9pEzgHorH0) - [How to package Python apps with BeeWare 
Briefcase](https://www.infoworld.com/article/3570295/how-to-package-python-apps-with-beeware-briefcase.html?utm_medium=email&utm_source=topic+optin&utm_campaign=awareness&utm_content=20200815+prog+nl&mkt_tok=eyJpIjoiWkRjd09XTmhZV0ppTnpBeSIsInQiOiJNTkNUTmpJZlB0REdcL2E0b3VBVlZKTlhHTCtuckZEQ25rREpIc3VtakFmdFB6UlZhUFIrMnNlaERrOXpmWFAzNUpYWVBXNEZXZWVaRmtjTDRURFY5ZlJWNHF0N2YwR01hUmlYaFQwd052a2pycjRZaWdReG16OEVYRmRZbTVOOGkifQ%3D%3D) +- [Teaching Clean Code](https://ceur-ws.org/Vol-2066/isee2018paper06.pdf) +- [Code Process Metrics in University Programming Education](https://ceur-ws.org/Vol-2308/isee2019paper05.pdf) (paper with Adam Thornhill) +- [Remote Mob Programming www.remotemobprogramming.org (also on Amazon and Leanpub)](https://java.by-comparison.com/favor-constructor-over-field-injection.html) ## Versioning @@ -208,6 +217,9 @@ See [Machine Learning Testing](./details/julia-python-and-r.md#testing) - [Learning Python with PyCharm: Refactoring](https://www.lynda.com/Python-tutorials/Refactoring/590828/629432-4.html) - [What refactoring tools do you use for Python?](https://stackoverflow.com/questions/28796/what-refactoring-tools-do-you-use-for-python) - [Bowler: Safe code refactoring for modern Python projects](https://github.com/facebookincubator/Bowler) - Bowler is a refactoring tool for manipulating Python at the syntax tree level. It enables safe, large scale code modifications while guaranteeing that the resulting code compiles and runs. +- [Beautiful Python Refactoring](https://www.youtube.com/watch?v=KTIl1MugsSY) +- [Transforming Code into Beautiful, Idiomatic Python](https://www.youtube.com/watch?v=OSGv2VnC0go) +- [Professional Code Refactor! 
(Cleaning Python Code & Rewriting it to use Classes)](https://www.youtube.com/watch?v=731LoaZCUjo) ## Performance diff --git a/Python-Performance.md b/Python-Performance.md index b39f4013..deb0b244 100644 --- a/Python-Performance.md +++ b/Python-Performance.md @@ -82,7 +82,7 @@ - [perfplot](https://awesomeopensource.com/project/nschloe/perfplot?categoryPage=26) | [github](https://github.com/nschloe/perfplot) - [Opytimizer โ€ข A Nature-Inspired Python Optimizer. Did you ever reach a bottleneck in your computational experiments? ](https://www.linkedin.com/posts/philipvollet_python-python3-tensorflow-activity-6693021973813055488-5Z29) - [How the CPython compiler works](https://news.ycombinator.com/item?id=24565499) -- High Performance Python talk by [Ian Oszvald](https://twitter.com/ianozsvald/): Blogs: [1](https://ianozsvald.com/2019/11/16/higher-performance-python-at-pydatacambridge-2019/) o [2](https://ianozsvald.com/2019/11/22/higher-performance-python-odsc-2019/) | [Slides](https://speakerdeck.com/ianozsvald/higher-performance-python-odsc-2019) | [Useful resources shared](https://twitter.com/DataChaz/status/1197608275606413312) +- High Performance Python talk by [Ian Oszvald](https://twitter.com/ianozsvald/): Blogs: [1](https://ianozsvald.com/2019/11/16/higher-performance-python-at-pydatacambridge-2019/) o [2](https://ianozsvald.com/2019/11/22/higher-performance-python-odsc-2019/) | [Slides](https://speakerdeck.com/ianozsvald/higher-performance-python-odsc-2019) | [Useful resources shared](https://twitter.com/DataChaz/status/1197608275606413312) | [Python Performance 2nd Edition git repo](https://github.com/mynameisfiber/high_performance_python_2e) - [Making Pandas Fly (EuroPython 2020)](https://speakerdeck.com/ianozsvald/making-pandas-fly-europython-2020) | [Blog](https://ianozsvald.com/2020/07/24/making-pandas-fly-at-europython-2020/) - [Making Pandas Fly (PyDataAmsterdam 2020)](https://speakerdeck.com/ianozsvald/making-pandas-fly-pydataamsterdam-2020) | 
[Blog](https://ianozsvald.com/2020/06/23/making-pandas-fly-for-pydataamsterdam-2020/) - [Making Pandas Fly (PyDataUK 2020)](https://speakerdeck.com/ianozsvald/pydatauk-making-pandas-fly) | [Blog](https://ianozsvald.com/2020/04/27/flying-pandas-and-making-pandas-fly-virtual-talks-this-weekend-on-faster-data-processing-with-pandas-modin-dask-and-vaex/) diff --git a/cloud-devops-infra/README.md b/cloud-devops-infra/README.md index c4c66640..5a4aea59 100644 --- a/cloud-devops-infra/README.md +++ b/cloud-devops-infra/README.md @@ -56,6 +56,8 @@ reproducible research - [Workshop: Large Scale Deep Learning Recommender](https://bit.ly/RE_streaming) - [Reality Engines Demo](https://github.com/jsutch/RealityEngines-Demo) - [Accelerating AI Training with MLPerf Containers and Models from NVIDIA NGC](https://developer.nvidia.com/blog/accelerating-ai-training-with-mlperf-containers-and-models-from-ngc/?ncid=so-elev-58408#cid=ngc01_so-elev_en-us&_lrsc=3642f913-311f-45b0-bbc5-158e51446637&ncid=so-lin-lt-798) + - Running AI Models in the Cloud: [site](https://www.scailable.net/) | [video](https://youtu.be/PDXaDTnAN2M?t=2570) | [Docs](https://docs.sclbl.net/sclblpy) + | [Getting started](https://github.com/scailable/sclbl-tutorials/tree/master/sclbl-101-getting-started) ## Tools diff --git a/competitions.md b/competitions.md index 0a46e035..c9b26b38 100644 --- a/competitions.md +++ b/competitions.md @@ -45,13 +45,22 @@ - [Hacker Rank](https://lnkd.in/gEufBUu) - [Codeacademy](https://lnkd.in/gGQ7cuv) - [LeetCode](https://leetcode.com/) +- [Codechef Competitive Programming: Problem statements and solutions provided by people on the codechef site](https://www.kaggle.com/arjoonn/codechef-competitive-programming) ## Resources - [Kaggle Kernels Guide for Beginners โ€” Step by Step Tutorial](https://towardsdatascience.com/kaggle-kernels-for-beginners-a-step-by-step-guide-3db6b1cd7606) | [Best Data Scientists on Kaggle from 2011-2020](https://www.youtube.com/watch?v=guLZ_2WcEqM) | [What 
Kaggle has learned from almost a million data scientists - Anthony Goldbloom (Kaggle)](https://www.youtube.com/watch?v=jmHbS8z57yI) - [Getting ‘More’ out of your Kaggle Notebooks.](https://www.linkedin.com/posts/parulpandeyindia_getting-more-out-of-your-kaggle-notebooks-activity-6703281576970592256-rGPg) - [Tackling any Kaggle Competition : The Noob's Way](https://www.kaggle.com/tanulsingh077/tackling-any-kaggle-competition-the-noob-s-way) | [Mr_KnowNothing-s-Weekends](https://github.com/tanulsingh/Mr_KnowNothing-s-Weekends) - +- Kaggle-related blogs (plus links to kernels) by https://tkravichandran.github.io/ + - [Top 10% solution: detailed version on the author's blog](https://tkravichandran.github.io/my-fast-ds-blog/first-tabular-kaggle-competition.html) + - [Top 10% solution: short version for Kaggle](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220874) + - [What works in feature engineering](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220878) + - [How to know if you are actually overfitting](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220877) + - [What can Adversarial Validation do for you?](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220876) + - All cross-posted from [the author's DS blog](https://tkravichandran.github.io/my-fast-ds-blog/). The Fraud Detection Competition page is [here](https://www.kaggle.com/c/ieee-fraud-detection/overview) + - https://www.kaggle.com/c/ieee-fraud-detection/discussion/111510 + - https://www.kaggle.com/cdeotte/xgb-fraud-with-magic-0-9600 - RAPIDS - [RAPIDS in Kaggle competition](https://www.kaggle.com/cdeotte/rapids/) [LinkedIn](https://www.linkedin.com/posts/miguelusque_kaggle-rapids-gpu-activity-6628421575299383297-Ifuu) - [Here is the first ever successful implementation of NVIDIA #rapids library in a Kaggle kernel. 
It achieves 600X speedup of the kNN as compared to #sklearn](https://www.kaggle.com/cdeotte/rapids-gpu-knn-mnist-0-97) [LinkedIn](https://www.linkedin.com/posts/tunguz_rapids-sklearn-ml-activity-6626833143032885248-XQA6) @@ -61,6 +70,8 @@ - [Tips N Tricks #3: Creating a clean inference kernel/notebook on Kaggle](https://www.youtube.com/watch?v=C7Tsfrq_g18) - [Interview with Abhishek Thakur | World's First Triple Grandmaster | Kaggle](https://www.youtube.com/watch?v=8lniZVqRLA0) - [My journey to 4x GM on Kaggle](https://www.youtube.com/watch?v=z15TKkAPNUM) +- [Grandmaster Series โ€“ How to Build a World-Class ML Model for Melanoma Detection](https://www.youtube.com/watch?v=L1QKTPb6V_I) +- [Grandmasters Series - How to Perform Large-Scale Image Classification](https://www.youtube.com/watch?v=VxNDH6qLZ_Q) - Also see [NVIDIA's RAPIDS](./cloud-devops-infra/gpus/rapids.md#rapids) # Contributing diff --git a/data/README.md b/data/README.md index d16b5c4e..c35bbacc 100644 --- a/data/README.md +++ b/data/README.md @@ -16,9 +16,10 @@ The question to ask ourselves: _Do we know our data...?_ + [Data Cleaning](./data-preparation.md#data-cleaning) + [Data preprocessing / Data Wrangling](./data-preparation.md#data-preprocessing--data-wrangling) - [Data Generation](./README.md#data-generation) -- [Feature Selection](./README.md#feature-selection) +- [Feature Extraction](./README.md#feature-extraction) - [Feature Importance](./README.md#feature-importance) - [Feature Engineering](./README.md#feature-engineering) +- [Feature Selection](./README.md#feature-selection) - [Hyperparameter tuning](#hyperparameter-tuning) - [Post model-creation analysis, ML interpretation/explainability](./README.md#post-model-creation-analysis-ml-interpretationexplainability) - [Model deployment](./README.md#model-deployment) @@ -65,6 +66,7 @@ See [Ethics / altruistic motives](../README-details.md#ethics--altruistic-motive - [Data Exploration and API First Design: Deep Learning Hands-On Series with Eric 
Schles](https://gist.github.com/lidderupk/f6562beadd39406a033c738201f46c12) - [Augmented Analytics Engine](https://www.linkedin.com/posts/data-science-central_augmented-analytics-engine-activity-6648764149864153088-dZWX) - [Putting an end to Unreliable Analytics by David Yaffe](https://www.linkedin.com/posts/towards-data-science_putting-an-end-to-unreliable-analytics-activity-6717020155261587456-0hyA) +- The Fundamentals of end-to-end Data Strategy: [video](https://www.youtube.com/watch?v=hAE12zICkLI&feature=youtu.be) | [slides](https://drive.google.com/drive/folders/1LV_gP1muLbbXesJISrTqebpyrpRfTjbq?usp=sharing) | [Resources](http://nicolejaneway.com/data-strategy/resources/) | [Feedback](https://docs.google.com/forms/d/e/1FAIpQLSfZsLIIdFJSS_fRwzj_trDF_iM6-EnfPT329GfCj-tPNr_DJA/viewform) ## Datasets and sources of raw data @@ -100,9 +102,10 @@ See [Data Exploratory Analysis](./data-exploratory-analysis.md) See [Data Generation](./data-generation.md#data-generation) -## Feature Selection +## Feature Extraction -See [Feature Selection](./feature-selection.md) +- [Hierarchical Feature Extraction for Compact Representation and Classification of Datasets](doc.ml.tu-berlin.de/publications/publications/SchKoh08.pdf) +- [Guide to Feature Extraction Approaches for Text Data](https://rumankhan1.medium.com/guide-to-feature-extraction-approaches-for-text-data-1ebdcc4b9834) ## Feature Importance @@ -116,16 +119,25 @@ See [Feature Selection](./feature-selection.md) - [The 4 types of additive Feature Importances](https://twitter.com/TDataScience/status/1264958410405171202) - [The Math of Random Forests and Feature Importance in Scikit-learn and Spark](https://www.linkedin.com/posts/data-science-central_the-math-of-decision-trees-random-forest-activity-6656775689431240705-kwf_) - Path Explain - toolkit for feature attributions: [GitHub](https://github.com/suinleelab/path_explain) | [PyPI](https://pypi.org/project/path-explain/) | [Path Explain on 
MWML](https://madewithml.com/projects/1931/path-explain/) +- [Open Machine Learning Course: Feature Importance](https://mlcourse.ai/articles/topic5-part3-feature-importance/) ## Feature engineering See [Feature engineering](./feature-engineering.md) +## Feature Selection + +See [Feature Selection](./feature-selection.md) + ## Hyperparameter tuning - [Ray Tune Sweeps](https://docs.wandb.com/sweeps/ray-tune) - [W&B Sweeps](https://docs.wandb.com/sweeps) - [Automated Machine Learning Hyperparameter Tuning in Python](https://www.linkedin.com/posts/vincentg_automated-machine-learning-hyperparameter-activity-6693176296077348864-7Ihf) +- Bayesian hyperparameter optimisation by Akinkunle: [Original Notebook](https://colab.research.google.com/drive/1akyJnd7O-lqA5I8mbR2gDG_VsjOBSp05?usp=sharing) | [Saved Notebook](https://colab.research.google.com/drive/1Nic2155ulaYDRrGUPDKNhY49j2xrZ99B) | [Slides](https://www.dropbox.com/sh/q0v0k3ida37thyn/AAB6wXMge7C6fvqKIZmGFXVQa?dl=0) +- [Hyperparameter optimization for Neural Networks](http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html#id24) +- [Tune Hyperparameters Easily with W&B Sweeps](https://www.youtube.com/watch?v=9zrmUIlScdY) + ## Post model-creation analysis, ML interpretation/explainability - [Pruning: DL models](https://www.subhadityamukherjee.me/2020/09/25/Pruning.html) diff --git a/data/data-exploratory-analysis.md b/data/data-exploratory-analysis.md index 03e04c69..f0d2c02e 100644 --- a/data/data-exploratory-analysis.md +++ b/data/data-exploratory-analysis.md @@ -24,11 +24,15 @@ aka *_Exploratory Data Analysis_* - [Data Analysis Method: Mathematics Optimization to Build Decision Making](https://www.linkedin.com/posts/data-science-central_data-analysis-method-mathematics-optimization-activity-6661023930838503425-bJtJ) - Dataprep.eda: Accelerate your EDA: [original post](https://www.linkedin.com/posts/towards-data-science_dataprepeda-accelerate-your-eda-activity-6655814355810168832-RTPQ) | 
[blog](https://towardsdatascience.com/dataprep-eda-accelerate-your-eda-eb845a4088bc) - [Having a bunch of data and no idea what's in? What is the best tool for a fast Exploratory Data Analysis? Check this EDA Comparison: Pandas Profiling, Sweetviz, and PandasGUI](https://www.linkedin.com/posts/philipvollet_data-analytics-bigdata-activity-6726856744833835008-NnEY) +- [Pandas is great for most day-to-day data analysis](https://github.com/chiphuyen/just-pandas-things) +- [The Exploratory Data Analysis (EDA) lesson is out for Made With ML's Applied ML in Production course!](https://www.linkedin.com/posts/goku_exploratory-data-analysis-applied-ml-in-activity-6734809864322871296-FDyJ) + ### Tools - [Pandas Profiling](https://pandas-profiling.github.io/pandas-profiling/) - [Dabl: Data Analysis Baseline Library (Pandas profiling like tool)](https://dabl.github.io/dev/) | [GitHub](https://github.com/dabl/dabl) +- [DataProfiler](https://github.com/capitalone/DataProfiler) - [Bamboolib](./bamboolib.md) - [ppscore - a Python implementation of the Predictive Power Score (PPS)](https://pypi.org/project/ppscore/) - from the makers of [Bamboolib](./bamboolib.md) - [CleverCSV](https://github.com/alan-turing-institute/CleverCSV) @@ -39,6 +43,11 @@ aka *_Exploratory Data Analysis_* - [Great expectations (profiling data)](https://github.com/great-expectations/great_expectations) - [DTale: Web Client for Visualizing Pandas Objects](https://pypi.org/project/dtale/) | [LinkedIn post](https://www.linkedin.com/posts/philipvollet_datascience-python-jupyter-activity-6729298029536514048-F4Ad) - Intake: Intake is a lightweight set of tools for loading and sharing data in data science projects. [GitHub](https://github.com/intake/intake) | [Docs](https://intake.readthedocs.io/en/latest/) +- [D-Tale your GUI for pandas dataframes. It's like Excel for pandas! 
Your new tool for super fast Exploratory Data Analysis (EDA)](https://www.linkedin.com/posts/philipvollet_datascience-python-jupyter-activity-6729298029536514048-F4Ad) +- [A miscellaneous repo where @clone95 put my studies on time series data analysis and forecasting](https://github.com/clone95/time-series-misc) +- [If you are starting with machine learning / deep learning and get a new dataset to work on, there are a few things you must always take care of](https://www.linkedin.com/posts/abhi1thakur_if-you-are-starting-with-machine-learning-activity-6766721312451772416-RC6y) +- [Bayesian Data Analysis](https://avehtari.github.io/BDA_course_Aalto/) + - See [Data > Programs and Tools](./programs-and-tools.md#programs-and-tools) and [Things to know: Primary tools to analyse data](../things-to-know.md#primary-tools-to-analyse-data) ### Missing values @@ -76,6 +85,10 @@ aka *_Exploratory Data Analysis_* - [10 Clustering Algorithms With Python](https://machinelearningmastery.com/clustering-algorithms-with-python/) - [An Introduction to Clustering and different methods of clustering](https://www.linkedin.com/posts/data-science-central_an-introduction-to-clustering-and-different-activity-6657823846013419520-3o5Y) - [Scale-Invariant Clustering and Regression](https://www.linkedin.com/posts/data-science-central_scale-invariant-clustering-and-regression-activity-6657477059243229184-1ibv) +- Hands-on ML with Python: Clustering, Dim Reduction, Time Series Analysis: + [GitHub](https://resources.oreilly.com/binderhub/machine-learning-with-python-clustering) | [Jupyter notebook](https://learning.oreilly.com/jupyter-notebooks/hands-on-machine-learning/9781492063179/) +- [Disadvantages of KMeans Clustering](https://www.inovex.de/blog/disadvantages-of-k-means-clustering/) +- Clustering workshop on Uni. 
of Waterloo Discord channel: [Slideshow](https://bit.ly/3bDy1SW) | [Google Doc](https://docs.google.com/presentation/d/1bXvU-IImwZyNGeGhXysX0q-pMrKfG7Esj7ZhhEY2d88/edit#slide=id.gc6d1cf1e32_0_147) | [Notebook](https://bit.ly/3bIaWP3) ([Colab](https://colab.research.google.com/drive/11Gb-6M8DZNNp04zfyKI3thoZQWm2_1Dk#scrollTo=RqFP72NgNAq7)) | [Video](https://youtu.be/127zPeHsFpU) ### Outliers diff --git a/data/data-generation.md b/data/data-generation.md index ddb07497..7afaeb59 100644 --- a/data/data-generation.md +++ b/data/data-generation.md @@ -5,6 +5,9 @@ - [Website to generate synthetic data](https://www.mockaroo.com) - [Synthetic data generation - a must-have skill for new data scientists](https://towardsdatascience.com/synthetic-data-generation-a-must-have-skill-for-new-data-scientists-915896c0c1ae) - [Surprising Uses of Synthetic Random Data Sets](https://www.linkedin.com/posts/data-science-central_surprising-uses-of-synthetic-random-data-activity-6612404601515765760-J0AY) +- [Synthetic Data Vault (SDV): A Python Library to generate complex datasets using statistical & machine learning models](https://www.linkedin.com/posts/philipvollet_machinelearning-datascience-deeplearning-activity-6737635714676219904-CKE0) +- [zpy: Synthetic data in Blender • Collecting, labeling, and cleaning data](https://www.linkedin.com/feed/update/urn:li:activity:6787600161246969856/) +- Synthetic Data for Deep Learning: Sergey I. 
Nikolenko: [Post 1](https://lnkd.in/dc8DtSw) | [Post 2](https://www.linkedin.com/posts/montrealai_deeplearning-generativemodels-machinelearning-activity-6685175468045418496-XHVp) - [Python Random Data Generation](https://honingds.com/blog/python-random/) - [Random Number Generation and Sampling Methods](https://www.codeproject.com/Articles/1190459/Random-Number-Generation-and-Sampling-Methods) - [How to Generate Test Datasets in Python with scikit-learn](https://machinelearningmastery.com/generate-test-datasets-python-scikit-learn/) diff --git a/data/data-preparation.md b/data/data-preparation.md index ecba4520..e9a347b6 100644 --- a/data/data-preparation.md +++ b/data/data-preparation.md @@ -21,6 +21,7 @@ - [Working with missing data](https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html) - [Missing values in a dataset](https://www.datasciencecentral.com/profiles/blogs/how-to-treat-missing-values-in-your-data-1) - [Iterative Imputation for Missing Values in Machine Learning](https://machinelearningmastery.com/iterative-imputation-for-missing-values-in-machine-learning/) +- [MissForest: The Best Missing Data Imputation Algorithm?](https://towardsdatascience.com/missforest-the-best-missing-data-imputation-algorithm-4d01182aed3) ### Imbalanced data @@ -44,6 +45,7 @@ - [7 Pandas Functions to Reduce Your Data Manipulation Stress by Andre Ye](https://towardsdatascience.com/7-pandas-functions-to-reduce-your-data-manipulation-stress-25981e44cc7d) [LinkedIn Post](https://www.linkedin.com/posts/towards-data-science_7-pandas-functions-to-reduce-your-data-manipulation-activity-6655006784069214208-R9Zn) - [Data Prepossessing](https://www.linkedin.com/posts/nabihbawazir_datascience-machinelearning-aertificialintellegence-activity-6657362083665018880-2rzc) - [How to Grid Search Data Preparation Techniques](https://machinelearningmastery.com/grid-search-data-preparation-techniques/) +- [How to Choose Data Preparation Methods for Machine 
Learning](https://machinelearningmastery.com/choose-data-preparation-methods-for-machine-learning/) ### Transformations diff --git a/data/databases.md b/data/databases.md index 9335688b..e7742c42 100644 --- a/data/databases.md +++ b/data/databases.md @@ -38,6 +38,12 @@ - [Auto-Generated KG](https://www.linkedin.com/posts/bo-li-8503b896_auto-generated-knowledge-graphs-activity-6637543428051828736-jVdT) - [Graph Convolutional Neural Networks for Molecule Generation | NTU Graph Deep Learning Lab](https://www.linkedin.com/posts/eric-feuilleaubois-ph-d-43ab0925_graph-convolutional-neural-networks-for-molecule-activity-6640244313009737728-IdCP) +## Tools, packages and frameworks +- [PyTorch Geometric Temporal is a temporal (dynamic) extension library for PyTorch Geometric](https://github.com/benedekrozemberczki/pytorch_geometric_temporal) +- [Design Space for Graph Neural Networks](https://www.linkedin.com/posts/philipvollet_nlp-pytorch-datascience-activity-6734932885041762304-ohkW) +- [A new SOTA open source semantic annotator ‘bbw’ for tabular data with the Wikidata knowledge graph](https://www.linkedin.com/posts/philipvollet_datascience-machinelearning-wikipedia-activity-6738004505474035712-mxpr) +- [GraphScope is a unified distributed graph computing platform](https://www.linkedin.com/posts/philipvollet_datascience-analytics-bigdata-activity-6793219722117820417-vhSv) +- [Tweeki - Linking Named Entities on Twitter to a Knowledge Graph](https://www.linkedin.com/posts/philipvollet_twitter-data-datascience-activity-6733786978178990080-ZiD3) ## Misc. 
- [Difference between JOIN and UNION in SQL](https://www.geeksforgeeks.org/difference-between-join-and-union-in-sql/) diff --git a/data/datasets.md b/data/datasets.md index dd970ecb..dfec501f 100644 --- a/data/datasets.md +++ b/data/datasets.md @@ -56,6 +56,9 @@ - [Open Graph Benchmark: Datasets for Machine Learning on Graphs -](https://www.linkedin.com/posts/philipvollet_machinelearning-datascience-analytics-activity-6715867835287109633-Y_MN) - [PaySim on kaggle for Money Laundering dataset(s)](https://www.kaggle.com/ntnu-testimon/paysim1) - [Sklearn provides direct access to openml datasets which hosts around 20,000 datasets and you can access it directly in your python code](https://lnkd.in/g-YYFay) [LinkedIn Post](https://www.linkedin.com/posts/srivatsan-srinivasan-b8131b_datascience-machinelearning-ml-activity-6653512803644768256-w1mM) +- [CNN news Dataset](https://www.tensorflow.org/datasets/catalog/cnn_dailymail) +- [Off-the-Shelf Datasets (licensable datasets to jumpstart your AI projects) (Commercial)](https://appen.com/off-the-shelf-datasets/?utm_source=Web&utm_medium=eblast&mkt_tok=eyJpIjoiWldaak9EUTBObU5oTm1aaiIsInQiOiJKdWlEQVhqclNlcUpNWVhGVW5GT2p2aFpRRjVlZkUyOGZjVHhYSHpsUnNuZkhGVG5rNCtTdm92REdTOXhqTlc0RW1jUlFObnpmM3RaQ3pOZ3RBQ3ArWGZ3RFc4Mk1lRk5FS3d1YklYSnFXeWJROHE2ek9pNW5nU3pCa0gxeExPRCJ9) + ## Courses diff --git a/data/feature-engineering.md b/data/feature-engineering.md index 6ca2b4f0..7e09c536 100644 --- a/data/feature-engineering.md +++ b/data/feature-engineering.md @@ -1,10 +1,11 @@ # Feature Engineering -## General +## General + - [What works in feature engineering](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220878) + [Basic Feature Engineering With Time Series Data in Python](http://machinelearningmastery.com/basic-feature-engineering-time-series-data-python/) + [Zillow Prize - EDA, Data Cleaning & Feature Engineering](https://www.kaggle.com/lauracozma/eda-data-cleaning-feature-engineering) + [Feature-wise 
transformations](https://distill.pub/2018/feature-wise-transformations) - + [tsfresh](https://tsfresh.readthedocs.io/en/latest/text/introduction.html) - tsfresh is used to to extract characteristics from time series + + tsfresh: used to extract characteristics from time series: [github](https://github.com/blue-yonder/tsfresh) | [Introduction](https://tsfresh.readthedocs.io/en/latest/text/introduction.html) | [Docs](https://tsfresh.readthedocs.io/en/latest/) + [featuretools](https://github.com/featuretools/featuretools/) - an open source python framework for automated feature engineering + [5 Steps to correctly prepare your data for your machine learning model](https://towardsdatascience.com/5-steps-to-correctly-prep-your-data-for-your-machine-learning-model-c06c24762b73?gi=6b4a6895ab1) + [scikit learn's SelectKBest](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectKBest.html) @@ -23,6 +24,7 @@ - [How to Use Polynomial Feature Transforms for Machine Learning](https://machinelearningmastery.com/polynomial-features-transforms-for-machine-learning/) - [Transforming Quantitative Data to Qualitative Data](https://www.linkedin.com/feed/update/urn:li:activity:6674858845854019584/) - [Feature Engineering and Selection: A Practical Approach for Predictive Models](https://www.feat.engineering/) + - [Feature Engineering for Machine Learning: A Comprehensive Overview](https://trainindata.medium.com/feature-engineering-for-machine-learning-a-comprehensive-overview-a7ad04c896f8) ## Dimensionality Reduction diff --git a/data/model-analysis-interpretation-explainability.md b/data/model-analysis-interpretation-explainability.md index d9d03fb4..d908ef68 100644 --- a/data/model-analysis-interpretation-explainability.md +++ b/data/model-analysis-interpretation-explainability.md @@ -21,7 +21,29 @@ - [What-if-tool on GitHub](https://github.com/PAIR-code/what-if-tool) - [useR2020! 
Keynote: "Responsible Automation: Towards Interpretable & Fair AutoML"](https://github.com/ledell/useR2020-automl) - [Explainable AI by IBM](https://github.com/Trusted-AI/AIX360) | [GitHub](https://github.com/IBM/lale) | [(video, slides, codes) on youtube channel](https://www.youtube.com/channel/UCj09XsAWj-RF9kY4UvBJh_A) | [GitHub](https://github.com/decentdilettante) +SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model. +https://www.linkedin.com/posts/philipvollet_machinelearning-datascience-technology-activity-6732555181315227648-NPD1 +Intro to Explainable AI +https://www.linkedin.com/posts/activity-6679735518781153280-RewD + +Gradio! You can now generate interpretations with one line of code! +https://www.linkedin.com/posts/philipvollet_machinelearning-python-datascience-activity-6730046018248962048-Icgv + +LIME for auditing black-box models +https://towardsdatascience.com/lime-for-auditing-black-box-models-b97d6d2580b4?gi=95ed4978e936 + +Interpretable Machine Learning - Christoph Molnar +https://www.youtube.com/watch?v=0LIACHcxpHU&t=3533s + +AI Ethics, Fairness, Explainability: Q&A and discussion at this session: +code lab: https://github.com/decentdilettante/XAI +10:33:08 https://github.com/Trusted-AI/AIX360 +10:33:19 https://github.com/IBM/lale +10:39:20 ["Conversational Processes and Causal Explanation" by Hilton:](https://pdfs.semanticscholar.org/5093/4979694fb48e55d0cf38888f67b84ad6601b.pdf) + +Tech talk: Explainable anomaly detection +https://www.youtube.com/watch?v=0p8o3uj96Uc&feature=push-u-sub&attr_tag=ccXKOv7Gba4BJCOf%3A6 ## Articles, blog posts, papers, notebooks, books, presentations @@ -69,6 +91,13 @@ - [Continual Learning and Explainable AI through Knowledge Extraction from Deep Networks](https://github.com/neomatrix369/awesome-ai-ml-dl/releases/download/v0.1/aifiancewbs.pdf) by Artur d'Avila Garcez - [Neural-Symbolic Computing: An Effective 
Methodology](https://github.com/neomatrix369/awesome-ai-ml-dl/releases/download/v0.1/artur_davila_garcez_neural-symbolic_computing_an_effective_methodology_1.pdf) by Artur d'Avila Garcez and others - [Trepan Reloaded: A Knowledge-driven approach to Explaining Black-box Models](https://github.com/neomatrix369/awesome-ai-ml-dl/releases/download/v0.1/tillman_paper.pdf) by Roberto Confalonieri and others + - [SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model](https://www.linkedin.com/posts/philipvollet_machinelearning-datascience-technology-activity-6732555181315227648-NPD1) + - [Intro to Explainable AI](https://www.linkedin.com/posts/activity-6679735518781153280-RewD) + - [Gradio! You can now generate interpretations with one line of code!](https://www.linkedin.com/posts/philipvollet_machinelearning-python-datascience-activity-6730046018248962048-Icgv) + - [LIME for auditing black-box models](https://towardsdatascience.com/lime-for-auditing-black-box-models-b97d6d2580b4?gi=95ed4978e936) + - [Interpretable Machine Learning - Christoph Molnar](https://www.youtube.com/watch?v=0LIACHcxpHU&t=3533s) + - AI Ethics, Fairness, Explainability: Q&A and discussion at this session: code lab: [XAI](https://github.com/decentdilettante/XAI) | [Trusted AI](https://github.com/Trusted-AI/AIX360) | [lale](https://github.com/IBM/lale) | ["Conversational Processes and Causal Explanation" by Hilton:](https://pdfs.semanticscholar.org/5093/4979694fb48e55d0cf38888f67b84ad6601b.pdf) + - [Tech talk: Explainable anomaly detection](https://www.youtube.com/watch?v=0p8o3uj96Uc&feature=push-u-sub&attr_tag=ccXKOv7Gba4BJCOf%3A6) ## Calibration diff --git a/details/julia-python-and-r.md b/details/julia-python-and-r.md index f6757d1d..8ee2241c 100644 --- a/details/julia-python-and-r.md +++ b/details/julia-python-and-r.md @@ -126,6 +126,7 @@ ## RNN - [The Unreasonable Effectiveness of Recurrent Neural Networks - Andrej 
Karpathy](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) + - [PsychRNN: An Accessible and Flexible Python Package for Training Recurrent Neural Network Models on Cognitive Tasks](https://www.linkedin.com/posts/philipvollet_bioinformatics-biodata-deeplearning-activity-6748882192618926080-iMUs) ## Natural Language Processing (NLP) @@ -235,6 +236,10 @@ See [Reinforcement Learning](./julia-python-and-r/reinforcement-learning.md) + ## Recommendation Systems + + - Building Recommendation Engines in Python: [Github](https://github.com/maxhumber/BRE) | [Slides](https://on24static.akamaized.net/event/24/06/92/4/rt/1/documents/resourceList1603467800300/presentation.pdf) | [KataCoda: LightFM](https://learning.oreilly.com/scenarios/scikit-learn-build-a/9781492087755/) | [Katacoda - Build an Implicit Feedback Recommendation Engine](https://learning.oreilly.com/scenarios/spotlight-build-an/9781492087748/) | [Katacoda - Scikit Learn](https://learning.oreilly.com/scenarios/lightfm-build-an/9781492087731/) + ## Programming in Python See [Programming in Python](../Programming-in-Python.md) diff --git a/details/julia-python-and-r/reinforcement-learning.md b/details/julia-python-and-r/reinforcement-learning.md index f2dfd325..91192b55 100644 --- a/details/julia-python-and-r/reinforcement-learning.md +++ b/details/julia-python-and-r/reinforcement-learning.md @@ -4,7 +4,7 @@ - [Reinforcement learning resources curated resources](https://github.com/aikorea/awesome-rl) - [Gym: a toolkit for developing and comparing reinforcement learning algorithms](https://github.com/openai/gym) - [Dopamine is a research framework for fast prototyping of Reinforcement Learning algorithms](https://github.com/google/dopamine) - - AlphaZero & Alpha Go + - AlphaZero, LeelaZero & Alpha Go - Introduction to Reinforcement learning (using AlphaZero) by [Piotr Januszewski](https://piojanu.github.io/) - [Slides](https://github.com/piojanu/Planning-in-Deep-Reinforcement-Learning) | [AlphaZero 
implementation](https://github.com/piojanu/AlphaZero) | [Framework for RL research (used in AZ implementation)](https://github.com/piojanu/humblerl) | [Experiment with AlphaGo online](https://alphagoteach.deepmind.com) - [A replica of the AlphaZero methodology for deep reinforcement learning in Python](https://github.com/AppliedDataSciencePartners/DeepReinforcementLearning) - [Mastering the Game of Go without Human Knowledge](http://discovery.ucl.ac.uk/10045895/1/agz_unformatted_nature.pdf) @@ -15,6 +15,9 @@ - [AlphaGo Zero: Starting from scratch](https://deepmind.com/blog/article/alphago-zero-starting-scratch) - [AlphaGo Zero Cheatsheet: AlphaGo Zero Explained In One Diagram](https://medium.com/applied-data-science/alphago-zero-explained-in-one-diagram-365f5abf67e0) | [AlphaGo Zero Cheatsheet](https://adspassets.blob.core.windows.net/website/content/alpha_go_zero_cheat_sheet.png) - [How to build your own AlphaZero AI using Python and Keras](https://medium.com/applied-data-science/how-to-build-your-own-alphazero-ai-using-python-and-keras-7f664945c188) + - [Leela Zero: A Go program with no human provided knowledge. 
Using MCTS (but without Monte Carlo playouts) and a deep residual convolutional neural network stack](https://github.com/leela-zero/leela-zero) + - [Mastering the game of Go without human knowledge](https://www.nature.com/articles/nature24270) + - [Unlike Google DeepMind's AlphaGo, Facebook's Go engine (darkforestGo) is open source](https://github.com/facebookresearch/darkforestGo) - [Reinforcement Learning with DNNs](https://biostat.wisc.edu/~craven/cs760/lectures/AlphaZero.pdf) - [TensorLayer - DL and RL library for Data Scientists](https://github.com/tensorlayer/tensorlayer) | [Docs](https://tensorlayer.readthedocs.io/en/stable/) - [Tutorial: Reinforcement Learning with TensorFlow Agents](https://towardsdatascience.com/reinforcement-learning-with-tensorflow-agents-tutorial-4ac7fa858728?source=social.tw) diff --git a/details/maths-stats-probability.md b/details/maths-stats-probability.md index ce594579..17a41748 100644 --- a/details/maths-stats-probability.md +++ b/details/maths-stats-probability.md @@ -29,6 +29,13 @@ - [The Math of Random Forests and Feature Importance in Scikit-learn and Spark](https://www.linkedin.com/posts/data-science-central_the-math-of-decision-trees-random-forest-activity-6656775689431240705-kwf_) - [Let's Learn Basic Mathematics - Sigma Notations](https://www.linkedin.com/posts/nabihbawazir_statistics-data-datascience-activity-6664146836003127296-wgLS) - [Why Study Linear Algebra?😉](https://www.linkedin.com/posts/iamsivab_linear-algebra-in-4-pages-activity-6673551357896736768-FI05) +- [How to convert data points into an equation?](https://math.stackexchange.com/questions/811849/how-to-convert-data-points-into-an-equation) +- [AI has cracked a key mathematical puzzle for understanding our world](https://www.technologyreview.com/2020/10/30/1011435/ai-fourier-neural-network-cracks-navier-stokes-and-partial-differential-equations/) +- [How the Mathematics of Fractals Can Help Predict Stock 
Markets Shifts](https://www.linkedin.com/posts/vincentg_how-the-mathematics-of-fractals-can-help-activity-6730167289112510464-H3eX) +- [Introduction to Linear Algebra for Applied Machine Learning with Python](https://pabloinsente.github.io/intro-linear-algebra) +- [#AI #fourier on partial differential equations and navier stokes](https://www.technologyreview.com/2020/10/30/1011435/ai-fourier-neural-network-cracks-navier-stokes-and-partial-differential-equations/?) +- [Manim is an engine for precise programmatic animations, designed for creating explanatory math videos](https://github.com/3b1b/manim) +- Hessian matrix approximation: [Khan Academy](https://www.khanacademy.org/math/multivariable-calculus/applications-of-multivariable-derivatives/quadratic-approximations/a/the-hessian) | [Uni. of Buffalo | Chapter 5 Hessian](https://cedar.buffalo.edu/~srihari/CSE574/Chap5/Chap5.4-Hessian.pdf) | [Math Lectures: Hessian Example](https://www.iith.ac.in/~ashok/Maths_Lectures/TutorialB/Hessian_Examples.pdf) ## Statistics @@ -114,6 +121,15 @@ - [Capsule Networks -- A Probabilistic Perspective](https://www.linkedin.com/posts/montrealai_artificialintelligence-capsulenetworks-machinelearning-activity-6657124988568563712-WArH) - Pomegranate: [PyPi](https://pypi.org/project/pomegranate/) | [docs](https://pomegranate.readthedocs.io/en/latest/) | [GitHub](https://github.com/jmschrei/pomegranate) - [A Gentle Introduction to Probability Density Estimation](https://machinelearningmastery.com/probability-density-estimation/) +- [Introduction to Probabilistic programming](https://www.datasciencecentral.com/profiles/blogs/introduction-to-probabilistic-programming) +- [A great book for beginners to understand probability intuitively from scratch](https://scholar.harvard.edu/david-morin/probability) +- [Bayesian Methods in Machine learning](https://www.coursera.org/learn/bayesian-methods-in-machine-learning) +- [What is a Markov Chain and What is Memoryless property 
?](https://www.linkedin.com/feed/update/urn:li:activity:6755013137956786176/) +- [Statistics Used in Data Science (A Dictionary in One Picture)](https://www.linkedin.com/posts/vincentg_statistics-used-in-data-science-a-dictionary-activity-6724889005109886976--Xe6) +- [Statistics: Are you Bayesian or Frequentist?](https://towardsdatascience.com/statistics-are-you-bayesian-or-frequentist-4943f953f21b) +- [A simple way to understand the statistical foundations of data](https://www.datasciencecentral.com/profiles/blogs/a-simple-way-to-understand-the-statistical-foundations-of-data) +- [Joy of Stats (documentary)](https://www.bbc.co.uk/programmes/b00wgq0l) +- Stat thinking 001: [Video](https://www.youtube.com/watch?v=OJt-k9h9pmk&feature=youtu.be) | [Post](https://www.linkedin.com/posts/steffenkonrath_stat-thinking-001-what-is-statistics-activity-6685106091707138048-UYcz) - [Data Science](../courses.md#data-science) in [Courses](../courses.md#courses) ### Bayesian @@ -154,7 +170,8 @@ - [Bayesian Stats 101 for Data Scientists](https://www.linkedin.com/posts/towards-data-science_bayesian-stats-101-for-data-scientists-activity-6655949045678387202-qlgP) - [New Marketing Insight from Unsupervised Bayesian Belief Networks](https://www.linkedin.com/posts/vincentg_new-marketing-insight-from-unsupervised-bayesian-activity-6657419179529949184-nyP4) - [Everything you need to know about Gaussian Distribution](https://deepai.org/machine-learning-glossary-and-terms/gaussian-distribution) -- Naive Bayesian meetups (Password to access the videos is: Bayes2020) +- Bayesian hyperparameter optimisation by Akinkunle: [Original Notebook](https://colab.research.google.com/drive/1akyJnd7O-lqA5I8mbR2gDG_VsjOBSp05?usp=sharing) | [Saved Notebook](https://colab.research.google.com/drive/1Nic2155ulaYDRrGUPDKNhY49j2xrZ99B) | [Slides](https://www.dropbox.com/sh/q0v0k3ida37thyn/AAB6wXMge7C6fvqKIZmGFXVQa?dl=0) +- [Naive Bayesian meetups](https://www.meetup.com/The-Naive-Bayesians/) (Password to access 
the videos is: Bayes2020) - [Meetup 1](https://vimeo.com/442020274) - [Meetup 2](https://vimeo.com/442024578) - [Meetup 3](https://vimeo.com/445354913) @@ -164,6 +181,7 @@ - [Meetup 8](https://vimeo.com/450726667) - [Meetup 9](https://vimeo.com/453991489) - [Meetup 10](https://vimeo.com/453992440) + - See [here for more meetup links](https://www.meetup.com/The-Naive-Bayesians/boards/?pager.offset=20) and [here](https://www.meetup.com/The-Naive-Bayesians/messages/boards/) (need to be signed up on Meetup.com to access the links) - [Michael Betancourt blogs (Bayesian)](https://betanalpha.github.io/writing/) - [Bayesian Analysis](https://projecteuclid.org/euclid.ba/1516093227) - [Bayesian Data Analysis](https://www.stat.columbia.edu/~gelman/book/) @@ -178,6 +196,13 @@ + https://www.coursera.org/learn/bayesian + https://www.coursera.org/learn/bayesian-methods-in-machine-learning +## Misc + +- [Stochastic Hill Climbing in Python from Scratch](https://machinelearningmastery.com/stochastic-hill-climbing-in-python-from-scratch/) +- [How is the surrogate model/function created?](http://krasserm.github.io/2018/03/21/bayesian-optimization/) +- Understanding Quartiles and Percentiles: [Quartiles](https://www.clubbenchmarking.com/quartiles) | [Percentiles and Quartiles](https://softschools.com/math/probability_and_statistics/percentiles_and_quartiles/) + + # Contributing Contributions are very welcome, please share back with the wider community (and get credited for it)! diff --git a/details/visualisation.md b/details/visualisation.md index 78f15ed9..c244d870 100644 --- a/details/visualisation.md +++ b/details/visualisation.md @@ -90,6 +90,26 @@ data visualisation. Magic from spreadsheets. Next-level storytelling. 
Embed on y - [Data Visualisation by Purpose (Ellie K, Miller post)](https://www.linkedin.com/posts/nabihbawazir_datascience-machinelearning-artificialintelligence-activity-6623855200325197824-0nxv) - [👉 Which Chart or Graph is right for you 👈](https://www.linkedin.com/posts/asif-bhat_data-visualisation-activity-6623861062238334977-EP68) - [A Beginners Guide to Creating Clean and Appetizing Python Charts](https://towardsdatascience.com/a-beginners-guide-to-creating-clean-and-appetizing-python-charts-f7e1cf1899d2?source=social.tw) + - [The Rosé Pine vibe for your Matplotlib charts](https://www.linkedin.com/posts/philipvollet_matplotlib-datascience-dashboard-activity-6794506962332717056-DTM4) + - [Lollipop & Dumbbell Charts with Plotly](https://towardsdatascience.com/lollipop-dumbbell-charts-with-plotly-696039d5f85?source=social.tw&gi=ce96280a359d) + - [Data is Beautiful](https://www.reddit.com/r/dataisbeautiful/) + - [Pywedge is an open-source python library which is a complete package that helps you in Visualizing the data, pre-process the data and also create some baseline models which can be further tuned to make the best machine learning model for the data](https://www.linkedin.com/posts/himanshusharmads_pywedge-a-complete-package-for-eda-data-activity-6742464668968730624-NAN1) + - [Visual Data](http://visualdata.io/) + - [Data Visualization for Scientists and Engineers (video)](https://youtu.be/pkPUICnZ3pI) + - [Printing vectors using Python](https://stackoverflow.com/questions/42281966/how-to-plot-vectors-in-python-using-matplotlib) + - [The Art of Effective Visualization of Multi-dimensional Data](https://towardsdatascience.com/the-art-of-effective-visualization-of-multi-dimensional-data-6c7202990c57) + - [Alberto Cairo's weblog about information design and visualisation](http://www.thefunctionalart.com/) + - [Jupyter notebook extension for 3D visualization - K3D lets you create 3D plots backed by WebGL with high-level 
API](https://www.linkedin.com/posts/philipvollet_datascience-jupyter-python-activity-6747052424302874624-3pvu) + - [Visualize All the Things](https://www.thoughtworks.com/radar?utm_source=marketo&utm_medium=email&utm_campaign=techradar-vol23&mkt_tok=eyJpIjoiTlRSallqTmxNemcyTVRVdyIsInQiOiJSU0RSVnVyZXlcL0lzc0VDZk5meWVHRXZON2FwUElTM3BwVTNKUTBKa2dMZzdLZlBxTFduaGdINXBlMmk5VzJGXC8yNGFNaXhRRG11QVVBVFRjMUFZeThHQlljMXh4UllZa3dWYmMrK2pTcXg5ZkJFa1JMSUNKS2JcL1wvUVFUNE1ReEUifQ%3D%3D#visualize-all-the-things) + - [Machine Learning Data Visualization](https://towardsdatascience.com/machine-learning-data-visualization-4c386fe3d971?source=social.tw&gi=b1c272c4589b) + - [Thinking about how to visualize text? Here is a huge collection of possibilities for inspiration!](https://www.linkedin.com/posts/philipvollet_datascience-nlp-dashboards-activity-6734202503728123904-ztkM) + - Process 120 million taxi trips and explore in real-time with Dash, Plotly and Vaex: [Interactive Dashboard](https://dash.vaex.io/) | [Blogpost](https://lnkd.in/gYWTtRX) | [Code](https://lnkd.in/gFJg2GE) | [Tutorial for data processing](https://lnkd.in/gF_EEeN) + - [Learning Data Visualization](https://www.linkedin.com/learning/learning-data-visualization-3) + - [Code to draw? 
Interactive Canvas on Jupyter](https://www.linkedin.com/posts/philipvollet_python-datascientist-jupyter-activity-6728596731803660288-wVXF) + - [Holoview: Stop plotting your data - annotate your data and let it visualize itself](https://www.linkedin.com/posts/philipvollet_data-datascience-plotly-activity-6740700125418659840-lbcB) + - [Draw as Diagram](https://www.linkedin.com/posts/philipvollet_python-devops-mlops-activity-6810071949092507648-B-ZY) + - [9 Distance Measures in Data Science](https://www.linkedin.com/feed/update/urn%3Ali%3Aactivity%3A6762316624679768065/) + - [Creating beautiful maps with Python](https://towardsdatascience.com/creating-beautiful-maps-with-python-6e1aae54c55c) ## Books and other resources - [Italo Calvino: text & data | data visualization book by Hanna Piotrowska (Dyrcz)](https://www.behance.net/gallery/83315693/Calvinos-book-text-data-data-visualization?fbclid=IwAR0zj9iwNSDOp2x7n8Kh-CaKaJ3vZjGHfWMIloWZklNuH_QQKzpMxnQOXUM) diff --git a/examples/ensembler/analysis/01_ComparingResults.ipynb b/examples/ensembler/analysis/01_ComparingResults.ipynb index d737347a..5e5bc954 100644 --- a/examples/ensembler/analysis/01_ComparingResults.ipynb +++ b/examples/ensembler/analysis/01_ComparingResults.ipynb @@ -353,7 +353,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.0" + "version": "3.7.2" } }, "nbformat": 4, diff --git a/examples/ensembler/analysis/02_EnsembleTribuoResults.ipynb b/examples/ensembler/analysis/02_EnsembleTribuoResults.ipynb index fad832f1..6581f84b 100644 --- a/examples/ensembler/analysis/02_EnsembleTribuoResults.ipynb +++ b/examples/ensembler/analysis/02_EnsembleTribuoResults.ipynb @@ -537,7 +537,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.0" + "version": "3.7.2" } }, "nbformat": 4, diff --git a/examples/ensembler/analysis/03_EnsembleTribuoDeepNettsResults.ipynb 
b/examples/ensembler/analysis/03_EnsembleTribuoDeepNettsResults.ipynb index 0ce8d1eb..0bd09f17 100644 --- a/examples/ensembler/analysis/03_EnsembleTribuoDeepNettsResults.ipynb +++ b/examples/ensembler/analysis/03_EnsembleTribuoDeepNettsResults.ipynb @@ -681,7 +681,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.0" + "version": "3.7.2" } }, "nbformat": 4, diff --git a/examples/ensembler/datasets/deepnetts-linear-regression-validation.csv b/examples/ensembler/datasets/deepnetts-linear-regression-validation.csv index f1b82215..969879ea 100644 --- a/examples/ensembler/datasets/deepnetts-linear-regression-validation.csv +++ b/examples/ensembler/datasets/deepnetts-linear-regression-validation.csv @@ -1,101 +1,101 @@ x,y --0.39489456439031734,0.08978554606437683 --0.19138011042760894,0.19795453548431396 -0.06806261510286671,0.33584967255592346 --0.42698976545655454,0.0727267861366272 -0.38085075213812025,0.5020981431007385 -0.28144200973048306,0.4492619037628174 --0.21820472420904957,0.18369710445404053 -0.14160061848869399,0.37493547797203064 -0.4809307239708793,0.5552911758422852 -0.24742224238471633,0.4311802387237549 -0.4032055120614809,0.5139797925949097 --0.3097350635750552,0.1350482702255249 --0.11674994474060707,0.23762084543704987 --0.3818836004982593,0.09670095145702362 -0.15924151064859549,0.384311705827713 -0.41825051520798795,0.5219762921333313 --0.36131591731378887,0.10763278603553772 -0.08620687542894956,0.3454934358596802 --0.003204427116996089,0.29797086119651794 -0.021653183797136877,0.31118282675743103 --0.3018599845352381,0.13923391699790955 --0.19372420374553412,0.19670863449573517 --0.09445981917666446,0.24946816265583038 --0.38611169386255295,0.09445369243621826 -0.030384925433560084,0.3158237636089325 --0.03112383051926615,0.28313156962394714 --0.11966867035846507,0.23606953024864197 -0.09321665569909698,0.3492191731929779 -0.1410835269143268,0.3746606409549713 
--0.4011964255436419,0.08643609285354614 -0.37727479739495884,0.5001975297927856 -0.12172339484566674,0.3643706440925598 -0.1142311772157446,0.36038848757743835 --0.4636005118068799,0.053267985582351685 --0.44983216684802496,0.060585930943489075 --0.1950470049805978,0.19600555300712585 --0.09834057605244029,0.2474055141210556 --0.4676014601321583,0.05114147067070007 -0.08039613867691775,0.3424049913883209 --0.07004238960962117,0.2624461352825165 --0.2154661620575712,0.18515267968177795 --0.2977399064750996,0.141423761844635 -0.03348328226850028,0.31747058033943176 -0.23182125772241235,0.42288821935653687 --0.27724385508581917,0.15231750905513763 -0.24510559232384277,0.42994892597198486 --0.11064343743654836,0.24086648225784302 --0.38522330317438624,0.0949258804321289 --0.30607562712263126,0.13699327409267426 -0.3353117692054788,0.4778939485549927 --0.29898918135909325,0.14075976610183716 --0.44616341877267407,0.06253589689731598 -0.10299769865693664,0.3544178307056427 -0.44149032530535826,0.5343284010887146 -0.2376188516220673,0.425969660282135 -0.04641540661141985,0.32434406876564026 --0.08294607470371218,0.25558775663375854 -0.018013675866689893,0.3092483878135681 --0.03674592843913527,0.2801433801651001 -0.4707229969704464,0.54986572265625 -0.34652070361150855,0.4838515520095825 --0.4515734539954259,0.05966043472290039 --0.20874164748529722,0.1887267827987671 --0.4612347000921332,0.05452543497085571 --0.3449971338617792,0.11630629003047943 -0.298012643200093,0.4580692648887634 --0.26309453406149963,0.15983794629573822 -0.2065931445430328,0.4094793498516083 -0.03432119125768085,0.3179159164428711 --0.1455347064693513,0.22232159972190857 --0.13262298140600493,0.22918424010276794 --0.07700648645525998,0.2587446868419647 -0.4296196089403954,0.5280190706253052 --0.39895891250597537,0.08762532472610474 -0.3181858705907611,0.4687914252281189 --0.039876418532828084,0.27847951650619507 --0.2586098996508076,0.16222155094146729 --0.32424587710691544,0.12733569741249084 
--0.3909516219717013,0.09188124537467957 --0.22435836109844398,0.18042641878128052 --0.032992668500639866,0.28213825821876526 --0.27690012833235367,0.1525001972913742 -0.04982445832095084,0.32615599036216736 -0.4056943193421755,0.5153026580810547 --0.42614401770695365,0.0731763243675232 --0.15852456642452317,0.21541741490364075 --0.3214881648637995,0.12880143523216248 --0.2183265730252778,0.18363234400749207 --0.28062248980331495,0.15052175521850586 --0.43010563678850267,0.07107070088386536 --0.08160277168897723,0.25630173087120056 --0.3990383246952893,0.08758312463760376 --0.1496461484437207,0.2201363444328308 --0.45822736731169367,0.05612385272979736 --0.019664590759437384,0.28922221064567566 --0.037060277741008174,0.2799763083457947 --0.27551129273373043,0.15323837101459503 --0.43850719401388816,0.06660522520542145 --0.031256542813213994,0.28306102752685547 -0.4063328683694918,0.5156420469284058 +0.18096967195161662,0.39586034417152405 +0.21665770593733824,0.41482871770858765 +0.3508193715073922,0.4861363172531128 +0.26109225175673356,0.43844589591026306 +-0.3202842329625111,0.12944132089614868 +-0.21283640404304704,0.18655040860176086 +-0.3667014418163642,0.10477033257484436 +0.0424259122128261,0.32222363352775574 +-0.23507315801703377,0.17473144829273224 +-0.027516615872768768,0.28504881262779236 +0.46367236148771895,0.5461182594299316 +0.4063000297578091,0.5156245827674866 +0.04098750259073891,0.3214591145515442 +-0.15914079257199742,0.21508988738059998 +0.3664274105611337,0.49443209171295166 +-0.013826419511493104,0.2923252284526825 +-0.38253833448056795,0.0963529497385025 +0.39046964378151827,0.5072106122970581 +-0.02317522534253491,0.28735628724098206 +-0.21253351247915175,0.186711385846138 +0.4912676630852052,0.5607852935791016 +-0.39154032231594726,0.09156835079193115 +0.4181020106821909,0.5218973755836487 +-0.006281457128786028,0.2963353991508484 +0.022568023180891572,0.3116690516471863 +0.27968347675404026,0.4483272433280945 
+-0.3363288553515078,0.12091352045536041 +-0.21333284899039284,0.1862865388393402 +-0.4955344871811579,0.03629493713378906 +-0.21085167657616355,0.18760529160499573 +0.1124314314626047,0.3594319224357605 +0.11710304555635354,0.36191490292549133 +-0.4439108529838872,0.06373314559459686 +0.25882704861630634,0.43724194169044495 +0.2566828131789779,0.4361022710800171 +0.2569179208742186,0.4362272024154663 +0.39189619727957337,0.5079688429832458 +-0.2000631860374109,0.19333943724632263 +0.06724355168903984,0.335414320230484 +-0.29993769220901045,0.1402556151151657 +-0.06299179533136567,0.26619356870651245 +0.4459608225792733,0.5367044806480408 +-0.4407575549582542,0.06540915369987488 +-0.4388129495449946,0.06644271314144135 +0.26914141597729024,0.4427240490913391 +0.45962033523026813,0.5439645648002625 +-0.06532684809796008,0.26495248079299927 +0.04554608317447573,0.32388201355934143 +-0.027816065491570008,0.2848896384239197 +0.17868296711876874,0.39464494585990906 +-0.2758343900523553,0.15306665003299713 +-0.1677066776654922,0.21053707599639893 +-0.31528546458480045,0.13209819793701172 +0.3809236788025693,0.5021369457244873 +0.4705464919190241,0.5497719049453735 +0.3256106952923432,0.47273778915405273 +0.41874970681381285,0.5222416520118713 +0.3077343215809596,0.46323639154434204 +-0.2893720232595387,0.1458713263273239 +-0.315840735350869,0.13180308043956757 +0.10974836641071917,0.3580058515071869 +-0.011192933557502949,0.293724924325943 +0.31928107756388147,0.46937355399131775 +0.33915043173568804,0.4799342155456543 +0.3383949486068696,0.4795326590538025 +0.3569972217434977,0.4894198775291443 +-0.31878271237666544,0.13023939728736877 +0.35488796152446267,0.48829880356788635 +0.23792488423327496,0.42613232135772705 +-0.16709975131594146,0.21085965633392334 +-0.3489807838636392,0.11418896913528442 +0.47293725133881714,0.5510425567626953 +0.26774759779952517,0.44198325276374817 +-0.4071427266557678,0.08327560126781464 +-0.14917173696761288,0.2203885018825531 
+0.012132485110180724,0.3061225116252899 +-0.20513477247209255,0.19064384698867798 +-0.387965991689194,0.09346812963485718 +-0.46228573064645206,0.05396680533885956 +-0.028563078722406043,0.2844926118850708 +0.27907654660527126,0.44800466299057007 +-0.20410354249049256,0.19119195640087128 +0.04911979106205655,0.3257814645767212 +-0.29877350387197676,0.14087440073490143 +0.3061990235240424,0.46242037415504456 +0.07883737629279586,0.3415765166282654 +-0.29511174454931133,0.14282062649726868 +0.3708712913698827,0.49679404497146606 +-0.2444675656001486,0.1697382777929306 +0.277886752951482,0.4473722577095032 +0.008252783045838763,0.3040604293346405 +0.16654482434190143,0.3881934583187103 +0.023531546861687347,0.31218117475509644 +-0.2576305250321457,0.16274209320545197 +0.04577964211842678,0.3240061402320862 +-0.2970055279567545,0.14181408286094666 +0.45514586174464655,0.5415863990783691 +0.4370526514159079,0.5319697260856628 +0.1376274009964158,0.37282371520996094 +-0.47343720806953504,0.04803973436355591 diff --git a/examples/ensembler/datasets/tribuo-linear-regression-ada-validation.csv b/examples/ensembler/datasets/tribuo-linear-regression-ada-validation.csv index 5bacff4a..4a801181 100644 --- a/examples/ensembler/datasets/tribuo-linear-regression-ada-validation.csv +++ b/examples/ensembler/datasets/tribuo-linear-regression-ada-validation.csv @@ -1,101 +1,101 @@ x,y --0.39489456439031734,0.15909128664563427 --0.19138011042760894,0.18680728323685183 -0.06806261510286671,0.22213997610659986 --0.42698976545655454,0.15472034165364235 -0.38085075213812025,0.2647376136259031 -0.28144200973048306,0.2511994480948585 --0.21820472420904957,0.1831541230329905 -0.14160061848869399,0.23215488665216732 -0.4809307239708793,0.2783671917917574 -0.24742224238471633,0.24656640244414488 -0.4032055120614809,0.26778203842114084 --0.3097350635750552,0.1706888925568984 --0.11674994474060707,0.1969709319483186 --0.3818836004982593,0.16086320910057114 -0.15924151064859549,0.2345573445486441 
-0.41825051520798795,0.2698309703166605 --0.36131591731378887,0.16366425750735644 -0.08620687542894956,0.22461098613834937 --0.003204427116996089,0.21243434065530256 -0.021653183797136877,0.21581962089435705 --0.3018599845352381,0.1717613749264461 --0.19372420374553412,0.18648804850367157 --0.09445981917666446,0.20000655439241904 --0.38611169386255295,0.16028739829701227 -0.030384925433560084,0.21700876946203002 --0.03112383051926615,0.20863208447910028 --0.11966867035846507,0.1965734398405095 -0.09321665569909698,0.2255656261763891 -0.1410835269143268,0.23208446556888085 --0.4011964255436419,0.15823305589669753 -0.37727479739495884,0.2642506155403029 -0.12172339484566674,0.22944786976955336 -0.1142311772157446,0.22842752809726297 --0.4636005118068799,0.14973443868070913 --0.44983216684802496,0.15160950649292584 --0.1950470049805978,0.18630790034314534 --0.09834057605244029,0.19947804625824958 --0.4676014601321583,0.1491895620491684 -0.08039613867691775,0.22381964008466176 --0.07004238960962117,0.20333188771135113 --0.2154661620575712,0.1835270792421309 --0.2977399064750996,0.1723224754637636 -0.03348328226850028,0.21743072498314162 -0.23182125772241235,0.2444417531659207 --0.27724385508581917,0.1751137685605321 -0.24510559232384277,0.24625090512225942 --0.11064343743654836,0.19780255806787936 --0.38522330317438624,0.16040838544463476 --0.30607562712263126,0.17118725975522892 -0.3353117692054788,0.2585358020535423 --0.29898918135909325,0.1721523406268905 --0.44616341877267407,0.15210914181204224 -0.10299769865693664,0.22689767580726056 -0.44149032530535826,0.2729959273253518 -0.2376188516220673,0.2452313093356734 -0.04641540661141985,0.2191919105266079 --0.08294607470371218,0.2015745752201828 -0.018013675866689893,0.21532396769906872 --0.03674592843913527,0.20786642855762388 -0.4707229969704464,0.2769770333964234 -0.34652070361150855,0.26006231175218825 --0.4515734539954259,0.15137236604564075 --0.20874164748529722,0.1844428698375099 
--0.4612347000921332,0.15005663117438528 --0.3449971338617792,0.16588666155641524 -0.298012643200093,0.2534561508096553 --0.26309453406149963,0.17704072031084828 -0.2065931445430328,0.2410060153828409 -0.03432119125768085,0.21754483718616055 --0.1455347064693513,0.19305082532975346 --0.13262298140600493,0.19480923275917378 --0.07700648645525998,0.20238346915593 -0.4296196089403954,0.2713792916127507 --0.39895891250597537,0.15853777579504996 -0.3181858705907611,0.2562034795163233 --0.039876418532828084,0.20744009690856308 --0.2586098996508076,0.17765146863667663 --0.32424587710691544,0.168712710273263 --0.3909516219717013,0.15962826363427488 --0.22435836109844398,0.18231607848269385 --0.032992668500639866,0.20837757328290393 --0.27690012833235367,0.1751605796314013 -0.04982445832095084,0.21965617861037734 -0.4056943193421755,0.26812098129603157 --0.42614401770695365,0.1548355213929117 --0.15852456642452317,0.19128177695393683 --0.3214881648637995,0.16908827447345376 --0.2183265730252778,0.18313752882403153 --0.28062248980331495,0.17465364287171864 --0.43010563678850267,0.15429600088799447 --0.08160277168897723,0.2017575154539226 --0.3990383246952893,0.158526960897506 --0.1496461484437207,0.19249090091393042 --0.45822736731169367,0.1504661904143722 --0.019664590759437384,0.21019268248055503 --0.037060277741008174,0.20782361830995097 --0.27551129273373043,0.17534972080524766 --0.43850719401388816,0.15315181910195597 --0.031256542813213994,0.2086140108071077 -0.4063328683694918,0.26820794328973674 +0.18096967195161662,0.23751643483870433 +0.21665770593733824,0.24237667650231 +0.3508193715073922,0.2606477338803429 +0.26109225175673356,0.24842807824052918 +-0.3202842329625111,0.16925223419141208 +-0.21283640404304704,0.18388521775635963 +-0.3667014418163642,0.16293081977923002 +0.0424259122128261,0.21864859376949755 +-0.23507315801703377,0.18085686382207009 +-0.027516615872768768,0.20912333975310862 +0.46367236148771895,0.2760168294148379 
+0.4063000297578091,0.2682034711019794 +0.04098750259073891,0.21845270126453645 +-0.15914079257199742,0.19119785504337836 +0.3664274105611337,0.26277334387406037 +-0.013826419511493104,0.21098776475238532 +-0.38253833448056795,0.16077404292849598 +0.39046964378151827,0.26604758037670967 +-0.02317522534253491,0.20971458014313593 +-0.21253351247915175,0.18392646761055412 +0.4912676630852052,0.2797749471805161 +-0.39154032231594726,0.15954809037671666 +0.4181020106821909,0.2698107459500338 +-0.006281457128786028,0.21201528956749396 +0.022568023180891572,0.21594421000706107 +0.27968347675404026,0.2509599589920867 +-0.3363288553515078,0.16706716727954404 +-0.21333284899039284,0.18381760847259407 +-0.4955344871811579,0.14538545051115292 +-0.21085167657616355,0.18415551157891397 +0.1124314314626047,0.2281824263554072 +0.11710304555635354,0.2288186388591223 +-0.4439108529838872,0.15241591169746252 +0.25882704861630634,0.24811958731373732 +0.2566828131789779,0.24782757059986474 +0.2569179208742186,0.24785958918111525 +0.39189619727957337,0.2662418582332423 +-0.2000631860374109,0.18562476234304517 +0.06724355168903984,0.22202843042351153 +-0.29993769220901045,0.17202316590258393 +-0.06299179533136567,0.20429208608142854 +0.4459608225792733,0.2736047503587643 +-0.4407575549582542,0.15284534948739736 +-0.4388129495449946,0.15311017921302972 +0.26914141597729024,0.24952426872604613 +0.45962033523026813,0.2754649966395618 +-0.06532684809796008,0.20397408255264182 +0.04554608317447573,0.2190735200882589 +-0.027816065491570008,0.20908255864664502 +0.17868296711876874,0.23720501566371074 +-0.2758343900523553,0.17530571919255 +-0.1677066776654922,0.19003129395924245 +-0.31528546458480045,0.16993300081323787 +0.3809236788025693,0.264747545280112 +0.4705464919190241,0.2769529957258388 +0.3256106952923432,0.25721464315660775 +0.41874970681381285,0.2698989536592487 +0.3077343215809596,0.2547801157645689 +-0.2893720232595387,0.17346207129178617 +-0.315840735350869,0.16985738022531452 
+0.10974836641071917,0.2278170281225834 +-0.011192933557502949,0.21134641096302342 +0.31928107756388147,0.25635263232656663 +0.33915043173568804,0.2590585774905224 +0.3383949486068696,0.25895569060747625 +0.3569972217434977,0.26148907597055304 +-0.31878271237666544,0.16945672158102307 +0.35488796152446267,0.26120182242220563 +0.23792488423327496,0.24527298695927147 +-0.16709975131594146,0.19011394935942524 +-0.3489807838636392,0.16534414072940615 +0.47293725133881714,0.2772785857693919 +0.26774759779952517,0.24933444899033935 +-0.4071427266557678,0.15742324775705438 +-0.14917173696761288,0.19255550952820139 +0.012132485110180724,0.2145230267349956 +-0.20513477247209255,0.18493407885792298 +-0.387965991689194,0.1600348672789859 +-0.46228573064645206,0.14991349461236936 +-0.028563078722406043,0.20898082525249878 +0.27907654660527126,0.25087730307450024 +-0.20410354249049256,0.18507451884190576 +0.04911979106205655,0.21956021218162905 +-0.29877350387197676,0.172181713068903 +0.3061990235240424,0.2545710283268785 +0.07883737629279586,0.22360735711368046 +-0.29511174454931133,0.17268039661166235 +0.3708712913698827,0.2633785420943776 +-0.2444675656001486,0.17957746885291556 +0.277886752951482,0.2507152688002235 +0.008252783045838763,0.2139946622522943 +0.16654482434190143,0.23555195998497744 +0.023531546861687347,0.2160754292818227 +-0.2576305250321457,0.1777848466010611 +0.04577964211842678,0.21910532774991803 +-0.2970055279567545,0.17242248817595576 +0.45514586174464655,0.2748556320983204 +0.4370526514159079,0.2723915744059494 +0.1376274009964158,0.23161378659598986 +-0.47343720806953504,0.14839480980042064 diff --git a/examples/ensembler/datasets/tribuo-linear-regression-cart-validation.csv b/examples/ensembler/datasets/tribuo-linear-regression-cart-validation.csv index 58b8842f..f94f37f2 100644 --- a/examples/ensembler/datasets/tribuo-linear-regression-cart-validation.csv +++ b/examples/ensembler/datasets/tribuo-linear-regression-cart-validation.csv @@ -1,101 +1,101 
@@ x,y --0.39489456439031734,0.10694456659257412 --0.19138011042760894,0.1033502146601677 -0.06806261510286671,0.3710851788520813 --0.42698976545655454,0.03127457806840539 -0.38085075213812025,0.5557625740766525 -0.28144200973048306,0.4846498370170593 --0.21820472420904957,0.1033502146601677 -0.14160061848869399,0.3077235817909241 -0.4809307239708793,0.5557625740766525 -0.24742224238471633,0.4846498370170593 -0.4032055120614809,0.5557625740766525 --0.3097350635750552,0.09027779599030812 --0.11674994474060707,0.17751955489317578 --0.3818836004982593,0.10694456659257412 -0.15924151064859549,0.3077235817909241 -0.41825051520798795,0.5557625740766525 --0.36131591731378887,0.10694456659257412 -0.08620687542894956,0.3710851788520813 --0.003204427116996089,0.26606913208961486 -0.021653183797136877,0.26606913208961486 --0.3018599845352381,0.21655800938606262 --0.19372420374553412,0.1033502146601677 --0.09445981917666446,0.23171669244766235 --0.38611169386255295,0.10694456659257412 -0.030384925433560084,0.26606913208961486 --0.03112383051926615,0.26606913208961486 --0.11966867035846507,0.17751955489317578 -0.09321665569909698,0.3077235817909241 -0.1410835269143268,0.3077235817909241 --0.4011964255436419,0.03127457806840539 -0.37727479739495884,0.5557625740766525 -0.12172339484566674,0.3077235817909241 -0.1142311772157446,0.3077235817909241 --0.4636005118068799,0.0681862011551857 --0.44983216684802496,0.0681862011551857 --0.1950470049805978,0.1033502146601677 --0.09834057605244029,0.23171669244766235 --0.4676014601321583,0.0681862011551857 -0.08039613867691775,0.3710851788520813 --0.07004238960962117,0.33727097511291504 --0.2154661620575712,0.1033502146601677 --0.2977399064750996,0.21655800938606262 -0.03348328226850028,0.26606913208961486 -0.23182125772241235,0.4846498370170593 --0.27724385508581917,0.21655800938606262 -0.24510559232384277,0.4846498370170593 --0.11064343743654836,0.17751955489317578 --0.38522330317438624,0.10694456659257412 
--0.30607562712263126,0.09027779599030812 -0.3353117692054788,0.4846498370170593 --0.29898918135909325,0.21655800938606262 --0.44616341877267407,0.0681862011551857 -0.10299769865693664,0.3077235817909241 -0.44149032530535826,0.5557625740766525 -0.2376188516220673,0.4846498370170593 -0.04641540661141985,0.3710851788520813 --0.08294607470371218,0.23171669244766235 -0.018013675866689893,0.26606913208961486 --0.03674592843913527,0.26606913208961486 -0.4707229969704464,0.5557625740766525 -0.34652070361150855,0.4846498370170593 --0.4515734539954259,0.0681862011551857 --0.20874164748529722,0.1033502146601677 --0.4612347000921332,0.0681862011551857 --0.3449971338617792,0.13953432192405066 -0.298012643200093,0.4846498370170593 --0.26309453406149963,0.21655800938606262 -0.2065931445430328,0.4846498370170593 -0.03432119125768085,0.26606913208961486 --0.1455347064693513,0.17751955489317578 --0.13262298140600493,0.17751955489317578 --0.07700648645525998,0.23171669244766235 -0.4296196089403954,0.5557625740766525 --0.39895891250597537,0.03127457806840539 -0.3181858705907611,0.4846498370170593 --0.039876418532828084,0.26606913208961486 --0.2586098996508076,0.1033502146601677 --0.32424587710691544,0.09027779599030812 --0.3909516219717013,0.10694456659257412 --0.22435836109844398,0.1033502146601677 --0.032992668500639866,0.26606913208961486 --0.27690012833235367,0.21655800938606262 -0.04982445832095084,0.3710851788520813 -0.4056943193421755,0.5557625740766525 --0.42614401770695365,0.03127457806840539 --0.15852456642452317,0.23532803356647491 --0.3214881648637995,0.09027779599030812 --0.2183265730252778,0.1033502146601677 --0.28062248980331495,0.21655800938606262 --0.43010563678850267,0.03127457806840539 --0.08160277168897723,0.23171669244766235 --0.3990383246952893,0.03127457806840539 --0.1496461484437207,0.23532803356647491 --0.45822736731169367,0.0681862011551857 --0.019664590759437384,0.26606913208961486 --0.037060277741008174,0.26606913208961486 
--0.27551129273373043,0.21655800938606262 --0.43850719401388816,0.03127457806840539 --0.031256542813213994,0.26606913208961486 -0.4063328683694918,0.5557625740766525 +0.18096967195161662,0.3077235817909241 +0.21665770593733824,0.4846498370170593 +0.3508193715073922,0.4846498370170593 +0.26109225175673356,0.4846498370170593 +-0.3202842329625111,0.09027779599030812 +-0.21283640404304704,0.1033502146601677 +-0.3667014418163642,0.10694456659257412 +0.0424259122128261,0.3710851788520813 +-0.23507315801703377,0.1033502146601677 +-0.027516615872768768,0.26606913208961486 +0.46367236148771895,0.5557625740766525 +0.4063000297578091,0.5557625740766525 +0.04098750259073891,0.3710851788520813 +-0.15914079257199742,0.23532803356647491 +0.3664274105611337,0.4846498370170593 +-0.013826419511493104,0.26606913208961486 +-0.38253833448056795,0.10694456659257412 +0.39046964378151827,0.5557625740766525 +-0.02317522534253491,0.26606913208961486 +-0.21253351247915175,0.1033502146601677 +0.4912676630852052,0.5557625740766525 +-0.39154032231594726,0.10694456659257412 +0.4181020106821909,0.5557625740766525 +-0.006281457128786028,0.26606913208961486 +0.022568023180891572,0.26606913208961486 +0.27968347675404026,0.4846498370170593 +-0.3363288553515078,0.13953432192405066 +-0.21333284899039284,0.1033502146601677 +-0.4955344871811579,0.0681862011551857 +-0.21085167657616355,0.1033502146601677 +0.1124314314626047,0.3077235817909241 +0.11710304555635354,0.3077235817909241 +-0.4439108529838872,0.03127457806840539 +0.25882704861630634,0.4846498370170593 +0.2566828131789779,0.4846498370170593 +0.2569179208742186,0.4846498370170593 +0.39189619727957337,0.5557625740766525 +-0.2000631860374109,0.1033502146601677 +0.06724355168903984,0.3710851788520813 +-0.29993769220901045,0.21655800938606262 +-0.06299179533136567,0.33727097511291504 +0.4459608225792733,0.5557625740766525 +-0.4407575549582542,0.03127457806840539 +-0.4388129495449946,0.03127457806840539 +0.26914141597729024,0.4846498370170593 
+0.45962033523026813,0.5557625740766525 +-0.06532684809796008,0.33727097511291504 +0.04554608317447573,0.3710851788520813 +-0.027816065491570008,0.26606913208961486 +0.17868296711876874,0.3077235817909241 +-0.2758343900523553,0.21655800938606262 +-0.1677066776654922,0.23532803356647491 +-0.31528546458480045,0.09027779599030812 +0.3809236788025693,0.5557625740766525 +0.4705464919190241,0.5557625740766525 +0.3256106952923432,0.4846498370170593 +0.41874970681381285,0.5557625740766525 +0.3077343215809596,0.4846498370170593 +-0.2893720232595387,0.21655800938606262 +-0.315840735350869,0.09027779599030812 +0.10974836641071917,0.3077235817909241 +-0.011192933557502949,0.26606913208961486 +0.31928107756388147,0.4846498370170593 +0.33915043173568804,0.4846498370170593 +0.3383949486068696,0.4846498370170593 +0.3569972217434977,0.4846498370170593 +-0.31878271237666544,0.09027779599030812 +0.35488796152446267,0.4846498370170593 +0.23792488423327496,0.4846498370170593 +-0.16709975131594146,0.23532803356647491 +-0.3489807838636392,0.13953432192405066 +0.47293725133881714,0.5557625740766525 +0.26774759779952517,0.4846498370170593 +-0.4071427266557678,0.03127457806840539 +-0.14917173696761288,0.23532803356647491 +0.012132485110180724,0.26606913208961486 +-0.20513477247209255,0.1033502146601677 +-0.387965991689194,0.10694456659257412 +-0.46228573064645206,0.0681862011551857 +-0.028563078722406043,0.26606913208961486 +0.27907654660527126,0.4846498370170593 +-0.20410354249049256,0.1033502146601677 +0.04911979106205655,0.3710851788520813 +-0.29877350387197676,0.21655800938606262 +0.3061990235240424,0.4846498370170593 +0.07883737629279586,0.3710851788520813 +-0.29511174454931133,0.21655800938606262 +0.3708712913698827,0.5557625740766525 +-0.2444675656001486,0.1033502146601677 +0.277886752951482,0.4846498370170593 +0.008252783045838763,0.26606913208961486 +0.16654482434190143,0.3077235817909241 +0.023531546861687347,0.26606913208961486 +-0.2576305250321457,0.1033502146601677 
+0.04577964211842678,0.3710851788520813 +-0.2970055279567545,0.21655800938606262 +0.45514586174464655,0.5557625740766525 +0.4370526514159079,0.5557625740766525 +0.1376274009964158,0.3077235817909241 +-0.47343720806953504,0.0681862011551857 diff --git a/examples/ensembler/datasets/tribuo-linear-regression-sgd-validation.csv b/examples/ensembler/datasets/tribuo-linear-regression-sgd-validation.csv index 4a5b9bbb..f827f421 100644 --- a/examples/ensembler/datasets/tribuo-linear-regression-sgd-validation.csv +++ b/examples/ensembler/datasets/tribuo-linear-regression-sgd-validation.csv @@ -1,101 +1,101 @@ x,y --0.39489456439031734,0.01535145291052381 --0.19138011042760894,0.015534154908133278 -0.06806261510286671,0.0157670656586805 --0.42698976545655454,0.0153226399331932 -0.38085075213812025,0.016047866438930518 -0.28144200973048306,0.01595862375633929 --0.21820472420904957,0.015510073520165278 -0.14160061848869399,0.015833083279823343 -0.4809307239708793,0.016137711707502424 -0.24742224238471633,0.01592808302897936 -0.4032055120614809,0.016067935083759186 --0.3097350635750552,0.015427903553765221 --0.11674994474060707,0.015601153001327382 --0.3818836004982593,0.015363133304950927 -0.15924151064859549,0.015848920121750147 -0.41825051520798795,0.016081441505909483 --0.36131591731378887,0.015381597628888773 -0.08620687542894956,0.015783354391702524 --0.003204427116996089,0.01570308675829403 -0.021653183797136877,0.01572540229943315 --0.3018599845352381,0.015434973285884447 --0.19372420374553412,0.015532050534102693 --0.09445981917666446,0.015621163621645506 --0.38611169386255295,0.015359337598608235 -0.030384925433560084,0.01573324108733505 --0.03112383051926615,0.01567802253963861 --0.11966867035846507,0.015598532759912223 -0.09321665569909698,0.015789647315046406 -0.1410835269143268,0.015832619068747694 --0.4011964255436419,0.015345795510771846 -0.37727479739495884,0.016044656180090343 -0.12172339484566674,0.01581523880540986 -0.1142311772157446,0.01580851278128303 
--0.4636005118068799,0.015289773193942318 --0.44983216684802496,0.015302133515672686 --0.1950470049805978,0.015530863009465594 --0.09834057605244029,0.015617679731338773 --0.4676014601321583,0.015286181403594619 -0.08039613867691775,0.015778137891392558 --0.07004238960962117,0.0156430839966927 --0.2154661620575712,0.015512532022575846 --0.2977399064750996,0.015438672023134982 -0.03348328226850028,0.01573602258993611 -0.23182125772241235,0.015914077482901537 --0.27724385508581917,0.015457072040721026 -0.24510559232384277,0.015926003291714303 --0.11064343743654836,0.015606635025140161 --0.38522330317438624,0.015360135137801276 --0.30607562712263126,0.015431188757036854 -0.3353117692054788,0.016006984511436454 --0.29898918135909325,0.015437550505658338 --0.44616341877267407,0.015305427078311805 -0.10299769865693664,0.015798428097208585 -0.44149032530535826,0.01610230469103555 -0.2376188516220673,0.015919282184416197 -0.04641540661141985,0.015747632207360227 --0.08294607470371218,0.01563149991017054 -0.018013675866689893,0.01572213498668823 --0.03674592843913527,0.015672975386963627 -0.4707229969704464,0.016128547876247602 -0.34652070361150855,0.016017047161372023 --0.4515734539954259,0.015300570301689367 --0.20874164748529722,0.01551856885298954 --0.4612347000921332,0.015291897065332462 --0.3449971338617792,0.015396247567884186 -0.298012643200093,0.01597349978984749 --0.26309453406149963,0.015469774377904578 -0.2065931445430328,0.015891429329003656 -0.03432119125768085,0.015736774809953693 --0.1455347064693513,0.015575311920430407 --0.13262298140600493,0.01558690322471236 --0.07700648645525998,0.015636832084946532 -0.4296196089403954,0.016091647936438004 --0.39895891250597537,0.015347804203981096 -0.3181858705907611,0.01599160999711159 --0.039876418532828084,0.01567016503721953 --0.2586098996508076,0.015473800390059625 --0.32424587710691544,0.015414876692195502 --0.3909516219717013,0.015354992626953348 --0.22435836109844398,0.015504549186485953 
--0.032992668500639866,0.015676344818839144 --0.27690012833235367,0.015457380616172387 -0.04982445832095084,0.0157506926315471 -0.4056943193421755,0.01607016937254307 --0.42614401770695365,0.015323399190338417 --0.15852456642452317,0.015563650471740811 --0.3214881648637995,0.015417352386308814 --0.2183265730252778,0.015509964132248607 --0.28062248980331495,0.01545403892292492 --0.43010563678850267,0.015319842707219535 --0.08160277168897723,0.015632705839967703 --0.3990383246952893,0.015347732912899112 --0.1496461484437207,0.015571620936094177 --0.45822736731169367,0.015294596852476638 --0.019664590759437384,0.01568830989738859 --0.037060277741008174,0.01567269318467145 --0.27551129273373043,0.015458627422152399 --0.43850719401388816,0.015312300337337593 --0.031256542813213994,0.015677903399200478 -0.4063328683694918,0.016070742620194974 +0.18096967195161662,0.015868426247222542 +0.21665770593733824,0.015900464635518738 +0.3508193715073922,0.016020906224924244 +0.26109225175673356,0.015940355071436183 +-0.3202842329625111,0.015418433197814179 +-0.21283640404304704,0.01551489284773164 +-0.3667014418163642,0.015376762856391604 +0.0424259122128261,0.015744050699600547 +-0.23507315801703377,0.015494930140944524 +-0.027516615872768768,0.015681260861580316 +0.46367236148771895,0.016122218275759635 +0.4063000297578091,0.0160707131398321 +0.04098750259073891,0.015742759389296933 +-0.15914079257199742,0.015563097264113882 +0.3664274105611337,0.016034918103974408 +-0.013826419511493104,0.01569355102659891 +-0.38253833448056795,0.015362545527502488 +0.39046964378151827,0.016056501652222836 +-0.02317522534253491,0.015685158278726036 +-0.21253351247915175,0.01551516476401427 +0.4912676630852052,0.016146991536962106 +-0.39154032231594726,0.01535446413019652 +0.4181020106821909,0.016081308188236026 +-0.006281457128786028,0.015700324401523236 +0.022568023180891572,0.015726223582539337 +0.27968347675404026,0.015957045060175937 +-0.3363288553515078,0.015404029382732345 
+-0.21333284899039284,0.015514447171850557 +-0.4955344871811579,0.0152611049545181 +-0.21085167657616355,0.015516674606549501 +0.1124314314626047,0.01580689708697786 +0.11710304555635354,0.015811090957292075 +-0.4439108529838872,0.015307449284898831 +0.25882704861630634,0.01593832151985943 +0.2566828131789779,0.015936396565193477 +0.2569179208742186,0.01593660762954167 +0.39189619727957337,0.016057782318871894 +-0.2000631860374109,0.01552635980941466 +0.06724355168903984,0.01576633035699081 +-0.29993769220901045,0.015436698994506871 +-0.06299179533136567,0.01564941356018998 +0.4459608225792733,0.016106318011791635 +-0.4407575549582542,0.015310280110116025 +-0.4388129495449946,0.015312025849972074 +0.26914141597729024,0.015947581085871756 +0.45962033523026813,0.016118580630977226 +-0.06532684809796008,0.01564731730217649 +0.04554608317447573,0.015746851785501212 +-0.027816065491570008,0.015680992035251407 +0.17868296711876874,0.015866373392829303 +-0.2758343900523553,0.015458337366461584 +-0.1677066776654922,0.015555407371394214 +-0.31528546458480045,0.015422920765897977 +0.3809236788025693,0.01604793190773145 +0.4705464919190241,0.016128389421529252 +0.3256106952923432,0.015998275520264695 +0.41874970681381285,0.016081889647561333 +0.3077343215809596,0.015982227278365957 +-0.2893720232595387,0.015446184162688108 +-0.315840735350869,0.015422422280035182 +0.10974836641071917,0.015804488406242163 +-0.011192933557502949,0.015695915198455514 +0.31928107756388147,0.015992593202470665 +0.33915043173568804,0.016010430612186363 +0.3383949486068696,0.01600975238872799 +0.3569972217434977,0.016026452295766346 +-0.31878271237666544,0.015419781165023112 +0.35488796152446267,0.016024558739569268 +0.23792488423327496,0.015919556920526153 +-0.16709975131594146,0.015555952230269374 +-0.3489807838636392,0.01539267130684255 +0.47293725133881714,0.01613053568934117 +0.26774759779952517,0.0159463298068573 +-0.4071427266557678,0.015340457309624855 
+-0.14917173696761288,0.01557204683176245 +0.012132485110180724,0.015716855237370115 +-0.20513477247209255,0.015521806870030633 +-0.387965991689194,0.015357672931010953 +-0.46228573064645206,0.015290953518679713 +-0.028563078722406043,0.01568032141551479 +0.27907654660527126,0.015956500197890088 +-0.20410354249049256,0.015522732641021168 +0.04911979106205655,0.01575006002726105 +-0.29877350387197676,0.015437744126833484 +0.3061990235240424,0.01598084898794745 +0.07883737629279586,0.015776738536232116 +-0.29511174454931133,0.015441031415426455 +0.3708712913698827,0.01603890753020477 +-0.2444675656001486,0.01548649645479412 +0.277886752951482,0.015955432078780817 +0.008252783045838763,0.015713372294004258 +0.16654482434190143,0.015855476560248515 +0.023531546861687347,0.015727088571230792 +-0.2576305250321457,0.015474679608688835 +0.04577964211842678,0.015747061459481553 +-0.2970055279567545,0.015439331300251132 +0.45514586174464655,0.016114563740637716 +0.4370526514159079,0.016098320836972334 +0.1376274009964158,0.015829516384408094 +-0.47343720806953504,0.01528094244987441 diff --git a/examples/ensembler/datasets/tribuo-linear-regression-xgb-validation.csv b/examples/ensembler/datasets/tribuo-linear-regression-xgb-validation.csv index 452862b7..538723e1 100644 --- a/examples/ensembler/datasets/tribuo-linear-regression-xgb-validation.csv +++ b/examples/ensembler/datasets/tribuo-linear-regression-xgb-validation.csv @@ -1,101 +1,101 @@ x,y --0.39489456439031734,0.11887577176094055 --0.19138011042760894,0.10619738698005676 -0.06806261510286671,0.4115918278694153 --0.42698976545655454,0.015319526195526123 -0.38085075213812025,0.5763131976127625 -0.28144200973048306,0.5210432410240173 --0.21820472420904957,0.10619738698005676 -0.14160061848869399,0.34611833095550537 -0.4809307239708793,0.5456909537315369 -0.24742224238471633,0.47114068269729614 -0.4032055120614809,0.5198261141777039 --0.3097350635750552,0.09898144006729126 --0.11674994474060707,0.1974962055683136 
--0.3818836004982593,0.11887577176094055 -0.15924151064859549,0.3155059218406677 -0.41825051520798795,0.5198261141777039 --0.36131591731378887,0.06623095273971558 -0.08620687542894956,0.43163809180259705 --0.003204427116996089,0.3220214545726776 -0.021653183797136877,0.2577093243598938 --0.3018599845352381,0.21381747722625732 --0.19372420374553412,0.10619738698005676 --0.09445981917666446,0.23174995183944702 --0.38611169386255295,0.11887577176094055 -0.030384925433560084,0.2577093243598938 --0.03112383051926615,0.3220214545726776 --0.11966867035846507,0.1974962055683136 -0.09321665569909698,0.26395803689956665 -0.1410835269143268,0.34611833095550537 --0.4011964255436419,0.0491810142993927 -0.37727479739495884,0.5763131976127625 -0.12172339484566674,0.34611833095550537 -0.1142311772157446,0.26395803689956665 --0.4636005118068799,0.09759777784347534 --0.44983216684802496,0.12042304873466492 --0.1950470049805978,0.10619738698005676 --0.09834057605244029,0.23174995183944702 --0.4676014601321583,0.09759777784347534 -0.08039613867691775,0.43163809180259705 --0.07004238960962117,0.33438974618911743 --0.2154661620575712,0.10619738698005676 --0.2977399064750996,0.21381747722625732 -0.03348328226850028,0.2577093243598938 -0.23182125772241235,0.47114068269729614 --0.27724385508581917,0.21381747722625732 -0.24510559232384277,0.47114068269729614 --0.11064343743654836,0.1974962055683136 --0.38522330317438624,0.11887577176094055 --0.30607562712263126,0.09898144006729126 -0.3353117692054788,0.5210432410240173 --0.29898918135909325,0.21381747722625732 --0.44616341877267407,0.12042304873466492 -0.10299769865693664,0.26395803689956665 -0.44149032530535826,0.5456909537315369 -0.2376188516220673,0.47114068269729614 -0.04641540661141985,0.3413911759853363 --0.08294607470371218,0.23174995183944702 -0.018013675866689893,0.2577093243598938 --0.03674592843913527,0.3220214545726776 -0.4707229969704464,0.5456909537315369 -0.34652070361150855,0.47768715023994446 
--0.4515734539954259,0.12042304873466492 --0.20874164748529722,0.10619738698005676 --0.4612347000921332,0.09759777784347534 --0.3449971338617792,0.15318724513053894 -0.298012643200093,0.5210432410240173 --0.26309453406149963,0.21381747722625732 -0.2065931445430328,0.47114068269729614 -0.03432119125768085,0.2577093243598938 --0.1455347064693513,0.1738252341747284 --0.13262298140600493,0.1804182231426239 --0.07700648645525998,0.23174995183944702 -0.4296196089403954,0.5456909537315369 --0.39895891250597537,0.0491810142993927 -0.3181858705907611,0.5210432410240173 --0.039876418532828084,0.3220214545726776 --0.2586098996508076,0.10619738698005676 --0.32424587710691544,0.08787998557090759 --0.3909516219717013,0.11887577176094055 --0.22435836109844398,0.10619738698005676 --0.032992668500639866,0.3220214545726776 --0.27690012833235367,0.21381747722625732 -0.04982445832095084,0.3236507773399353 -0.4056943193421755,0.5198261141777039 --0.42614401770695365,0.015319526195526123 --0.15852456642452317,0.23246848583221436 --0.3214881648637995,0.08787998557090759 --0.2183265730252778,0.10619738698005676 --0.28062248980331495,0.21381747722625732 --0.43010563678850267,0.015319526195526123 --0.08160277168897723,0.23174995183944702 --0.3990383246952893,0.0491810142993927 --0.1496461484437207,0.23246848583221436 --0.45822736731169367,0.09759777784347534 --0.019664590759437384,0.3220214545726776 --0.037060277741008174,0.3220214545726776 --0.27551129273373043,0.21381747722625732 --0.43850719401388816,0.015319526195526123 --0.031256542813213994,0.3220214545726776 -0.4063328683694918,0.5198261141777039 +0.18096967195161662,0.3525531589984894 +0.21665770593733824,0.47114068269729614 +0.3508193715073922,0.47768715023994446 +0.26109225175673356,0.47114068269729614 +-0.3202842329625111,0.08787998557090759 +-0.21283640404304704,0.10619738698005676 +-0.3667014418163642,0.06623095273971558 +0.0424259122128261,0.3413911759853363 +-0.23507315801703377,0.10619738698005676 
+-0.027516615872768768,0.3220214545726776 +0.46367236148771895,0.5456909537315369 +0.4063000297578091,0.5198261141777039 +0.04098750259073891,0.3413911759853363 +-0.15914079257199742,0.23246848583221436 +0.3664274105611337,0.47768715023994446 +-0.013826419511493104,0.3220214545726776 +-0.38253833448056795,0.11887577176094055 +0.39046964378151827,0.5763131976127625 +-0.02317522534253491,0.3220214545726776 +-0.21253351247915175,0.10619738698005676 +0.4912676630852052,0.5456909537315369 +-0.39154032231594726,0.11887577176094055 +0.4181020106821909,0.5198261141777039 +-0.006281457128786028,0.3220214545726776 +0.022568023180891572,0.2577093243598938 +0.27968347675404026,0.5210432410240173 +-0.3363288553515078,0.10771384835243225 +-0.21333284899039284,0.10619738698005676 +-0.4955344871811579,0.02776661515235901 +-0.21085167657616355,0.10619738698005676 +0.1124314314626047,0.26395803689956665 +0.11710304555635354,0.26395803689956665 +-0.4439108529838872,0.015319526195526123 +0.25882704861630634,0.47114068269729614 +0.2566828131789779,0.47114068269729614 +0.2569179208742186,0.47114068269729614 +0.39189619727957337,0.5763131976127625 +-0.2000631860374109,0.10619738698005676 +0.06724355168903984,0.4115918278694153 +-0.29993769220901045,0.21381747722625732 +-0.06299179533136567,0.33438974618911743 +0.4459608225792733,0.5456909537315369 +-0.4407575549582542,0.015319526195526123 +-0.4388129495449946,0.015319526195526123 +0.26914141597729024,0.47114068269729614 +0.45962033523026813,0.5456909537315369 +-0.06532684809796008,0.33438974618911743 +0.04554608317447573,0.3413911759853363 +-0.027816065491570008,0.3220214545726776 +0.17868296711876874,0.3525531589984894 +-0.2758343900523553,0.21381747722625732 +-0.1677066776654922,0.23246848583221436 +-0.31528546458480045,0.09898144006729126 +0.3809236788025693,0.5763131976127625 +0.4705464919190241,0.5456909537315369 +0.3256106952923432,0.5210432410240173 +0.41874970681381285,0.5198261141777039 +0.3077343215809596,0.5210432410240173 
+-0.2893720232595387,0.21381747722625732 +-0.315840735350869,0.09898144006729126 +0.10974836641071917,0.26395803689956665 +-0.011192933557502949,0.3220214545726776 +0.31928107756388147,0.5210432410240173 +0.33915043173568804,0.5210432410240173 +0.3383949486068696,0.5210432410240173 +0.3569972217434977,0.47768715023994446 +-0.31878271237666544,0.08787998557090759 +0.35488796152446267,0.47768715023994446 +0.23792488423327496,0.47114068269729614 +-0.16709975131594146,0.23246848583221436 +-0.3489807838636392,0.15318724513053894 +0.47293725133881714,0.5456909537315369 +0.26774759779952517,0.47114068269729614 +-0.4071427266557678,0.0491810142993927 +-0.14917173696761288,0.23246848583221436 +0.012132485110180724,0.2577093243598938 +-0.20513477247209255,0.10619738698005676 +-0.387965991689194,0.11887577176094055 +-0.46228573064645206,0.09759777784347534 +-0.028563078722406043,0.3220214545726776 +0.27907654660527126,0.5210432410240173 +-0.20410354249049256,0.10619738698005676 +0.04911979106205655,0.3413911759853363 +-0.29877350387197676,0.21381747722625732 +0.3061990235240424,0.5210432410240173 +0.07883737629279586,0.43163809180259705 +-0.29511174454931133,0.21381747722625732 +0.3708712913698827,0.5763131976127625 +-0.2444675656001486,0.10619738698005676 +0.277886752951482,0.5210432410240173 +0.008252783045838763,0.2577093243598938 +0.16654482434190143,0.3155059218406677 +0.023531546861687347,0.2577093243598938 +-0.2576305250321457,0.10619738698005676 +0.04577964211842678,0.3413911759853363 +-0.2970055279567545,0.21381747722625732 +0.45514586174464655,0.5456909537315369 +0.4370526514159079,0.5456909537315369 +0.1376274009964158,0.34611833095550537 +-0.47343720806953504,0.02776661515235901 diff --git a/natural-language-processing/README.md b/natural-language-processing/README.md index 051c5889..ef18013d 100644 --- a/natural-language-processing/README.md +++ b/natural-language-processing/README.md @@ -16,6 +16,7 @@ NLP using DL4J (cuda): [![NLP using DL4J 
(cuda)](https://img.shields.io/docker/p - [Presentations](#presentations) - [Notebooks](#notebooks) - [Unstructured to structured data](#unstructured-to-structured-data) +- [Text Data Augmentation](#text-data-augmentation) - [Summarise text](#summarise-text) - [Contributing](#contributing) @@ -88,6 +89,10 @@ See [Sentiment analysis](./sentiment-analysis.md) - [Step by step guide to extract insights from free text (unstructured data)](https://www.analyticsvidhya.com/blog/2014/08/step-step-guide-extract-inforation-free-text-unstructured-data/) - [Date/Time Helpers](https://www.kaggle.com/raenish/cheatsheet-date-helpers) +## Text Data Augmentation + +- Easy Data Augmentation for Text Classification: [Video](https://www.youtube.com/watch?v=3w92peJtYNQ) | [Kernel](https://www.kaggle.com/init927/nlp-data-augmentation) + ## Summarise text See [Summarise text](./summarise-text.md) diff --git a/things-to-know.md b/things-to-know.md index 57186aec..3f8f7660 100644 --- a/things-to-know.md +++ b/things-to-know.md @@ -315,6 +315,7 @@ See [Courses](./courses.md#courses) - [Auto_ml](https://github.com/ClimbsRocks/auto_ml/) - [Auto-Keras](https://github.com/keras-team/autokeras/) - [Auto-Sklearn](https://github.com/automl/auto-sklearn/) +- [PyCaret](https://pycaret.org) | [On awesome-ai-ml-dl](https://github.com/neomatrix369/awesome-ai-ml-dl/search?q=pycaret) - [Xcessiv](https://github.com/reiinakano/xcessiv/) - [MLbox](https://github.com/AxeldeRomblay/MLBox/) - [H20 Driverless AI](https://www.h2o.ai/products/h2o-driverless-ai/)