Adding new links to various categories and sections: NLP, Things to know, Visualisation, competitions, Python programming related and many others, more updates to follow.
neomatrix369 committed Jul 15, 2021
1 parent 3b16870 commit 1866218
Showing 26 changed files with 668 additions and 514 deletions.
16 changes: 14 additions & 2 deletions Programming-in-Python.md
@@ -7,7 +7,7 @@
- [Focussed packages](#focussed-packages)
- [Python Wrappers](#python-wrappers)
- [Cookie cutter: Python project templates](#cookie-cutter-python-project-templates)
- [Frameworks](#frameworks)
- [Libraries and Frameworks](#libraries-and-frameworks)
- [Best practices](#best-practices)
- [Testing](#testing)
- [Refactoring](#refactoring)
@@ -59,6 +59,7 @@
- [5 free books for learning Python for DS](https://towardsdatascience.com/5-free-books-for-learning-python-for-data-science-87be443c084)
- [7 advanced tricks in pandas for data science](https://www.linkedin.com/posts/towards-data-science_7-advanced-tricks-in-pandas-for-data-science-activity-6655303741224423424-SJtU)
- [Sqlite saving numpy serialised into the database](https://github.com/ebmdatalab/openprescribing/tree/master/openprescribing/matrixstore)
- [Beyond the Basic Stuff with Python 2020 PDF Course! Free!](https://sites.google.com/view/beyond-the-basic-stuff-with-py/home) | [Python Books](https://theappsblaster.com/?cat=306)

## Courses

@@ -72,6 +73,7 @@ See **Python: Best practices** and **Python: Testing** under [Courses](./courses
- [Python for Data Science](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/PythonForDataScience.pdf)
- [30 seconds of python](https://github.com/30-seconds/30-seconds-of-python)
- [Comprehensive Python cheatsheet](https://www.linkedin.com/posts/ashishpatel2604_comprehensive-python-cheatsheet-activity-6685556002110152704-mInG)
- [Regex symbols](https://regex101.com/)

## Database

@@ -121,7 +123,7 @@ See **Python: Best practices** and **Python: Testing** under [Courses](./courses
- [For Reproducible Data Science projects](https://cookiecutter.readthedocs.io/en/latest/readme.html#reproducible-science)
- [For Data Driven Journalism projects](https://cookiecutter.readthedocs.io/en/latest/readme.html#data-driven-journalism)

## Frameworks
## Libraries and Frameworks

- [Rich is a Python library for writing rich text with color and style to the terminal and for displaying advanced content such as tables, markdown, and syntax highlighted code!](https://www.linkedin.com/feed/update/urn:li:activity:6695712483468017664/)
- [Python for MicroControllers](https://micropython.org/)
@@ -156,6 +158,10 @@ with nothing but Python
- [🗽 𝙂𝙧𝙖𝙙𝙞𝙤 𝙥𝙮𝙩𝙝𝙤𝙣 𝙡𝙞𝙗𝙧𝙖𝙧𝙮 : 𝙃𝙖𝙨𝙨𝙡𝙚-𝙁𝙧𝙚𝙚 𝙎𝙝𝙖𝙧𝙞𝙣𝙜 𝙖𝙣𝙙 𝙏𝙚𝙨𝙩𝙞𝙣𝙜 𝙤𝙛 𝙈𝙇 𝙈𝙤𝙙𝙚𝙡𝙨 𝙞𝙣 𝙩𝙝𝙚 𝙒𝙞𝙡𝙙](https://www.linkedin.com/posts/ashishpatel2604_machinelearning-gui-python-activity-6691757766748504064-sPCX)
- [The Python scientific stack, compiled to WebAssembly.](https://alpha.iodide.io/) [GitHub](https://github.com/iodide-project/pyodide)
- [A simple video that explains in a very simple way how you can use joblib to speed up almost any function](https://www.youtube.com/watch?v=Ny3O4VpACkc)
- [pyforest: feel the bliss of automated imports](https://pypi.org/project/pyforest/)
- [How to be Pythonic? Design a Query Language in Python](https://dev.to/terminusdb/extending-prolog-terminusdb-discussion-10-3p90)
- [prython](http://www.prython.com) - a novel IDE for Python and R, which can also be combined in one workflow. It lets you put your code inside panels that you can connect and run. It's like Jupyter Notebook, but with the possibility of multiple streams.
- [Syntax Trees and Python - Automated Code Transformations - PyCon 2019](https://www.youtube.com/watch?v=viNzD1zD-Fg)

## Best practices

@@ -177,6 +183,9 @@ with nothing but Python
- [Code Craft : Part III – Unit Tests are an Early Warning System for Programmers](https://codemanship.wordpress.com/2019/10/04/code-craft-part-iii-unit-tests-are-an-early-warning-system-for-programmers/)
- ["Stop writing classes"](https://www.youtube.com/watch?v=o9pEzgHorH0)
- [How to package Python apps with BeeWare Briefcase](https://www.infoworld.com/article/3570295/how-to-package-python-apps-with-beeware-briefcase.html?utm_medium=email&utm_source=topic+optin&utm_campaign=awareness&utm_content=20200815+prog+nl&mkt_tok=eyJpIjoiWkRjd09XTmhZV0ppTnpBeSIsInQiOiJNTkNUTmpJZlB0REdcL2E0b3VBVlZKTlhHTCtuckZEQ25rREpIc3VtakFmdFB6UlZhUFIrMnNlaERrOXpmWFAzNUpYWVBXNEZXZWVaRmtjTDRURFY5ZlJWNHF0N2YwR01hUmlYaFQwd052a2pycjRZaWdReG16OEVYRmRZbTVOOGkifQ%3D%3D)
- [Teaching Clean Code](https://ceur-ws.org/Vol-2066/isee2018paper06.pdf)
- [Code Process Metrics in University Programming Education](https://ceur-ws.org/Vol-2308/isee2019paper05.pdf) (paper with Adam Thornhill)
- [Remote Mob Programming www.remotemobprogramming.org (also on Amazon and Leanpub)](https://java.by-comparison.com/favor-constructor-over-field-injection.html)

## Versioning

@@ -208,6 +217,9 @@ See [Machine Learning Testing](./details/julia-python-and-r.md#testing)
- [Learning Python with PyCharm: Refactoring](https://www.lynda.com/Python-tutorials/Refactoring/590828/629432-4.html)
- [What refactoring tools do you use for Python?](https://stackoverflow.com/questions/28796/what-refactoring-tools-do-you-use-for-python)
- [Bowler: Safe code refactoring for modern Python projects](https://github.com/facebookincubator/Bowler) - Bowler is a refactoring tool for manipulating Python at the syntax tree level. It enables safe, large scale code modifications while guaranteeing that the resulting code compiles and runs.
- [Beautiful Python Refactoring](https://www.youtube.com/watch?v=KTIl1MugsSY)
- [Transforming Code into Beautiful, Idiomatic Python](https://www.youtube.com/watch?v=OSGv2VnC0go)
- [Professional Code Refactor! (Cleaning Python Code & Rewriting it to use Classes)](https://www.youtube.com/watch?v=731LoaZCUjo)

## Performance

2 changes: 1 addition & 1 deletion Python-Performance.md
@@ -82,7 +82,7 @@
- [perfplot](https://awesomeopensource.com/project/nschloe/perfplot?categoryPage=26) | [github](https://github.com/nschloe/perfplot)
- [Opytimizer • A Nature-Inspired Python Optimizer. Did you ever reach a bottleneck in your computational experiments? ](https://www.linkedin.com/posts/philipvollet_python-python3-tensorflow-activity-6693021973813055488-5Z29)
- [How the CPython compiler works](https://news.ycombinator.com/item?id=24565499)
- High Performance Python talk by [Ian Ozsvald](https://twitter.com/ianozsvald/): Blogs: [1](https://ianozsvald.com/2019/11/16/higher-performance-python-at-pydatacambridge-2019/) | [2](https://ianozsvald.com/2019/11/22/higher-performance-python-odsc-2019/) | [Slides](https://speakerdeck.com/ianozsvald/higher-performance-python-odsc-2019) | [Useful resources shared](https://twitter.com/DataChaz/status/1197608275606413312)
- High Performance Python talk by [Ian Ozsvald](https://twitter.com/ianozsvald/): Blogs: [1](https://ianozsvald.com/2019/11/16/higher-performance-python-at-pydatacambridge-2019/) | [2](https://ianozsvald.com/2019/11/22/higher-performance-python-odsc-2019/) | [Slides](https://speakerdeck.com/ianozsvald/higher-performance-python-odsc-2019) | [Useful resources shared](https://twitter.com/DataChaz/status/1197608275606413312) | [Python Performance 2nd Edition git repo](https://github.com/mynameisfiber/high_performance_python_2e)
- [Making Pandas Fly (EuroPython 2020)](https://speakerdeck.com/ianozsvald/making-pandas-fly-europython-2020) | [Blog](https://ianozsvald.com/2020/07/24/making-pandas-fly-at-europython-2020/)
- [Making Pandas Fly (PyDataAmsterdam 2020)](https://speakerdeck.com/ianozsvald/making-pandas-fly-pydataamsterdam-2020) | [Blog](https://ianozsvald.com/2020/06/23/making-pandas-fly-for-pydataamsterdam-2020/)
- [Making Pandas Fly (PyDataUK 2020)](https://speakerdeck.com/ianozsvald/pydatauk-making-pandas-fly) | [Blog](https://ianozsvald.com/2020/04/27/flying-pandas-and-making-pandas-fly-virtual-talks-this-weekend-on-faster-data-processing-with-pandas-modin-dask-and-vaex/)
2 changes: 2 additions & 0 deletions cloud-devops-infra/README.md
@@ -56,6 +56,8 @@ reproducible research
- [Workshop: Large Scale Deep Learning Recommender](https://bit.ly/RE_streaming)
- [Reality Engines Demo](https://github.com/jsutch/RealityEngines-Demo)
- [Accelerating AI Training with MLPerf Containers and Models from NVIDIA NGC](https://developer.nvidia.com/blog/accelerating-ai-training-with-mlperf-containers-and-models-from-ngc/?ncid=so-elev-58408#cid=ngc01_so-elev_en-us&_lrsc=3642f913-311f-45b0-bbc5-158e51446637&ncid=so-lin-lt-798)
- Running AI Models in the Cloud: [site](https://www.scailable.net/) | [video](https://youtu.be/PDXaDTnAN2M?t=2570) | [Docs](https://docs.sclbl.net/sclblpy)
| [Getting started](https://github.com/scailable/sclbl-tutorials/tree/master/sclbl-101-getting-started)

## Tools

13 changes: 12 additions & 1 deletion competitions.md
@@ -45,13 +45,22 @@
- [Hacker Rank](https://lnkd.in/gEufBUu)
- [Codecademy](https://lnkd.in/gGQ7cuv)
- [LeetCode](https://leetcode.com/)
- [Codechef Competitive Programming: Problem statements and solutions provided by people on the codechef site](https://www.kaggle.com/arjoonn/codechef-competitive-programming)

## Resources

- [Kaggle Kernels Guide for Beginners — Step by Step Tutorial](https://towardsdatascience.com/kaggle-kernels-for-beginners-a-step-by-step-guide-3db6b1cd7606) | [Best Data Scientists on Kaggle from 2011-2020](https://www.youtube.com/watch?v=guLZ_2WcEqM) | [What Kaggle has learned from almost a million data scientists - Anthony Goldbloom (Kaggle)](https://www.youtube.com/watch?v=jmHbS8z57yI)
- [Getting ‘More’ out of your Kaggle Notebooks.](https://www.linkedin.com/posts/parulpandeyindia_getting-more-out-of-your-kaggle-notebooks-activity-6703281576970592256-rGPg)
- [Tackling any Kaggle Competition : The Noob's Way](https://www.kaggle.com/tanulsingh077/tackling-any-kaggle-competition-the-noob-s-way) | [Mr_KnowNothing-s-Weekends](https://github.com/tanulsingh/Mr_KnowNothing-s-Weekends)

- Kaggle related blogs (plus links to kernels) by https://tkravichandran.github.io/
- [Top 10% solution Detailed version on my blog](https://tkravichandran.github.io/my-fast-ds-blog/first-tabular-kaggle-competition.html)
- [Top 10% solution short version for kaggle](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220874)
- [What works in feature engineering ](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220878)
- [How to know if you are actually overfitting](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220877)
- [What can Adversarial Validation do for you?](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220876)
- Cross-posted all from [my DS blog](https://tkravichandran.github.io/my-fast-ds-blog/). Fraud Detection Competition page is [here](https://www.kaggle.com/c/ieee-fraud-detection/overview)
- https://www.kaggle.com/c/ieee-fraud-detection/discussion/111510
- https://www.kaggle.com/cdeotte/xgb-fraud-with-magic-0-9600
- RAPIDS
- [RAPIDS in Kaggle competition](https://www.kaggle.com/cdeotte/rapids/) [LinkedIn](https://www.linkedin.com/posts/miguelusque_kaggle-rapids-gpu-activity-6628421575299383297-Ifuu)
- [Here is the first ever successful implementation of NVIDIA #rapids library in a Kaggle kernel. It achieves 600X speedup of the kNN as compared to #sklearn](https://www.kaggle.com/cdeotte/rapids-gpu-knn-mnist-0-97) [LinkedIn](https://www.linkedin.com/posts/tunguz_rapids-sklearn-ml-activity-6626833143032885248-XQA6)
@@ -61,6 +70,8 @@
- [Tips N Tricks #3: Creating a clean inference kernel/notebook on Kaggle](https://www.youtube.com/watch?v=C7Tsfrq_g18)
- [Interview with Abhishek Thakur | World's First Triple Grandmaster | Kaggle](https://www.youtube.com/watch?v=8lniZVqRLA0)
- [My journey to 4x GM on Kaggle](https://www.youtube.com/watch?v=z15TKkAPNUM)
- [Grandmaster Series – How to Build a World-Class ML Model for Melanoma Detection](https://www.youtube.com/watch?v=L1QKTPb6V_I)
- [Grandmasters Series - How to Perform Large-Scale Image Classification](https://www.youtube.com/watch?v=VxNDH6qLZ_Q)
- Also see [NVIDIA's RAPIDS](./cloud-devops-infra/gpus/rapids.md#rapids)

# Contributing
18 changes: 15 additions & 3 deletions data/README.md
@@ -16,9 +16,10 @@ The question to ask ourselves: _Do we know our data...?_
+ [Data Cleaning](./data-preparation.md#data-cleaning)
+ [Data preprocessing / Data Wrangling](./data-preparation.md#data-preprocessing--data-wrangling)
- [Data Generation](./README.md#data-generation)
- [Feature Selection](./README.md#feature-selection)
- [Feature Extraction](./README.md#feature-extraction)
- [Feature Importance](./README.md#feature-importance)
- [Feature Engineering](./README.md#feature-engineering)
- [Feature Selection](./README.md#feature-selection)
- [Hyperparameter tuning](#hyperparameter-tuning)
- [Post model-creation analysis, ML interpretation/explainability](./README.md#post-model-creation-analysis-ml-interpretationexplainability)
- [Model deployment](./README.md#model-deployment)
@@ -65,6 +66,7 @@ See [Ethics / altruistic motives](../README-details.md#ethics--altruistic-motive
- [Data Exploration and API First Design: Deep Learning Hands-On Series with Eric Schles](https://gist.github.com/lidderupk/f6562beadd39406a033c738201f46c12)
- [Augmented Analytics Engine](https://www.linkedin.com/posts/data-science-central_augmented-analytics-engine-activity-6648764149864153088-dZWX)
- [Putting an end to Unreliable Analytics by David Yaffe](https://www.linkedin.com/posts/towards-data-science_putting-an-end-to-unreliable-analytics-activity-6717020155261587456-0hyA)
- The Fundamentals of end-to-end Data Strategy: [video](https://www.youtube.com/watch?v=hAE12zICkLI&feature=youtu.be) | [slides](https://drive.google.com/drive/folders/1LV_gP1muLbbXesJISrTqebpyrpRfTjbq?usp=sharing) | [Resources](http://nicolejaneway.com/data-strategy/resources/) | [Feedback](https://docs.google.com/forms/d/e/1FAIpQLSfZsLIIdFJSS_fRwzj_trDF_iM6-EnfPT329GfCj-tPNr_DJA/viewform)

## Datasets and sources of raw data

@@ -100,9 +102,10 @@ See [Data Exploratory Analysis](./data-exploratory-analysis.md)

See [Data Generation](./data-generation.md#data-generation)

## Feature Selection
## Feature Extraction

See [Feature Selection](./feature-selection.md)
- [Hierarchical Feature Extraction for Compact Representation and Classification of Datasets](doc.ml.tu-berlin.de/publications/publications/SchKoh08.pdf)
- [Guide to Feature Extraction Approaches for Text Data](https://rumankhan1.medium.com/guide-to-feature-extraction-approaches-for-text-data-1ebdcc4b9834)

## Feature Importance

@@ -116,16 +119,25 @@ See [Feature Selection](./feature-selection.md)
- [The 4 types of additive Feature Importances](https://twitter.com/TDataScience/status/1264958410405171202)
- [The Math of Random Forests and Feature Importance in Scikit-learn and Spark](https://www.linkedin.com/posts/data-science-central_the-math-of-decision-trees-random-forest-activity-6656775689431240705-kwf_)
- Path Explain - toolkit for feature attributions: [GitHub](https://github.com/suinleelab/path_explain) | [PyPI](https://pypi.org/project/path-explain/) | [Path Explain on MWML](https://madewithml.com/projects/1931/path-explain/)
- [Open Machine Learning Course: Feature Importance](https://mlcourse.ai/articles/topic5-part3-feature-importance/)

## Feature engineering

See [Feature engineering](./feature-engineering.md)

## Feature Selection

See [Feature Selection](./feature-selection.md)

## Hyperparameter tuning

- [Ray Tune Sweeps](https://docs.wandb.com/sweeps/ray-tune)
- [W&B Sweeps](https://docs.wandb.com/sweeps)
- [Automated Machine Learning Hyperparameter Tuning in Python](https://www.linkedin.com/posts/vincentg_automated-machine-learning-hyperparameter-activity-6693176296077348864-7Ihf)
- Bayesian hyperparameter optimisation by Akinkunle: [Original Notebook](https://colab.research.google.com/drive/1akyJnd7O-lqA5I8mbR2gDG_VsjOBSp05?usp=sharing) | [Saved Notebook](https://colab.research.google.com/drive/1Nic2155ulaYDRrGUPDKNhY49j2xrZ99B) | [Slides](https://www.dropbox.com/sh/q0v0k3ida37thyn/AAB6wXMge7C6fvqKIZmGFXVQa?dl=0)
- [Hyperparameter optimization for Neural Networks](http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html#id24)
- [Tune Hyperparameters Easily with W&B Sweeps](https://www.youtube.com/watch?v=9zrmUIlScdY)


## Post model-creation analysis, ML interpretation/explainability
- [Pruning: DL models](https://www.subhadityamukherjee.me/2020/09/25/Pruning.html)