Data

Page dedicated to exploratory data analysis, data preparation, cleaning, pre-processing / wrangling, generation, feature engineering and other related topics


Ethics / altruistic motives

See Ethics / altruistic motives

Datasets and sources of raw data

Data Exploratory Analysis

Data preparation

Data cleaning

Data preprocessing / Data Wrangling

Misc

Data Generation

Generate numeric data fitting a model/distribution (to fit linear model / ring / etc)

Generate random data matching a rule or type (people’s names / phone numbers / etc, financial data, etc)

Generate data from existing

Generate fake images

Generate data using GAN

Feature engineering / selection

Statistics

Visualisation

See Visualisation

Common mistakes when training models (data related)

  • Having far more training examples of one class of object than of the others (class imbalance)
  • Accidentally testing the neural network on images that were also in the training set (train/test leakage)
  • Training the neural network on data that is easier to recognize or more consistent than the real-world data it will later be used to classify
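
The first two mistakes above can be checked for mechanically before training. The following is a minimal sketch, not a definitive implementation: the helper names `class_imbalance_ratio` and `find_train_test_overlap` and the toy data are hypothetical, and the overlap check only catches byte-identical duplicates (resized or re-encoded copies would need a perceptual hash or similar). The third mistake, a mismatch between training data and real-world data, is not covered here.

```python
from collections import Counter
import hashlib


def class_imbalance_ratio(labels):
    """Return (counts, ratio), where ratio = largest class count / smallest class count.

    A ratio close to 1.0 means the classes are roughly balanced.
    Raises ValueError if `labels` is empty.
    """
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio


def find_train_test_overlap(train_items, test_items):
    """Return the test items whose raw bytes also occur in the training set.

    Only exact (byte-identical) duplicates are detected.
    """
    train_hashes = {hashlib.sha256(item).hexdigest() for item in train_items}
    return [
        item
        for item in test_items
        if hashlib.sha256(item).hexdigest() in train_hashes
    ]


# Toy labels (hypothetical): 90 examples of one class, 10 of another.
labels = ["cat"] * 90 + ["dog"] * 10
counts, ratio = class_imbalance_ratio(labels)
print(counts, ratio)

# Toy "images" as raw bytes (hypothetical): one test item is also in the training set.
train_images = [b"img-a", b"img-b", b"img-c"]
test_images = [b"img-c", b"img-d"]
leaked = find_train_test_overlap(train_images, test_images)
print(leaked)
```

What counts as an acceptable imbalance ratio depends on the task, so the sketch reports the ratio rather than enforcing a threshold.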

Cheatsheets

See under Cheatsheets

Courses / books

Best practices / rules / an unordered list of high level or low level guidelines

Framework(s) / checklist(s)

Notebooks

Programs and Tools

See Programs and Tools

Databases

References

Contributing

Contributions are very welcome. Please share back with the wider community (and get credited for it)!

Please have a look at the CONTRIBUTING guidelines, and also read our licensing policy.


Back to main page (table of contents)