2_ Statistics

Statistics-101 for data noobs

1_ Pick a dataset

2_ Descriptive statistics

Mean

In probability and statistics, population mean and expected value are used synonymously to refer to one measure of the central tendency either of a probability distribution or of the random variable characterized by that distribution.

For a data set, the terms arithmetic mean, mathematical expectation, and sometimes average are used synonymously to refer to a central value of a discrete set of numbers: specifically, the sum of the values divided by the number of values.

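$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where $x_1, \dots, x_n$ are the values in the data set and $n$ is the number of values.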
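As a minimal sketch of the definition above (the sum of the values divided by the number of values), the mean can be computed in plain Python. The function name `arithmetic_mean` and the sample data are illustrative assumptions, not part of the original text.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        # The mean is undefined for an empty data set.
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical sample data, chosen only for illustration.
sample = [2, 4, 4, 5, 10]
print(arithmetic_mean(sample))  # 5.0
```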

Median

The median is the value separating the higher half of a data sample, a population, or a probability distribution, from the lower half. In simple terms, it may be thought of as the "middle" value of a data set.
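A short sketch of finding that "middle" value in plain Python follows. It assumes the common convention of averaging the two middle values when the data set has an even number of elements; the function name `median` and the sample data are illustrative.

```python
def median(values):
    """Return the middle value of a data set."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle element exists.
        return ordered[mid]
    # Even count (assumed convention): average the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 3]))     # 3
print(median([7, 1, 3, 5]))  # 4.0
```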

Descriptive statistics in Python

NumPy is a Python library widely used for numerical computing, including statistical analysis.

Installation

pip3 install numpy

Usage

import numpy
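Once the library is imported, the mean and median described earlier can be computed with `numpy.mean` and `numpy.median`. The snippet below is a minimal sketch; the `np` alias and the sample data are assumptions made for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
data = np.array([2, 4, 4, 5, 10])

mean_value = np.mean(data)      # sum of the values divided by their count
median_value = np.median(data)  # middle value of the sorted data

print(mean_value)    # 5.0
print(median_value)  # 4.0
```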

3_ Exploratory data analysis

4_ Histograms

5_ Percentiles & outliers

6_ Probability theory

7_ Bayes theorem

8_ Random variables

9_ Cumul Dist Fn (CDF)

10_ Continuous distributions

11_ Skewness

12_ ANOVA

13_ Prob Den Fn (PDF)

14_ Central Limit theorem

15_ Monte Carlo method

16_ Hypothesis Testing

17_ p-Value

18_ Chi2 test

19_ Estimation

20_ Confid Int (CI)

21_ MLE

22_ Kernel Density estimate

23_ Regression

24_ Covariance

25_ Correlation

26_ Pearson coeff

27_ Causation

28_ Least2-fit

29_ Euclidean Distance