Online Graduate student at @poloclub.
- Georgia Tech
- Austin, TX
- (UTC -06:00)
- https://poloclub.github.io/
Pinned
-
outlier-detection.ipynb
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
-
anova_machine.py
def anova_machine(Cat_col, target_col, df):
    """Run a one-way ANOVA. Provide the target variable column y, the main data set, and a categorical column.
    A pivot table is produced, then an ANOVA is performed to test whether the columns differ significantly from each other.
    Currently set for 95% confidence; will update later for a higher significance setting."""
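The body of anova_machine is not shown in this preview. The docstring's steps (pivot on the categorical column, then test at 95% confidence) could be sketched roughly as follows. This is an illustration under assumptions, not the author's implementation: the name anova_sketch, the alpha parameter, the returned tuple, and the toy Region/Sales data are all hypothetical, and scipy.stats.f_oneway is assumed as the test.

```python
import pandas as pd
from scipy.stats import f_oneway


def anova_sketch(cat_col, target_col, df, alpha=0.05):
    """One-way ANOVA of target_col across the levels of cat_col.

    Hypothetical helper: returns (f_stat, p_value, significant), where
    significant is True when p_value < alpha (alpha=0.05 ~ 95% confidence).
    """
    # Pivot so each category level becomes its own column of target values.
    pivot = df.pivot(columns=cat_col, values=target_col)
    # Each column is NaN on rows that belong to other levels; drop those.
    groups = [pivot[level].dropna().to_numpy() for level in pivot.columns]
    f_stat, p_value = f_oneway(*groups)
    return f_stat, p_value, bool(p_value < alpha)


# Hypothetical toy data: three regions with clearly separated mean sales.
toy = pd.DataFrame({
    "Region": ["East"] * 4 + ["West"] * 4 + ["South"] * 4,
    "Sales": [10, 11, 9, 10, 20, 21, 19, 20, 30, 31, 29, 30],
})
f_stat, p_value, significant = anova_sketch("Region", "Sales", toy)
```

Exposing alpha as a parameter is one way to cover the "higher significance setting" the docstring mentions, for example alpha=0.01 for 99% confidence.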
-
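outlier_isolation.py
import numpy as np
from sklearn.ensemble import IsolationForest

# Assumes df is an existing pandas DataFrame with a numeric 'Sales' column.
isolation_forest = IsolationForest(n_estimators=100)
isolation_forest.fit(df['Sales'].values.reshape(-1, 1))
# Score an evenly spaced grid spanning the observed Sales range.
xx = np.linspace(df['Sales'].min(), df['Sales'].max(), len(df)).reshape(-1, 1)
anomaly_score = isolation_forest.decision_function(xx)  # lower means more anomalous
outlier = isolation_forest.predict(xx)  # -1 for outliers, 1 for inliers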
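The snippet above depends on a df that the preview does not define. To try it end to end, a self-contained variant might look like the sketch below. The synthetic Sales values, the fixed random_state, and the flagged array are assumptions added for illustration and are not part of the original gist.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical data: 200 typical sales values plus three extreme ones.
rng = np.random.default_rng(0)
sales = np.concatenate([rng.normal(100.0, 10.0, 200), [400.0, 450.0, 500.0]])
df = pd.DataFrame({"Sales": sales})

# random_state is fixed only so the sketch is repeatable.
isolation_forest = IsolationForest(n_estimators=100, random_state=0)
isolation_forest.fit(df[["Sales"]].to_numpy())

# Evaluate the fitted model on an evenly spaced grid over the observed range.
xx = np.linspace(df["Sales"].min(), df["Sales"].max(), len(df)).reshape(-1, 1)
anomaly_score = isolation_forest.decision_function(xx)  # negative = flagged
outlier = isolation_forest.predict(xx)                  # -1 outlier, 1 inlier

# Grid values that the model treats as outliers.
flagged = xx[outlier == -1].ravel()
```

Because the model is scored on a grid rather than on the rows of df, flagged describes regions of the Sales axis that look anomalous, not specific records. Calling predict on the original column instead would label individual rows.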
-
vif_multicollinearity.py
# ------------------------------------------------------------------------------
# Importing required libraries
# ------------------------------------------------------------------------------
from pyspark.sql.types import Row
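Only the import header of this file is visible, so the author's PySpark implementation is unknown. Going by the filename alone, the quantity involved is presumably the variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on the remaining features. The sketch below computes it with plain NumPy rather than PySpark. The function name vif and the toy features are hypothetical.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of X (n_samples, n_features).

    Hypothetical helper: column j is regressed on all other columns plus an
    intercept, and VIF_j = 1 / (1 - R_j^2).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        r_squared = 1.0 - ss_res / ss_tot
        out[j] = 1.0 / (1.0 - r_squared)
    return out


# Hypothetical features: x2 nearly duplicates x1, x3 is independent noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)
x3 = rng.normal(size=500)
vifs = vif(np.column_stack([x1, x2, x3]))
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity. In this toy data the first two features should land well above that range and the third close to 1.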