Linear Regression Project - Solutions

Overview

This project involves analyzing customer data from a New York City clothing store that sells products both online and in person. Customers often receive in-store advice and later make purchases through the mobile app or website. The company aims to determine whether to focus on improving the mobile app experience or the website.

Data Description

The dataset contains the following columns:

Avg. Session Length: Average duration of in-store style advice sessions.
Time on App: Average time spent on the app in minutes.
Time on Website: Average time spent on the website in minutes.
Length of Membership: Number of years the customer has been a member.
Yearly Amount Spent: Total dollars spent by the customer annually.

Data Import and Initial Inspection

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

customers = pd.read_csv("Ecommerce Customers.csv")

customers.head()
customers.info()
customers.describe()

Exploratory Data Analysis

We will focus on the numerical data to understand the relationships between different variables.

Jointplots

sns.set_palette("GnBu_d")
sns.set_style('whitegrid')
sns.jointplot(x='Time on Website', y='Yearly Amount Spent', data=customers)
sns.jointplot(x='Time on App', y='Yearly Amount Spent', data=customers)
sns.jointplot(x='Time on App', y='Length of Membership', kind='hex', data=customers)

Pairplot

sns.pairplot(customers)
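A correlation matrix can serve as a numeric complement to the pairplot. The sketch below is only an illustration: it builds a small synthetic DataFrame (the values, the `demo` name, and the `Email` column are assumptions, not the real customer data) and ranks the numeric columns by their correlation with Yearly Amount Spent.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the customers DataFrame (hypothetical values, not the real data).
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "Time on App": rng.normal(12, 1, n),
    "Time on Website": rng.normal(37, 1, n),
    "Length of Membership": rng.normal(3.5, 1, n),
})
# Spending is constructed to depend on app time and membership only.
demo["Yearly Amount Spent"] = (
    40 * demo["Time on App"]
    + 60 * demo["Length of Membership"]
    + rng.normal(0, 10, n)
)
demo["Email"] = "user@example.com"  # hypothetical non-numeric column, excluded below

# Keep only numeric columns, then compute pairwise Pearson correlations.
corr = demo.select_dtypes(include="number").corr()

# Rank the features by their correlation with the target.
target_corr = (
    corr["Yearly Amount Spent"]
    .drop("Yearly Amount Spent")
    .sort_values(ascending=False)
)
print(target_corr)
```

With the real data, replacing `demo` with `customers` should give the same kind of ranking for the actual columns.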

Linear Model Plot

sns.lmplot(x='Length of Membership', y='Yearly Amount Spent', data=customers)

Training and Testing Data

We split the data into training and testing sets to evaluate our model.

Split Data

from sklearn.model_selection import train_test_split

X = customers[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]
y = customers['Yearly Amount Spent']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)
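To see what `test_size=0.3` and `random_state=101` do, here is a minimal sketch on a toy array (the `X_demo` and `y_demo` names and values are made up for illustration). It checks the split sizes, that rows stay paired with their targets, and that the same seed reproduces the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 20 rows, 2 features; row i is [2i, 2i + 1] and its target is i.
X_demo = np.arange(40).reshape(20, 2)
y_demo = np.arange(20)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101
)
print(X_tr.shape, X_te.shape)

# Repeating the call with the same seed yields an identical split.
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101
)
print(np.array_equal(X_te, X_te2))
```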

Training the Model

We use a Linear Regression model to fit our training data.

Fit Model

from sklearn.linear_model import LinearRegression

lm = LinearRegression()
lm.fit(X_train, y_train)
print('Coefficients: \n', lm.coef_)
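The fitted model is the linear equation prediction = intercept + sum(coefficient * feature). The following sketch, on synthetic data with known coefficients (all names and values here are assumptions for illustration), shows that `intercept_` and `coef_` reproduce what `predict` returns.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data generated from a known linear relationship plus small noise.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(500, 2))
true_coef = np.array([3.0, -2.0])
true_intercept = 10.0
y_demo = true_intercept + X_demo @ true_coef + rng.normal(scale=0.1, size=500)

model = LinearRegression().fit(X_demo, y_demo)

# Rebuild the predictions by hand from the fitted parameters.
manual = model.intercept_ + X_demo @ model.coef_

print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)
print("Matches predict():", np.allclose(manual, model.predict(X_demo)))
```

For the project model, `lm.intercept_` plays the same role alongside `lm.coef_`.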

Predicting Test Data

We evaluate the model's performance using the test data.

Predictions

predictions = lm.predict(X_test)

plt.scatter(y_test, predictions)
plt.xlabel('Y Test')
plt.ylabel('Predicted Y')
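The scatter plot can be summarized with a single number, the coefficient of determination (R²), which is 1 when the points fall exactly on the diagonal. This is a sketch on made-up values rather than the project's results; it computes R² by hand and compares it with `sklearn.metrics.r2_score`.

```python
import numpy as np
from sklearn.metrics import r2_score

# Made-up true and predicted values for illustration.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print("Manual R^2:", r2_manual)
print("sklearn R^2:", r2_score(y_true, y_pred))
```

On the project data the equivalent call would be `r2_score(y_test, predictions)`.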

Model Evaluation

We calculate various metrics to evaluate the model's performance.

Metrics

from sklearn import metrics

print('MAE:', metrics.mean_absolute_error(y_test, predictions))
print('MSE:', metrics.mean_squared_error(y_test, predictions))
print('RMSE:', np.sqrt(metrics.mean_squared_error(y_test, predictions)))

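# Residual distribution; histplot replaces distplot, which is deprecated in recent seaborn releases
sns.histplot(y_test - predictions, bins=50, kde=True)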
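For reference, the three error metrics printed above can be computed directly from the residuals. The sketch below uses a handful of made-up values (not the project's results) to show the definitions.

```python
import numpy as np

# Made-up true and predicted values for illustration.
y_true = np.array([3.0, 5.0, 2.5, 7.0])
y_pred = np.array([2.5, 5.0, 4.0, 8.0])

errors = y_true - y_pred            # residuals: [0.5, 0.0, -1.5, -1.0]

mae = np.mean(np.abs(errors))       # mean absolute error
mse = np.mean(errors ** 2)          # mean squared error
rmse = np.sqrt(mse)                 # root mean squared error, in the target's units

print("MAE:", mae)
print("MSE:", mse)
print("RMSE:", rmse)
```

RMSE penalizes large errors more heavily than MAE, so a wide gap between the two suggests a few large residuals.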

Conclusion

To decide whether to focus on the mobile app or the website, we interpret the coefficients of the model.

Coefficients Interpretation

# One row per feature, labeled with the feature name
coefficients = pd.DataFrame(lm.coef_, index=X.columns, columns=['Coefficient'])
coefficients

Insights

Each coefficient is the change in yearly spending associated with a one-unit increase in that feature, holding the other features fixed:

Avg. Session Length: A 1-unit increase is associated with a $25.98 increase in yearly spending.
Time on App: A 1-minute increase is associated with a $38.59 increase in yearly spending.
Time on Website: A 1-minute increase is associated with a $0.19 increase in yearly spending.
Length of Membership: A 1-year increase is associated with a $61.27 increase in yearly spending.
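Raw coefficients are expressed per unit of each feature, so their sizes are not directly comparable when features are measured on different scales. One possible check, sketched here on synthetic data with hypothetical feature names and values, is to multiply each coefficient by its feature's standard deviation, giving the change in spending per typical variation in that feature.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data (hypothetical): two features on different scales.
rng = np.random.default_rng(7)
n = 500
X_demo = pd.DataFrame({
    "minutes_feature": rng.normal(12, 1.0, n),  # varies by about 1 unit
    "years_feature": rng.normal(3.5, 0.2, n),   # varies by about 0.2 units
})
y_demo = (
    40 * X_demo["minutes_feature"]
    + 60 * X_demo["years_feature"]
    + rng.normal(0, 5, n)
)

model = LinearRegression().fit(X_demo, y_demo)

# Compare the raw coefficient with the effect of a one-standard-deviation change.
summary = pd.DataFrame(
    {
        "Coefficient": model.coef_,
        "Per std. dev.": model.coef_ * X_demo.std().to_numpy(),
    },
    index=X_demo.columns,
)
print(summary)
```

In this synthetic case the feature with the larger raw coefficient has the smaller per-standard-deviation effect, which is why the two views are worth reading side by side.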

Recommendation

While both the app and the website have positive coefficients, time on the app shows a much stronger association with yearly spending ($38.59 versus $0.19 per additional minute). However, further analysis of the relationship between Length of Membership and the app/website could provide additional insights.

Reference

For detailed code and further analysis, refer to the project notebook, "how to have better clothing store (how to sell more).ipynb".
