Titanic

Analysis and predictions of the Titanic data set from Kaggle

1. Goals and Assumptions

Primary goal: Predict the survival of passengers in the test group

Engineer the available features (e.g. age, social class, ticket fare) so that they are suitable as input to a machine learning algorithm.
Train several machine learning models to predict survival.
Compare the models using metrics such as classification reports and confusion matrices.
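
As a rough illustration of the comparison step, the sketch below computes a confusion matrix and the precision and recall of the "survived" class by hand. The labels are made up for the example and are not taken from the Kaggle data; the function names `confusion_counts` and `precision_recall` are hypothetical helpers, not part of the notebook.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (actual, predicted) pairs for binary labels (1 = survived)."""
    pairs = Counter(zip(y_true, y_pred))
    return {
        "tn": pairs[(0, 0)],  # died, predicted died
        "fp": pairs[(0, 1)],  # died, predicted survived
        "fn": pairs[(1, 0)],  # survived, predicted died
        "tp": pairs[(1, 1)],  # survived, predicted survived
    }


def precision_recall(counts):
    """Precision and recall for the positive class (1 = survived)."""
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Made-up labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(counts)
print(counts, precision, recall)
```

In a notebook, scikit-learn's `confusion_matrix` and `classification_report` (in `sklearn.metrics`) report the same quantities for each model, which makes side-by-side comparison easier.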

Secondary goal: Determine which features are the most important predictors.

Trial and error: What happens to the predictions when particular features are excluded from training (do they improve or deteriorate)?
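
The trial-and-error idea can be sketched as a small ablation loop: fit a model on all features, then refit with one feature left out at a time and compare validation accuracy. Everything below is an assumption made for illustration: the rows are invented toy data, the column names `Sex`, `Pclass` and `Survived` follow the Kaggle data set, and the "model" is a deliberately simple majority-vote lookup table rather than any model used in the notebook.

```python
from collections import Counter, defaultdict


def make_rows(triples):
    """Build row dicts from (Sex, Pclass, Survived) triples."""
    return [{"Sex": s, "Pclass": p, "Survived": y} for s, p, y in triples]


def fit_lookup(rows, features):
    """Learn the majority label for each combination of the chosen features."""
    votes = defaultdict(Counter)
    for row in rows:
        votes[tuple(row[f] for f in features)][row["Survived"]] += 1
    # Fallback label for feature combinations never seen in training.
    default = Counter(r["Survived"] for r in rows).most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
    return table, default


def accuracy(rows, features, table, default):
    """Fraction of rows whose looked-up label matches the true label."""
    hits = sum(
        table.get(tuple(row[f] for f in features), default) == row["Survived"]
        for row in rows
    )
    return hits / len(rows)


def ablation(train, valid, features):
    """Validation accuracy with all features, then with each one dropped."""
    table, default = fit_lookup(train, features)
    results = {"all": accuracy(valid, features, table, default)}
    for dropped in features:
        kept = [f for f in features if f != dropped]
        table, default = fit_lookup(train, kept)
        results[f"without {dropped}"] = accuracy(valid, kept, table, default)
    return results


# Invented toy rows, NOT the real Titanic data.
train = make_rows([
    ("female", 1, 1), ("female", 2, 1), ("female", 3, 1), ("female", 3, 0),
    ("female", 3, 1), ("male", 1, 1), ("male", 1, 0), ("male", 1, 0),
    ("male", 2, 0), ("male", 3, 0), ("male", 3, 0), ("male", 1, 0),
    ("male", 2, 0),
])
valid = make_rows([
    ("female", 1, 1), ("female", 3, 1), ("male", 1, 0), ("male", 2, 0),
    ("male", 3, 0), ("female", 2, 1), ("male", 1, 1), ("female", 3, 0),
])

results = ablation(train, valid, ["Sex", "Pclass"])
print(results)
```

On these toy rows, dropping `Sex` lowers the accuracy while dropping `Pclass` leaves it unchanged. That pattern is a property of the invented data only; the same loop would need to be run on the real training set before drawing any conclusion.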

Secondary goal: Determine why the best performing model fits the data.

Statistical theory: What is the intended use of the model, and what attributes of the data make it an appropriate fit?

2. Exploratory Data Analysis

Primary goal: Find clues to meaningful relationships amongst the data: Identify critical predictors.

Which features have a stark impact on survival?
Display these relationships in a simple, obvious manner for a non-technical audience (i.e. visualization).
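
A simple first look at such relationships is the survival rate within each value of a feature. The sketch below assumes rows with a `Sex` column and a 0/1 `Survived` column (names taken from the Kaggle data set), uses invented rows, and prints a plain-text bar chart as a stand-in for a real plot; `survival_rate_by` and `text_bars` are hypothetical helper names.

```python
from collections import defaultdict


def survival_rate_by(rows, feature):
    """Share of survivors within each value of the given feature."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        totals[row[feature]] += 1
        survived[row[feature]] += row["Survived"]
    return {value: survived[value] / totals[value] for value in totals}


def text_bars(rates, width=20):
    """Render the rates as text bars, highest survival rate first."""
    lines = []
    for value, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(rate * width)
        lines.append(f"{str(value):<8} {bar} {rate:.0%}")
    return "\n".join(lines)


# Invented rows, for illustration only.
rows = [
    {"Sex": "female", "Survived": 1}, {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 1}, {"Sex": "female", "Survived": 0},
    {"Sex": "male", "Survived": 1}, {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 0}, {"Sex": "male", "Survived": 0},
]

rates = survival_rate_by(rows, "Sex")
chart = text_bars(rates)
print(chart)
```

For a non-technical audience, the same rates would normally be drawn as a bar chart with a plotting library; the grouping logic stays the same.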

Secondary goal: Demonstrate interesting relationships amongst the data, even if they do not correlate to critical predictors.

The definition of "interesting" will depend on the intended audience (e.g. a cruise line selling tickets for, or a shipyard building, a Titanic Mark II; an agency conducting safety investigations or studying social bias in a state of emergency).

3. Data Cleaning, Feature Engineering

Primary goal: Prepare the features to be fed into a machine learning algorithm.

Some of the predictors are binary or class-based.
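
Binary and class-based predictors usually need to be turned into numbers before a model can use them. The sketch below shows one possible encoding, assuming a binary `Sex` column and a three-level `Pclass` column as in the Kaggle data set. The helper names are hypothetical, and one-hot encoding `Pclass` is a design choice made here for illustration; treating it as an ordered number is another option.

```python
def encode_binary(value, positive):
    """Map a two-valued feature to 0/1 (1 when it equals `positive`)."""
    return 1 if value == positive else 0


def one_hot(value, categories):
    """Map a class-based feature to one 0/1 indicator per category."""
    return [1 if value == category else 0 for category in categories]


def encode_row(row):
    """Encode one passenger as [is_female, class_1, class_2, class_3]."""
    return [
        encode_binary(row["Sex"], "female"),
        *one_hot(row["Pclass"], (1, 2, 3)),
    ]


# Invented passenger, for illustration only.
encoded = encode_row({"Sex": "female", "Pclass": 3})
print(encoded)
```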

4. Valuation, Machine Learning Analysis

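In this case, the target is binary (a passenger either survived or did not), so I am building a classifier. This means that some statistical models, such as ordinary linear regression, which predicts unbounded continuous values, are less appropriate.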
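
One way to see the difference between a classifier and a plain regression: a linear score can take any real value, whereas a classifier such as logistic regression passes that score through a sigmoid to get a probability between 0 and 1 and then applies a threshold. The sketch below uses made-up scores, and the 0.5 threshold is an assumption; it is not taken from the notebook.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_survival(score, threshold=0.5):
    """Return (probability, class label) for a linear model score."""
    probability = sigmoid(score)
    return probability, int(probability >= threshold)


# Made-up linear scores, for illustration only.
scores = [-3.0, 0.0, 4.0]
predictions = [predict_survival(s) for s in scores]
for score, (probability, label) in zip(scores, predictions):
    print(f"score={score:+.1f} probability={probability:.3f} label={label}")
```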

Repository: prdctofchem/Titanic
