s-bolz/Getting-and-Cleaning-Data-Course-Project

Getting and Cleaning Data - course project README

This is the course project for the MOOC (massive open online course) Getting and Cleaning Data, which is offered by the Johns Hopkins Bloomberg School of Public Health and distributed via Coursera. It takes the Human Activity Recognition Using Smartphones Data Set from the UCI Machine Learning Repository, as published in [1], and transforms the data into a tidy data set. That data set is written to the file tidy-dataset.txt.

Prerequisites for the script run_analysis.R

The project consists of a single R script, run_analysis.R, which creates the tidy data set. The script requires:

  • the downloaded and extracted raw data
  • a working installation of R
  • the additional R package reshape2
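
The package prerequisite can be checked from within R before running the script. The following is a minimal sketch; the helper name missing_packages is hypothetical and not part of run_analysis.R.

```r
# Hypothetical helper (not part of run_analysis.R):
# returns the names of the given packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

not_installed <- missing_packages("reshape2")
if (length(not_installed) > 0) {
  # Only prints a hint; nothing is installed automatically.
  message("Missing package(s): ", paste(not_installed, collapse = ", "),
          ". Install with install.packages().")
}
```

If the message appears, install.packages("reshape2") should install the package from CRAN.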

Running the script run_analysis.R

To create the tidy data set you need to do the following:

  1. download the raw data from here
  2. unzip the downloaded file
  3. open a terminal and change into the directory that contains the unzipped subdirectory UCI HAR Dataset
  4. run the R script run_analysis.R from within that directory, e.g. Rscript run_analysis.R
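
A common mistake in step 3 is starting the script from the wrong directory. The sketch below checks for the unzipped folder first; the helper name has_raw_data is hypothetical, and only the folder name UCI HAR Dataset is taken from the steps above.

```r
# Hypothetical helper (not part of run_analysis.R):
# TRUE if `dir` contains the unzipped raw data folder.
has_raw_data <- function(dir = ".") {
  dir.exists(file.path(dir, "UCI HAR Dataset"))
}

if (!has_raw_data()) {
  message("Folder 'UCI HAR Dataset' not found in ", getwd())
}
```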

When the script has finished, you can find the tidy data set in a file called tidy-dataset.txt in the directory from which you started the script.

Workflow of the script run_analysis.R

The script does the following (quoted from the Coursera course project assignment):

  1. Merges the training and the test sets to create one data set.
  2. Extracts only the measurements on the mean and standard deviation for each measurement.
  3. Uses descriptive activity names to name the activities in the data set
  4. Appropriately labels the data set with descriptive variable names.
  5. Creates a second, independent tidy data set with the average of each variable for each activity and each subject.
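
The five steps can be illustrated on a toy example. This is a sketch under stated assumptions, not the code of run_analysis.R: the measurement values are invented, only three feature columns and two activity labels are used, the cleaned-up column names are one possible choice, and base R's aggregate stands in for reshape2 so that the example runs without extra packages.

```r
# Toy stand-ins for the training and test sets (invented values).
train <- data.frame(
  subject = c(1, 1, 2),
  activity = c(1, 2, 1),
  "tBodyAcc-mean()-X" = c(0.10, 0.20, 0.30),
  "tBodyAcc-std()-X" = c(0.01, 0.02, 0.03),
  "tBodyAcc-max()-X" = c(0.50, 0.60, 0.70),
  check.names = FALSE
)
test <- data.frame(
  subject = c(1, 2),
  activity = c(1, 1),
  "tBodyAcc-mean()-X" = c(0.30, 0.50),
  "tBodyAcc-std()-X" = c(0.03, 0.05),
  "tBodyAcc-max()-X" = c(0.90, 1.00),
  check.names = FALSE
)
activity_labels <- c("WALKING", "WALKING_UPSTAIRS")

# 1. Merge the training and the test sets.
merged <- rbind(train, test)

# 2. Keep the id columns plus the mean() and std() measurements.
keep <- grepl("mean\\(\\)|std\\(\\)", names(merged)) |
  names(merged) %in% c("subject", "activity")
selected <- merged[, keep]

# 3. Replace activity codes with descriptive names.
selected$activity <- factor(selected$activity,
                            levels = seq_along(activity_labels),
                            labels = activity_labels)

# 4. Clean up the variable names (one possible naming scheme).
names(selected) <- gsub("-", "_", gsub("\\(\\)", "", names(selected)))

# 5. Average each variable for each subject and each activity.
tidy <- aggregate(. ~ subject + activity, data = selected, FUN = mean)
print(tidy)
```

The result has one row per subject and activity combination that occurs in the data, and one column per retained measurement.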

The tidy data set is written to the file tidy-dataset.txt in the directory from which the script was started.

References

[1] Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra and Jorge L. Reyes-Ortiz. Human Activity Recognition on Smartphones using a Multiclass Hardware-Friendly Support Vector Machine. International Workshop of Ambient Assisted Living (IWAAL 2012). Vitoria-Gasteiz, Spain. Dec 2012
