shenghuayou/MTA-Turnstile

MTA-Turnstile

The goal of this project is to analyze the turnstile data collected by the MTA and identify ridership trends across the year, broken down by month, day of the week, and weather.

Group Members

Dependencies

  • Languages: Python, JavaScript
  • Framework: Spark
  • Cluster: Hadoop, HDFS (HUE)

Datasets

About this repository

Running on Hadoop with Spark and accessing files in HDFS

  • Make the Python file executable
Command:
$ hadoop fs -chmod +x <your python file location>

Example:
$ hadoop fs -chmod +x /user/vfung000/project/python-code.py
  • Submit the job with spark-submit
Command:
$ spark-submit --name <name of job> \
               --num-executors <number> \
               <python code location>

Example:
$ spark-submit --name "projWeatherDayweek" \
               --num-executors 10 \
               hdfs:///user/vfung000/project/HadoopPHD.py
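
If the job is launched from a script rather than typed by hand, the same spark-submit invocation can be assembled in Python. This is a minimal sketch, not part of the repository: the helper name `build_spark_submit` is hypothetical, and it only builds and prints the command using the flags shown above.

```python
import shlex


def build_spark_submit(name, num_executors, script):
    """Return the spark-submit argument list for a PySpark script.

    Hypothetical helper: it uses only the --name and --num-executors
    flags shown in the example above.
    """
    return [
        "spark-submit",
        "--name", name,
        "--num-executors", str(num_executors),
        script,
    ]


cmd = build_spark_submit(
    "projWeatherDayweek",
    10,
    "hdfs:///user/vfung000/project/HadoopPHD.py",
)

# Print the command as a single shell-safe line for inspection.
print(shlex.join(cmd))

# To actually launch the job on a machine with Spark installed:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems and the stray line-continuation characters that are easy to leave behind when editing a multi-line command.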

About

Final Project for Big Data Management, Spring 2017
