Update README.md
mansik95 authored Jun 1, 2020
1 parent f7c0799 commit 0470a4f
Showing 1 changed file with 10 additions and 18 deletions.
28 changes: 10 additions & 18 deletions README.md
@@ -1,30 +1,31 @@
IMDB Data Analysis Pipeline
Objective:
# IMDB Data Analysis Pipeline

## Objective:
The aim of the project is to analyse movie data from multiple sources (IMDB, MovieLens, The Numbers and Box Office Mojo), covering movies, cast, box office revenues, movie brands and franchises, and to perform the ETL processes using Talend.

Technologies Used:
## Technologies Used:
ER/Studio
SQL Server Developer Edition
Microsoft SQL Server Management Studio
Talend Real-Time Data Platform 7.1
Tableau Desktop
Microsoft Power BI
Dataset Links:

## Dataset Links:
https://datasets.imdbws.com/
https://www.boxofficemojo.com/franchise/?ref_=bo_nb_fr_secondarytab
https://www.boxofficemojo.com/brand/?ref_=bo_nb_frs_secondarytab
https://grouplens.org/datasets/movielens/25m/
https://www.the-numbers.com/movies/franchises
https://www.the-numbers.com/movies/franchise/Marvel-Cinematic-Universe#tab=summary
https://www.the-numbers.com/movie/Avengers-The-(2012)#tab=box-office
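
The IMDB files at the first link are gzipped, tab-separated dumps with a header row, and they use `\N` for missing values. Before wiring them into an ETL job it can help to inspect a few rows. The following is a minimal sketch of one way to do that in Python; the helper name `read_imdb_tsv` and the in-memory sample row are illustrative assumptions, not part of this project.

```python
import csv
import gzip
import io
from typing import Dict, Iterator, Optional


def read_imdb_tsv(source) -> Iterator[Dict[str, Optional[str]]]:
    """Yield one dict per data row; `source` is a path or a binary file object.

    IMDB marks missing fields with a literal backslash-N, mapped to None here.
    """
    with gzip.open(source, mode="rt", encoding="utf-8", newline="") as text:
        # QUOTE_NONE: titles may contain quote characters that are not CSV quoting.
        reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            yield {key: (None if value == r"\N" else value) for key, value in row.items()}


# Hypothetical sample that mimics the layout of a title file (illustration only).
sample = "tconst\ttitleType\tprimaryTitle\tendYear\ntt0000001\tshort\tCarmencita\t\\N\n"
rows = list(read_imdb_tsv(io.BytesIO(gzip.compress(sample.encode("utf-8")))))
print(rows[0])
```

Passing a file path such as a downloaded `.tsv.gz` file instead of the in-memory buffer reads the real dump row by row without loading it fully into memory.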
Code Walkthrough:

## Code Walkthrough:

Step 1 : Run the following scripts in SSMS to set up the staging database
The Number - stage tables.sql

stg imdb tables - core tables.sql

stg imdb tables expanded part 2.sql

stg_ml_tables.sql
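
Step 1 uses SSMS. As an alternative for repeatable setups, the same four scripts could be run from the command line with `sqlcmd`. The sketch below only shows the idea: the server name, the script folder, the use of Windows authentication (`-E`), and running the scripts in the order listed above are all assumptions, and it does not pass a database name, assuming the scripts select or create the staging database themselves.

```python
import subprocess
from pathlib import Path
from typing import List

# Script names as listed in Step 1, kept in the same order (assumed to be the run order).
STAGING_SCRIPTS = [
    "The Number - stage tables.sql",
    "stg imdb tables - core tables.sql",
    "stg imdb tables expanded part 2.sql",
    "stg_ml_tables.sql",
]


def build_sqlcmd_commands(server: str, script_dir: str) -> List[List[str]]:
    """Return one sqlcmd argument list per staging script.

    -S: server, -E: Windows (trusted) authentication,
    -b: stop with a non-zero exit code on error, -i: input script file.
    """
    return [
        ["sqlcmd", "-S", server, "-E", "-b", "-i", str(Path(script_dir) / name)]
        for name in STAGING_SCRIPTS
    ]


def run_staging_scripts(server: str, script_dir: str) -> None:
    """Run each script in turn; raises CalledProcessError if sqlcmd reports a failure."""
    for command in build_sqlcmd_commands(server, script_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # "localhost" and "sql" are placeholder values for illustration.
    for command in build_sqlcmd_commands("localhost", "sql"):
        print(command)
```

Passing each command as an argument list rather than a single string avoids quoting problems with the spaces in the script file names.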

Step 2 : Open Talend and set up your database connections and input file connections
@@ -33,19 +34,10 @@ When the connections are successful run jobs.
Step 3 : Perform Visualizations in Tableau and PowerBI
Refer to the Tableau workbook for the visualizations; new use cases will be added soon. A Microsoft Power BI file is also to be added soon.

References:
## References:
https://elearning.tableau.com/
https://help.talend.com/reader/KxVIhxtXBBFymmkkWJ~O4Q/8RlpZdAdKhP0IaMHXRV7yw
https://www.talend.com/
https://grouplens.org/datasets/movielens/
stg imdb tables - core tables.sql

stg imdb tables expanded part 2.sql

stg_ml_tables.sql

Open Talend and setup your database connections and input file connections

When the connections are successfull run the master job

Perform Visualizations in Tableau and PowerBI
