This repository contains code to forecast data from numerous sensors. Data cleaning and prediction are carried out step by step. All of the code can be found in the develop branch.
- Python 3.6
- Code is tested on Ubuntu 16.04, Ubuntu 18.04, and Windows 10
- Install the libraries required for this project from the requirements.txt file by running
pip install -r requirements.txt
in the terminal. The requirements.txt file was created using the information from this repository.
The aim of this task is to observe how different machine learning models respond to data provided by Salzgitter AG, a reputed steel producer in Germany. The data describes the integrated power plant of Salzgitter AG and is in time-series format. Here, I try to forecast the turbine data for each minute. First, all of the raw data is cleaned and visualized using Pandas, NumPy, etc. Next, the stationarity of the time series is checked with the Augmented Dickey-Fuller (ADF) test. Finally, ARIMA, linear regression, decision tree regression, a neural network, and Long Short-Term Memory (LSTM) are used for the forecasting.
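As a rough illustration of how a per-minute series can be framed for the regression-style models listed above, the sketch below builds lagged input windows and computes a naive persistence baseline. It is not taken from this repository: the function names `make_windows` and `persistence_mae`, the lag count, and the sample readings are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], n_lags: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (lagged inputs, next-step target) pairs."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))  # the n_lags readings before minute t
        y.append(series[t])                   # the reading at minute t
    return X, y


def persistence_mae(series: Sequence[float]) -> float:
    """Mean absolute error of the naive forecast 'next minute equals this minute'."""
    if len(series) < 2:
        raise ValueError("need at least two readings")
    errors = [abs(series[t] - series[t - 1]) for t in range(1, len(series))]
    return sum(errors) / len(errors)


# Hypothetical per-minute turbine readings, for illustration only.
readings = [10.0, 12.0, 11.0, 13.0, 14.0]
X, y = make_windows(readings, n_lags=2)
baseline = persistence_mae(readings)
```

Each row of `X` can be fed to a regressor with the matching entry of `y` as the target, and the persistence error gives a simple reference point that any trained model should beat.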
- Time series explanation by IBM
- Time series analysis by Duke
- LSTM from colah's blog
- LSTM explanation by Shi Yan
- LSTM explanation by Michael Nguyen
- After cloning this repository, set the name of the CSV file where your data is stored. To do this, take a look at this configuration file
- In main.py, replace the variable name. If this line is not found, look for the variable used to read the dataframe and use it to read your CSV file
- This file is used for data preprocessing
- All of the machine learning models are demonstrated here
- To run the code in your favourite IDE or terminal, execute python main.py
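The CSV-loading step described above could look roughly like the following. This is a standard-library sketch only, since the project itself reads data with Pandas; the names `CSV_PATH`, `TIME_COLUMN`, `VALUE_COLUMN`, `TIME_FORMAT`, and `load_series` are hypothetical and may not match the real configuration file or main.py.

```python
import csv
from datetime import datetime
from typing import List, Tuple

# Hypothetical settings; the project's real configuration file may use other names.
CSV_PATH = "data.csv"
TIME_COLUMN = "timestamp"
VALUE_COLUMN = "turbine"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def load_series(path: str = CSV_PATH) -> List[Tuple[datetime, float]]:
    """Read (timestamp, value) pairs from a CSV file, sorted by time."""
    rows = []
    with open(path, newline="") as handle:
        for record in csv.DictReader(handle):
            raw = (record.get(VALUE_COLUMN) or "").strip()
            if not raw:
                continue  # skip rows with a missing reading
            stamp = datetime.strptime(record[TIME_COLUMN], TIME_FORMAT)
            rows.append((stamp, float(raw)))
    rows.sort(key=lambda pair: pair[0])  # keep the series in time order
    return rows
```

Keeping the file name and column names in one place means that switching to a different data set only requires changing those constants.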