Tutorial branch #10

Merged
merged 9 commits into from
May 6, 2021
update kats 204
iamxiaodong committed May 6, 2021
commit 48baf39cc3b8d781fbdd390b8496acf16599a7aa
134 changes: 134 additions & 0 deletions tutorials/kats_204.ipynb
@@ -0,0 +1,134 @@
{
"metadata": {
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
},
"orig_nbformat": 2,
"kernelspec": {
"name": "python383jvsc74a57bd05b6e8fba36db23bc4d54e0302cd75fdd75c29d9edcbab68d6cfc74e7e4b30305",
"display_name": "Python 3.8.3 64-bit"
},
"metadata": {
"interpreter": {
"hash": "5b6e8fba36db23bc4d54e0302cd75fdd75c29d9edcbab68d6cfc74e7e4b30305"
}
}
},
"nbformat": 4,
"nbformat_minor": 2,
"cells": [
{
"source": [
"<h1><center>Kats 204 Forecasting with Meta Learning</center></h1>"
],
"cell_type": "markdown",
"metadata": {}
},
{
"source": [
"We propose a meta-learning framework for forecasting that covers predictability detection, model selection, and hyper-parameter tuning (HPT) from a supervised-learning perspective. It provides accurate forecasts with less computation time and fewer resources.\n",
"\n",
"It first classifies whether a time series is predictable (i.e., whether we can get a good forecast without much human-engineering effort). It then uses a classification algorithm to predict the best-performing model from features extracted from the given time series, and a multi-task neural network to predict the best hyper-parameters for a given model and time series.\n",
"\n",
"The meta-learning framework contains four steps:\n",
"\n",
"1. Meta-data collection. Exhaustive tuning to obtain the best-performing model for a given time series, and the best hyper-parameters for each model and time series combination.\n",
"2. Predict whether a time series is predictable. This is an examination step that tells the user whether the target time series can be fitted well by a single model or needs more careful human engineering.\n",
"\n",
"3. Predict forecasting model for the target time series.\n",
"\n",
"4. Predict hyper-parameters for the target time series.\n",
"\n",
"Kats provides APIs for all these steps.\n",
"\n",
"\n"
],
"cell_type": "markdown",
"metadata": {}
},
{
"source": [
"## 1. **Meta-data collection**\n",
"\n",
"Extract the meta-data of a time series, which includes:\n",
"\n",
"1. the best hyper-parameters and the corresponding error metrics for each of the 6 candidate models after hyper-parameter tuning;\n",
"\n",
"2. 40 time series features; \n",
"\n",
"3. the search method used for hyper-parameter tuning (e.g., random search, grid search or Bayesian optimization);\n",
"\n",
"4. the error metric used to evaluate hyper-parameters (MAE is the default; MAPE is recommended).\n",
"\n",
"Parameters for the GetMetaData() class:\n",
"* **data**: TimeSeriesData\n",
"* **all_models**: Dict\\[str, m.Model], a dictionary that includes all candidate models.\n",
"* **all_params**: Dict\\[str, Params], a dictionary that includes all candidate hyper-params corresponding to all candidate models.\n",
"* **min_length**: int, minimum length of a time series. Time series shorter than this are excluded.\n",
"* **scale**: bool, whether to scale the time series to get a more comparable feature vector. If true, each value of the time series is divided by the maximum value of the series.\n",
"* **method**: SearchMethodEnum, search method for hyper-parameter tuning.\n",
"* **executor**: Any, a callable parallel executor used to tune the individual candidate models in parallel. The default executor is a native implementation based on Python's multiprocessing.\n",
"* **error_method**: str, type of error metric. Supported values: mape, smape, mae, mase, mse, rmse.\n",
"* **num_trials**: optional, number of trials in RandomSearch.\n",
"* **num_arms**: optional, number of arms in RandomSearch."
],
"cell_type": "markdown",
"metadata": {}
},
{
"source": [
"### 1.1 Collect meta-data from a time series"
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"output_type": "error",
"ename": "ModuleNotFoundError",
"evalue": "No module named 'fbprophet'",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-1-2f7063eddc3b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mkats\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconsts\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mTimeSeriesData\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0;32mfrom\u001b[0m \u001b[0mkats\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmodels\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmetalearner\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_metadata\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mGetMetaData\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0;31m#load data and transform it into TimeSeriesData\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/fb_internal/Kats/kats/models/metalearner/get_metadata.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 24\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mpandas\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpd\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 25\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mkats\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconsts\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mParams\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mSearchMethodEnum\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mTimeSeriesData\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 26\u001b[0;31m \u001b[0;32mfrom\u001b[0m \u001b[0mkats\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmodels\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0marima\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mholtwinters\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprophet\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msarima\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mstlf\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtheta\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 27\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mkats\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtsfeatures\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtsfeatures\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mTsFeatures\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 28\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;32m~/fb_internal/Kats/kats/models/prophet.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mkats\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmodels\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmodel\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mm\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 11\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mpandas\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mpd\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 12\u001b[0;31m \u001b[0;32mfrom\u001b[0m \u001b[0mfbprophet\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mProphet\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 13\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0mkats\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mconsts\u001b[0m \u001b[0;32mimport\u001b[0m \u001b[0mParams\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mTimeSeriesData\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 14\u001b[0m from kats.utils.parameter_tuning_utils import (\n",
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'fbprophet'"
]
}
],
"source": [
"import pandas as pd\n",
"import sys\n",
"sys.path.append(\"../\")\n",
"\n",
"from kats.consts import TimeSeriesData\n",
"from kats.models.metalearner.get_metadata import GetMetaData\n",
"\n",
"# load data and convert it to TimeSeriesData\n",
"data = pd.read_csv(\"../data/air_passengers.csv\")\n",
"data.columns = [\"time\", \"y\"]\n",
"TSdata = TimeSeriesData(data)\n",
"\n",
"# create an object MD of class GetMetaData\n",
"MD = GetMetaData(data=TSdata, error_method='mape')\n",
"\n",
"# get meta data, as well as search method and type of error metric\n",
"my_meta_data = MD.get_meta_data()"
]
}
]
}
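
The traceback recorded for the cell above shows the import of `GetMetaData` failing because `fbprophet` could not be found in the environment that executed the notebook. As a minimal sketch, assuming only the module names that appear in that traceback, a check like the following could be run before the cell. `missing_modules` is a hypothetical helper written for illustration; it is not part of Kats.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    # find_spec returns None when no finder can locate a top-level module.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: these are the names taken from the traceback above
# ('kats' itself and the 'fbprophet' module that prophet.py imports).
required = ["kats", "fbprophet"]

missing = missing_modules(required)
if missing:
    print("Not importable in this environment:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

The check only reports whether a module can be located; it does not import it, so it will not surface errors raised while a package initializes.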