[ENH] Notebook and Template For Global Forecasting API #6699
Conversation
@Xinyu-Wu-0000 may I ask why you created the split like this (1):

```python
data = _make_hierarchical(
    (5, 100),
    n_columns=2,
    max_timepoints=10,
    min_timepoints=10,
)
l1 = data.index.get_level_values(1).map(lambda x: int(x[3:]))
X_train = data.loc[l1 < 90, "c0"].to_frame()
y_train = data.loc[l1 < 90, "c1"].to_frame()
X_test = data.loc[l1 >= 80, "c0"].to_frame()
y_test = data.loc[l1 >= 80, "c1"].to_frame()
y_test = y_test.groupby(level=[0, 1]).apply(lambda x: x.droplevel([0, 1]).iloc[:-3])
```

instead of doing it like this (2)?

```python
y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)
```

The latter approach splits at the innermost level, while the former splits at the 2nd level. Is there a reason for choosing this level? Is it related to global forecasting in some way?
Yes, global forecasting is the reason. The second approach splits the data at the time index level, but the first approach splits it at the instance level (1st or 2nd level). Global forecasting is about fitting and predicting on different instances.
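The difference between the two splits can be sketched with a plain pandas toy panel (no sktime needed; the instance names `ins0`..`ins2` and the 3-by-5 shape are illustrative assumptions, not taken from the PR):

```python
import numpy as np
import pandas as pd

# Toy panel: 3 instances ("ins0".."ins2"), 5 monthly time points each.
idx = pd.MultiIndex.from_product(
    [[f"ins{i}" for i in range(3)], pd.period_range("2020-01", periods=5, freq="M")],
    names=["instance", "time"],
)
rng = np.random.default_rng(0)
data = pd.DataFrame(
    {"c0": rng.normal(size=15), "c1": rng.normal(size=15)}, index=idx
)

# (1) instance-level split: train on some series, test on *different* series.
# This is the global-forecasting setting: "ins2" is never seen during fit.
inst = data.index.get_level_values("instance").map(lambda x: int(x[3:]))
train = data.loc[inst < 2]    # ins0, ins1
test = data.loc[inst >= 2]    # ins2 only

# (2) temporal split: every instance contributes its early time points
# to train and its late time points to test.
t = data.index.get_level_values("time")
cutoff = t.unique().sort_values()[3]
train_t = data.loc[t <= cutoff]   # 4 points per instance
test_t = data.loc[t > cutoff]     # 1 point per instance
```

Under split (1) the test instances are disjoint from the training instances, which is exactly what a global forecaster must handle; under split (2) every instance appears in both sets and only the time axis is held out.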
Made two comments on the extension template. For some reason, I cannot see the notebook in the GitHub UI; it says "Unable to render rich display".

I have an off-topic question about global forecasting. Suppose we train a model on series 1 and 2; global forecasting should then be able to predict for series 3, 4, etc. as well. If I pass no `y` in `predict`, and if I pass (just 1), (just 2), or (1 and 2 together), are we expecting the same predictions in all cases, if `random_state`-like parameters are set?
```python
# todo: add any hyper-parameters and components to constructor
def __init__(self, parama, paramb="default", paramc=None, broadcasting=True):
```
Are you suggesting that `broadcasting` must be a parameter, or do you want to enforce that it must be the last parameter? Currently it may suggest that all model-specific parameters must be added before mandatorily adding `broadcasting`.
I am not suggesting to enforce it.
sktime/extension_templates/forecasting_global_supersimple.py
Lines 116 to 129 in 1a33227
```python
# (for_global)
self.broadcasting = broadcasting
if self.broadcasting:
    self.set_tags(
        **{
            "y_inner_mtype": "pd.Series",
            "X_inner_mtype": "pd.DataFrame",
            "capability:global_forecasting": False,
        }
    )
# If you are extending an existing forecaster to global mode, you might
# need to use the broadcasting parameter to preserve the original behavior.
# You can use a deprecation cycle to switch the default behavior.
# How deprecation works in sktime can be found at
# https://www.sktime.net/en/stable/developer_guide/deprecation.html
```
```
    descriptive explanation of paramb
paramc : boolean, optional (default = whether paramb is not the default)
    descriptive explanation of paramc
and so on
```
I would suggest adding `broadcasting` here, and mentioning the role of that parameter to new contributors.

Question: when we make changes to an existing estimator, adding a `broadcasting` control makes sense. If we get a new estimator, specifically for global forecasting, do we still enforce this?
> I would suggest adding `broadcasting` here, and mentioning the role of that parameter to new contributors.

Yeah, I will add some explanation here.

> Question: when we make changes to an existing estimator, adding a `broadcasting` control makes sense. If we get a new estimator, specifically for global forecasting, do we still enforce this?

I think it depends on the specific forecaster; it should not be enforced.
I think it depends on the forecaster. If it's a simple fully connected network, I would expect it to give the same predictions. If there are dropout layers or other random layers, I am not sure it will give the same output. Even some convolution layers, RNN layers, or simple operations like `torch.Tensor.index_add_()` can be nondeterministic. According to the PyTorch documentation, it seems to be quite complicated; even if `random_state`-like parameters are set, I am not sure we can achieve deterministic output.

The order in which series are predicted could also affect the output. If we first predict on series 1, the state of the random generator changes, so the following predictions are affected by the first prediction. If we pass multiple series together, how they are batched could be really complicated.
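The order effect described above can be demonstrated without PyTorch, using a plain `random.Random` as a stand-in for a model's internal RNG. The `predict_with_noise` helper is hypothetical; the point is only that a shared RNG makes each series' prediction depend on what was predicted before it, even with a fixed seed:

```python
import random

def predict_with_noise(series_id, rng):
    """Toy 'prediction': a deterministic base value plus RNG-drawn noise."""
    return series_id * 10 + rng.random()

# Same seed, different prediction order.
rng_a = random.Random(42)
first_then_second = [predict_with_noise(1, rng_a), predict_with_noise(2, rng_a)]

rng_b = random.Random(42)
second_then_first = [predict_with_noise(2, rng_b), predict_with_noise(1, rng_b)]

# Series 1's prediction differs depending on whether it was drawn first
# or second, because the first call consumes random state.
pred1_order_a = first_then_second[0]   # series 1 predicted first
pred1_order_b = second_then_first[1]   # series 1 predicted second
```

So even with `random_state` fixed, predicting (just 1), (just 2), or (1 and 2 together) need not yield identical per-series outputs unless the forecaster re-seeds or uses an independent RNG stream per series.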
Made some minor changes related to terminology and formatting, hope this is ok.
To close #6575 and #6684.
The notebook is originally from #6551.
Copying some discussion:
fkiraly:
Great! FYI @benHeid, have you seen and reviewed this?
I mainly have comments about the notebook.
Some minor comments on the notebook content:
Regarding location, I would actually add the content to notebook 01c, which has some content already. There is some minor confusion about terminology; that notebook also uses the term "global", but in a different way.
In any case, we need to disambiguate terminology and perhaps adopt clearer distinctions here. @benHeid, what do you suggest on how we handle the terminology clash between 01c and 01d? And, should this go in the same notebook, so the "multiple instances" cases can be explained easily?
Originally posted by @fkiraly in #6551 (review)
shlok191:
@fkiraly, @Xinyu-Wu-0000. Yes, I have some minor input! In the `01d_forecasting_global_forecast.ipynb` and `01c_forecasting_hierarchical_global.ipynb` notebooks, "global learning" is used as a term. I agree that instead of "global learning", we can use "pretrained" and "cross-learning" as replacements. Here is how I'd differentiate them:
- To the best of my knowledge, pre-trained models do not require the training dataset time series to be correlated to each other.
- Referencing this paper on the M5 competition, I believe "cross-learning" is the term used for training models on time series which have a strong correlation. I think M5 was referenced in 01c, so it might be good to use "cross-learning" there instead of "global learning"!
Originally posted by @shlok191 in #6551 (comment)