
[ENH] Notebook and Template For Global Forecasting API #6699

Merged: 128 commits, Aug 10, 2024

Conversation

@XinyuWuu (Member) commented Jul 1, 2024

To close #6575 and #6684.

The notebook is originally from #6551.

Copying some discussion:


fkiraly:

Great! FYI @benHeid, have you seen and reviewed this?

I mainly have comments about the notebook.

  • could you separate this into another PR? The notebook is great, but there is some iteration needed regarding location and integration with the other notebooks. I think we can merge the forecasters already and don't need to delay.

Some minor comments on the notebook content:

  • there are a lot of printouts that confuse the reader. These should be silenced so you can focus on the didactic content.
  • the markdown text is nice, I would just format it so the lines are not too long, and I would also use shorter telegram style, like on ppt slides

Regarding location, I would actually add the content to notebook 01c, which has some content already. There is some minor confusion about terminology: the notebook also uses the term "global", but in a different way.

  • in the M5 paper, what the 01c notebook does is called "cross-learning"
  • the "global" in current 01d is more of a pre-training on other instances

In any case, we need to disambiguate terminology and perhaps adopt clearer distinctions here. @benHeid, what do you suggest on how we handle the terminology clash between 01c and 01d? And, should this go in the same notebook, so the "multiple instances" cases can be explained easily?

Originally posted by @fkiraly in #6551 (review)


shlok191:

@fkiraly, @Xinyu-Wu-0000. Yes, I have some minor input! Both the 01d_forecasting_global_forecast.ipynb and 01c_forecasting_hierarchical_global.ipynb notebooks use the term "global learning". I agree that instead of "global learning", we can use "pretrained" and "cross-learning" as replacements.

Here is how I'd differentiate them:

  • To the best of my knowledge, pre-trained models do not require the time series in the training dataset to be correlated with each other.

  • Referencing this paper on the M5 competition, I believe "cross-learning" is the term used for training models on time series that are strongly correlated. I think M5 was referenced in 01c, so it might be good to use "cross-learning" there instead of "global learning"!

Originally posted by @shlok191 in #6551 (comment)


@geetu040 (Contributor) commented:
@Xinyu-Wu-0000, may I ask why you created the split like this (1)

# import assumed from sktime's internal testing utilities
from sktime.utils._testing.hierarchical import _make_hierarchical

data = _make_hierarchical(
    (5, 100),  # 5 outer groups x 100 instances
    n_columns=2,
    max_timepoints=10,
    min_timepoints=10,
)
# extract the numeric part of the instance label at index level 1
l1 = data.index.get_level_values(1).map(lambda x: int(x[3:]))
X_train = data.loc[l1 < 90, "c0"].to_frame()
y_train = data.loc[l1 < 90, "c1"].to_frame()
X_test = data.loc[l1 >= 80, "c0"].to_frame()
y_test = data.loc[l1 >= 80, "c1"].to_frame()
# drop the last 3 timepoints of each test series (the horizon to predict)
y_test = y_test.groupby(level=[0, 1]).apply(lambda x: x.droplevel([0, 1]).iloc[:-3])

instead of doing it like this? (2)

y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)

The latter approach splits at the innermost (time) level and the former approach splits at the 2nd (instance) level. Is there a reason for choosing this level? Is it related to global forecasting in some way?

@XinyuWuu (Member, Author) commented Jul 12, 2024

Is there a reason for choosing this level? is it related to global forecasting in some way?

Yes, global forecasting is the reason. The second approach splits the data at the time index level, but the first approach splits the data at the instance level (the 1st or 2nd index level). Global forecasting is about fitting and predicting on different instances.
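As an illustration (a minimal sketch with plain pandas; the instance names and the `instance`/`time` level names are made up here, not sktime's `_make_hierarchical` output), the two split styles differ as follows:

```python
import numpy as np
import pandas as pd

# Toy panel: 3 instances x 4 monthly time points
idx = pd.MultiIndex.from_product(
    [["i0", "i1", "i2"], pd.period_range("2024-01", periods=4, freq="M")],
    names=["instance", "time"],
)
y = pd.DataFrame({"c1": np.arange(12.0)}, index=idx)

# (2) temporal split: each instance contributes its early timepoints to
# train and its late timepoints to test
t = y.index.get_level_values("time")
y_train_t, y_test_t = y[t <= "2024-02"], y[t > "2024-02"]

# (1) instance-level split: whole instances go to train or test, so the
# forecaster must predict for instances it never saw during fit
inst = y.index.get_level_values("instance")
y_train_i, y_test_i = y[inst != "i2"], y[inst == "i2"]

print(len(y_train_t), len(y_test_t))  # 6 6
print(len(y_train_i), len(y_test_i))  # 8 4
```

The temporal split keeps every instance visible at fit time; the instance-level split is what makes the setting "global".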

@yarnabrina (Member) left a comment:

Made two comments on the extension template. For some reason, I cannot see the notebook in the GitHub UI; it says "Unable to render rich display".

I have an off-topic question about global forecasting. Suppose we train a model for series 1 and 2; global forecasting should be able to predict for series 3, 4, etc. as well. If I pass no y in predict, versus (just 1), (just 2), or (1 and 2 together), are we expecting the same predictions in all cases, if random_state-like parameters are set?

}

# todo: add any hyper-parameters and components to constructor
def __init__(self, parama, paramb="default", paramc=None, broadcasting=True):
Member:

Are you suggesting that broadcasting must be a parameter, or do you want to enforce that it must be the last parameter? Currently it may suggest that all model-specific parameters must be added before mandatorily adding broadcasting.

Member Author:

I am not suggesting to enforce it.

# (for_global)
self.broadcasting = broadcasting
if self.broadcasting:
    self.set_tags(
        **{
            "y_inner_mtype": "pd.Series",
            "X_inner_mtype": "pd.DataFrame",
            "capability:global_forecasting": False,
        }
    )
# If you are extending an existing forecaster to global mode, you might
# need to use the broadcasting parameter to preserve the original behavior.
# You can use a deprecation cycle to switch the default behavior.
# How deprecation works in sktime: https://www.sktime.net/en/stable/developer_guide/deprecation.html
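As a hedged illustration of the mechanics above (the class and its tag handling are simplified stand-ins, not actual sktime internals), a broadcasting flag can downgrade the capability tag at construction time:

```python
# Illustrative sketch of toggling the global-forecasting capability tag
class MyGlobalForecaster:
    # class-level default tags; the class natively supports global forecasting
    _tags = {"capability:global_forecasting": True}

    def __init__(self, broadcasting=True):
        self.broadcasting = broadcasting
        self._tags_dynamic = dict(self._tags)
        if self.broadcasting:
            # broadcast per series: inner fit/predict see a single series,
            # so the global-forecasting capability is switched off
            self._tags_dynamic["capability:global_forecasting"] = False

    def get_tag(self, tag_name):
        return self._tags_dynamic[tag_name]

print(MyGlobalForecaster(broadcasting=True).get_tag("capability:global_forecasting"))   # False
print(MyGlobalForecaster(broadcasting=False).get_tag("capability:global_forecasting"))  # True
```

With `broadcasting=True` as the default, an existing forecaster keeps its original per-series behavior until a deprecation cycle flips the default.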

descriptive explanation of paramb
paramc : boolean, optional (default= whether paramb is not the default)
descriptive explanation of paramc
and so on
Member:

I would suggest adding broadcasting here, and mentioning the role of that parameter to new contributors.

Question: when we make changes to an existing estimator, adding a broadcasting control makes sense. If we get a new estimator, built specifically for global forecasting, do we still enforce this?

Member Author:

I would suggest adding broadcasting here, and mentioning the role of that parameter to new contributors.

Yeah, I will add some explanation here.

Question: when we make changes to an existing estimator, adding a broadcasting control makes sense. If we get a new estimator, built specifically for global forecasting, do we still enforce this?

I think it depends on the specific forecaster; it is not enforced.

@XinyuWuu (Member, Author) commented Aug 2, 2024

For some reason, I cannot see the notebook in the GitHub UI; it says "Unable to render rich display".

You may want to use the "View file" option.

[screenshot: the "View file" option in the GitHub diff menu]

@XinyuWuu (Member, Author) commented Aug 2, 2024

I have an off-topic question about global forecasting. Suppose we train a model for series 1 and 2; global forecasting should be able to predict for series 3, 4, etc. as well. If I pass no y in predict, versus (just 1), (just 2), or (1 and 2 together), are we expecting the same predictions in all cases, if random_state-like parameters are set?

I think it depends on the forecaster. If it is a simple fully connected network, I would expect it to give the same predictions. If there are dropout layers or other random layers, I am not sure it will give the same output. Even some convolution layers, RNN layers, or simple operations like torch.Tensor.index_add_() can be nondeterministic. According to the PyTorch documentation, it seems to be quite complicated. Even if random_state-like parameters are set, I am not sure we can achieve deterministic output. The order in which series are predicted could also affect the output: if we first predict on series 1, the state of the random generator changes, so the following predictions are affected by the first prediction. And if we pass multiple series together, how they are batched could be really complicated.
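The order effect can be sketched with plain numpy (illustrative only; `noisy_predict` is a made-up stand-in for a forecaster with a stochastic layer, not sktime or PyTorch code):

```python
import numpy as np

def noisy_predict(series, rng):
    # stand-in for a forecaster with a stochastic layer (e.g. dropout):
    # each call consumes one draw from the shared generator
    return float(np.mean(series)) + rng.normal(scale=0.1)

s1, s2 = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]

rng = np.random.default_rng(0)
first_s1 = noisy_predict(s1, rng)   # s1 predicted first

rng = np.random.default_rng(0)      # same seed, different order
_ = noisy_predict(s2, rng)          # s2 consumes the first draw
second_s1 = noisy_predict(s1, rng)  # s1 now sees a different generator state

print(first_s1 == second_s1)  # False: order changed the prediction for s1
```

Even with the seed fixed, the prediction for series 1 depends on whether another series was processed before it, because both share one generator state.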

@fkiraly (Collaborator) left a comment:

Made some minor changes related to terminology and formatting, hope this is ok.

@fkiraly fkiraly merged commit b1d951d into sktime:main Aug 10, 2024
16 of 17 checks passed
Labels: none
Project status: Done
Linked issue this PR may close: [DOC] Global Forecast Example
4 participants