Update metric to handle y_train #858

Merged: 27 commits merged on Jun 3, 2021

Contributor

@RNKuhns RNKuhns commented May 4, 2021

Reference Issues/PRs

This addresses functionality in #712.

What does this implement/fix? Explain your changes.

Add functionality to accept y_train in __call__ method and pass to underlying function if it requires y_train.

Also updated underlying metric classes to inherit from BaseMetric.

Does your contribution introduce a new dependency? If yes, which one?

What should a reviewer concentrate their feedback on?

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
For new estimators
  • I've added the estimator to the online documentation.
  • I've updated the existing example notebooks or provided a new one to showcase how my estimator works.

@RNKuhns
Contributor Author

RNKuhns commented May 7, 2021

@mloning and @fkiraly the more I look at this while trying to simplify the unit tests for these metrics, the more sense it makes to me that the metric functions in _functions.py should also be updated to accept y_train via `**kwargs`, so that we have a uniform interface.

The difference for users passing information to the functions will be minimal.

The benefit is that y_train would be passed to the function the same way as to the __call__ of the class version of the metric:

mean_absolute_scaled_error(y_true, y_pred, y_train=y_train)  # plus any other keyword args, like multioutput
mase = MeanAbsoluteScaledError()
mase(y_true, y_pred, y_train=y_train)

versus the current setup, where the function accepts y_train as a positional argument and the class version of the metric accepts it as a keyword:

mean_absolute_scaled_error(y_true, y_pred, y_train)  # plus any other keyword args, like multioutput
mase = MeanAbsoluteScaledError()
mase(y_true, y_pred, y_train=y_train)

Note that I'll apply whatever we decide for y_train to the metrics that accept y_pred_benchmark as well.

I've got this coded up but wanted to get your feedback on the approach before committing and pushing to GitHub.
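
The uniform interface proposed in this comment can be sketched as follows. This is a minimal, pure-Python illustration and not sktime's implementation: the function and class reuse the names from the discussion, but their bodies, the `sp` parameter, and the `_mean` helper are assumptions made for the example.

```python
def _mean(values):
    """Arithmetic mean of an iterable (hypothetical helper)."""
    values = list(values)
    return sum(values) / len(values)


def mean_absolute_scaled_error(y_true, y_pred, sp=1, **kwargs):
    """Simplified stand-in for MASE that reads y_train from **kwargs."""
    y_train = kwargs.get("y_train")
    if y_train is None:
        raise ValueError("y_train must be passed as a keyword argument.")
    # Scale: in-sample MAE of the (seasonal) naive forecast on y_train.
    naive_mae = _mean(abs(a - b) for a, b in zip(y_train[sp:], y_train[:-sp]))
    mae = _mean(abs(t - p) for t, p in zip(y_true, y_pred))
    return mae / naive_mae


class MeanAbsoluteScaledError:
    """Class version: __call__ forwards keyword arguments to the function."""

    def __init__(self, sp=1):
        self.sp = sp

    def __call__(self, y_true, y_pred, **kwargs):
        return mean_absolute_scaled_error(y_true, y_pred, sp=self.sp, **kwargs)


y_train = [1.0, 2.0, 4.0, 7.0]
y_true = [9.0, 12.0]
y_pred = [8.0, 14.0]

# Function and class are now called with the same keyword signature.
func_result = mean_absolute_scaled_error(y_true, y_pred, y_train=y_train)
class_result = MeanAbsoluteScaledError()(y_true, y_pred, y_train=y_train)
```

Because both entry points take y_train as a keyword, a single parametrized test can exercise either one without special-casing the call.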

@fkiraly
Collaborator

fkiraly commented May 11, 2021

@mloning and @fkiraly the more I look at this while trying to simplify the unit tests for these metrics, the more sense it makes to me that the metric functions in _functions.py should also be updated to accept y_train via `**kwargs`, so that we have a uniform interface.

I've got this coded up but wanted to get your feedback on the approach before committing and pushing to GitHub.

Yes, very much agreed! I think this indeed makes unit tests easier, and any other exercise that requires a unified interface on the function level.

@RNKuhns
Contributor Author

RNKuhns commented May 12, 2021

@fkiraly that sounds great. I think this should basically be ready for review. But I see the PR is failing the manylinux build. @mloning the details suggest there is an issue with numba versioning in the Linux builds, but I'm not entirely sure what is going on there.

@RNKuhns RNKuhns marked this pull request as ready for review May 12, 2021 00:29
@RNKuhns RNKuhns requested review from aiwalter and mloning as code owners May 12, 2021 00:29
Contributor

@mloning mloning left a comment

Thanks @RNKuhns - the manylinux CI will hopefully be fixed in #870.

I had a first look at the code and left a few minor comments below.

sktime/performance_metrics/forecasting/_classes.py (outdated, resolved)
sktime/performance_metrics/forecasting/_functions.py (outdated, resolved)
sktime/performance_metrics/forecasting/_functions.py (outdated, resolved)
sktime/utils/validation/forecasting.py (outdated, resolved)
sktime/utils/validation/forecasting.py (outdated, resolved)
sktime/utils/validation/forecasting.py (outdated, resolved)
Contributor

@mloning mloning left a comment

@RNKuhns I had a look at the changes. I think we're almost there, just a few more comments.

sktime/performance_metrics/base/_base.py (resolved)
sktime/performance_metrics/forecasting/_functions.py (outdated, resolved)
sktime/utils/validation/forecasting.py (outdated, resolved)
@RNKuhns
Contributor Author

RNKuhns commented May 19, 2021

@mloning I think I've incorporated all your comments. As a bonus I've tweaked all the docstrings to better align with NumPy conventions and pass the pydocstyle checks.

I also opened an issue related to the creation of a BaseObject per our discussion above. I can start tackling that next.

@RNKuhns RNKuhns mentioned this pull request May 19, 2021
mloning previously approved these changes May 20, 2021
Contributor

@mloning mloning left a comment

Looks all good to me - I'll merge in the next few days in case anyone else wants to take a look!

Contributor

@mloning mloning left a comment

Hi @RNKuhns, we should also update evaluate in sktime.forecasting.model_evaluation to pass y_train as a keyword argument to the scoring: https://github.com/alan-turing-institute/sktime/blob/c611e6a3587d7b3a44cb7deefd4b6baa4897fc9b/sktime/forecasting/model_evaluation/_functions.py#L106

If we do that here, we should also add a metric that requires y_train to the current test cases here (perhaps add MASE and remove MSE which doesn't add much as an additional test case): https://github.com/alan-turing-institute/sktime/blob/c611e6a3587d7b3a44cb7deefd4b6baa4897fc9b/sktime/forecasting/model_evaluation/tests/test_evaluate.py#L80
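
The change requested above can be illustrated with a small sketch of an evaluation loop that passes y_train to the scoring callable as a keyword argument. Everything here is hypothetical and written for illustration (`evaluate_sketch`, `last_value_forecast`, `scaled_mae`, `plain_mae`, and the index-based splits); it is not the code in sktime's `evaluate`.

```python
def evaluate_sketch(y, splits, forecast, scoring):
    """Score one forecast per (train_idx, test_idx) split."""
    scores = []
    for train_idx, test_idx in splits:
        y_train = [y[i] for i in train_idx]
        y_test = [y[i] for i in test_idx]
        y_pred = forecast(y_train, len(y_test))
        # y_true first, y_pred second, y_train always as a keyword.
        scores.append(scoring(y_test, y_pred, y_train=y_train))
    return scores


def last_value_forecast(y_train, horizon):
    """Naive forecaster: repeat the last observed value."""
    return [y_train[-1]] * horizon


def plain_mae(y_true, y_pred, **kwargs):
    """Metric that does not need y_train and simply ignores it."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def scaled_mae(y_true, y_pred, **kwargs):
    """Metric that requires y_train: MAE scaled by the naive in-sample MAE."""
    y_train = kwargs["y_train"]
    diffs = [abs(a - b) for a, b in zip(y_train[1:], y_train[:-1])]
    return plain_mae(y_true, y_pred) / (sum(diffs) / len(diffs))


y = [1.0, 2.0, 4.0, 7.0, 9.0, 12.0]
splits = [(range(0, 3), range(3, 4)), (range(0, 4), range(4, 5))]

scaled_scores = evaluate_sketch(y, splits, last_value_forecast, scaled_mae)
plain_scores = evaluate_sketch(y, splits, last_value_forecast, plain_mae)
```

Passing y_train unconditionally keeps the loop free of per-metric branching; metrics that do not need it absorb it through `**kwargs`.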

@RNKuhns
Contributor Author

RNKuhns commented May 22, 2021

@mloning I'm working on the update to evaluate.

I ran into two minor hitches. The first is that in the call to scoring within evaluate() the y_true and y_pred arguments were flipped (this doesn't really matter for most metrics, but I fixed it).

The bigger hitch is that the test cases include in-sample predictions. In MeanAbsoluteScaledError() we have a check to make sure that y_train is before y_true. This makes sense in the context of forecast evaluation in general (and MASE's definition). But some of the test cases look at in-sample predictions, and in those cases the check fails; hence, the tests fail.

I can think of two options for approaching this:

  1. We have a separate test function for MASE that only tests the common configs with out-of-sample forecasting horizons, or we simply skip the MASE tests when the forecasting horizon is negative
  2. We remove the check in mean_absolute_scaled_error to require y_train prior to y_true

I am in the camp that MASE is designed to be applied when evaluating out-of-sample forecasts so we just include a line of code to exclude tests of evaluate for MASE when the forecasting horizon is negative (in-sample). That still ensures the MASE functionality is working in a way that makes sense for users (the whole evaluating in-sample forecasts is not a good gauge of the model's future predictive ability thing). But what do you think?

I could also figure out how to add a quick check in evaluate to raise an informative error if the user tries to use a metric that requires y_train with a CV object with a negative forecasting horizon (I'll also add a note in the docstring).
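
The "quick check" floated in the last paragraph might look roughly like the sketch below. The `requires_y_train` attribute, the stub class, and the function name are hypothetical, and the sketch assumes relative horizons where steps less than or equal to zero are in-sample.

```python
class ScaledMetricStub:
    """Hypothetical metric object that declares it needs y_train."""

    requires_y_train = True

    def __call__(self, y_true, y_pred, **kwargs):
        raise NotImplementedError("Stub: only the flag matters for this sketch.")


def check_scoring_against_fh(scoring, fh):
    """Raise if a y_train-based metric is combined with in-sample steps."""
    needs_y_train = getattr(scoring, "requires_y_train", False)
    # Assumption: relative horizon steps <= 0 refer to in-sample points.
    in_sample_steps = [step for step in fh if step <= 0]
    if needs_y_train and in_sample_steps:
        raise ValueError(
            f"{type(scoring).__name__} requires y_train and is meant for "
            f"out-of-sample forecasts, but fh contains in-sample steps: "
            f"{in_sample_steps}."
        )


# An all-out-of-sample horizon passes silently.
check_scoring_against_fh(ScaledMetricStub(), fh=[1, 2, 3])

# A horizon with in-sample steps produces an informative error.
try:
    check_scoring_against_fh(ScaledMetricStub(), fh=[-2, -1, 1])
    error_message = None
except ValueError as err:
    error_message = str(err)
```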

RNKuhns added 3 commits May 23, 2021 09:55
Test tune was creating scorer from scikit-learn
mean_squared_error. Using sktime metric class now.

Also updated documentation of
make_forecasting_scorer to make input function
signature clear.
@mloning
Contributor

mloning commented May 24, 2021

@RNKuhns I agree, but perhaps the check that y_train is prior to y_true is a bit overeager, and we make everything a bit simpler by not enforcing that. In principle MASE still works for in-sample predictions, so perhaps we shouldn't exclude that option. This moves the responsibility to users for making sure they're evaluating models on genuine forecasts, which I think is fine. But happy to follow your lead here @RNKuhns.

@fkiraly
Collaborator

fkiraly commented May 25, 2021

@RNKuhns, I'm with @mloning on this one - in my opinion, the metric should just model the "bare" mathematical/scientific object, not make any checks regarding a plausible use case. This is because of user expectations (they will expect the code object to behave as the scientific one) and because of the domain-driven design principle of following this mapping.

A secondary point - in my opinion not the main argument here, but it's to the same effect - is that it can make sense in principle for a metric to have training and test time points overlapping, for example when you are trying to estimate over-optimism of in-sample estimates. We shouldn't exclude a rare sensible use case.
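
For reference, the "bare" object under discussion can be written out under the usual definition of MASE. The symbols are notation introduced here, not taken from the thread: $H$ is the set of evaluated time points, $n$ the length of the training series, and $m$ the seasonal periodicity.

```latex
\[
\mathrm{MASE}
  = \frac{\dfrac{1}{|H|} \sum_{t \in H} \lvert y_t - \hat{y}_t \rvert}
         {\dfrac{1}{n - m} \sum_{t = m + 1}^{n}
            \lvert y^{\text{train}}_t - y^{\text{train}}_{t - m} \rvert}
\]
```

The expression only needs the evaluated pairs $(y_t, \hat{y}_t)$ and the training series; nothing in it requires the points in $H$ to lie after the training period.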

Contributor

@mloning mloning left a comment

Thanks @RNKuhns - looks all good to me now! Will merge in the next few days in case anyone else wants to take a look.

@mloning mloning merged commit 8d0b94b into sktime:main Jun 3, 2021
@mloning
Contributor

mloning commented Jun 3, 2021

Now merged - great work @RNKuhns 🎉

mloning added a commit that referenced this pull request Sep 5, 2021
* fixes and capabilities

* stc contract name

* arsenal and hive-cote tests

* linting

* linting2

* hc1 test imports

* hc1 test config update

* arsenal proba rounding

* doc strings for boss classifier

* loop variable changed to i

* boss and cboss doc string

* boss and cboss docstring 2

* muse doc strings 1

* muse public to supposedly private methods

* docstrings TDE

* dictionary based doc strings

* dictionary based doc strings 2

* distance based doc strings (EE) v1.

* distance based docs (PF) 2

* Distance based docs (PF) 2

* Docstrings distance based PF

* distance based PF 4

* Doc strings for distance based

* doc string distance based

* doc strings hybrid classifiers

* docstring interval and hybrid

* all classifiers doc string first pass

* docstring in contrib v1

* docstrings contrib2 - remove shape_dtw experiments file

* doc  strings final?

* doc string

* Remove step_length hyper-parameter from reduction classes (#900)

* Update performance metrics to handle y_train (#858)

* Updated forecasting metrics to handle y_train

* Unified interface for metric functions and classes

* Raise NotImplementedError in check_scoring for
metrics that require y_pred_benchmark

* Switched order of check_scoring checks

* Updated check_scoring handling of y_pred_benchmark

* Changed reference of loss func or class to metric

* Tweaked tags and improved error handling

* Added comment to BaseMetric about _all_tags

* Tweaked _get_kwarg and docstrings

* Tweaked docstrings

* Fixed check_scoring validation

* Added @RNKuhns as performance metric code owner

* Fixed docstrings

* Added config to skip test files in pydocstyle

* Tweaked performance_metric docs

* Switched doc automodule back to prior setup

* Update evaluate to work with new metric interface

* Remove make_forecasting_scorer from test_tune

Test tune was creating scorer from scikit-learn
mean_squared_error. Using sktime metric class now.

Also updated documentation of
make_forecasting_scorer to make input function
signature clear.

* Added comment clarifying tests using MASE

* Removed check on y_train in scaled metrics

* Tweak performance metrics used in test_evaluate

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Fix fbprophet type conversion (#911)

* Fix https://github.com/alan-turing-institute/sktime/issues/910

* Docstrings fix

Co-authored-by: Martin Walter <mf-walter@web.de>

* Add plot_correlations() to plot series and acf/pacf (#850)

* Add plot_correlations to plot series and acf/pacf

* Fixed output of plot_series to handle kwarg ax

* Added input check to plot_correlations

* Updated docstrings to match pydocstyle conventions

* Updated plot_series docstring default args

* Add kwargs to set axes titles to plot_series

* Added dependency on pytest-mpl for plot tests

* Add baseline images for plot comparison unit tests

* Updated setup.cfg to exclude tests from pydocstyle

* Added plot_correlations to docs

* Removed pytest-mpl dependency and tests

* Update plotting unit tests

* Fixed plot_correlations docs

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* minor bugfix - reducer sets _is_fitted to False before input checks

* same for multiplexer

* and pmdarima

* added initial _is_fitted=False everywhere now

* also _is_fitted done in prophet.fit

* detrender

* deseasonalizer

* BaseGridSearch

* Forecast base class refactor and extension template (#912)

This PR is addressing the forecaster base class proliferation discussed in #510, by simplifying the base class inheritance tree and streamlining the private method logic.

Specifically, I've moved the logic from `_SktimeForecaster` and the forecasting horizon mixins into `BaseForecaster`, and adopted uniformly the principle of having a public method (`fit`, `predict`, etc) with checks/plumbing, dispatching to a "core logic" version (`_fit`, `_predict`, etc) where validated arguments can be assumed.

The burden on extenders becomes much lighter, since it is now possible to only focus on the "core logic" implementation, instead of having to keep in mind a myriad of inconsistent and constantly shifting conventions in checks and other plumbing.

As side effects, if we get this right, this should make a few things on the roadmap easier:
* the extension guidelines #464. Right now, the aforementioned implicit conventions are too many and intricate to write useful extension guidelines, in my opinion.
* extending to the multivariate case
* input/output checks and the eternal data container discussion, this can go in the plumbing

As a proof-of-concept regarding ease of extension, this PR also contains a highly annotated extension template in the `extension_templates` folder.

In terms of review, the key file is `forecasting/base/_base`, with corresponding changes (contraction and deletion) in `base/_sktime`.

I've tried to keep the interface consistent as much as possible (only changing internal logic).

Interface contracts with all the earlier estimators are still honoured, via loopthroughs and default behaviour that ensures that everything still works if `fit`, `predict` etc are overridden by the current descendants, as opposed to `_fit` and `_predict`.
The only change I had to make to descendants is to set `self._is_fitted=False` at the start of `fit`, which is a minimally invasive change that's also separately reviewable as PR #941.
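
The public/private split described in this commit message can be sketched in a few lines. The classes below are hypothetical stand-ins written for illustration (`BaseForecasterSketch`, `LastValueForecaster`, `_check_y`); they are not sktime's `BaseForecaster`, and only mirror the idea that `fit` and `predict` handle checks while `_fit` and `_predict` hold the core logic.

```python
class BaseForecasterSketch:
    """Public methods do checks/plumbing, then dispatch to core logic."""

    def __init__(self):
        self._is_fitted = False

    def fit(self, y):
        self._is_fitted = False  # reset before any input checks
        y = self._check_y(y)
        self._fit(y)  # core logic can assume validated input
        self._is_fitted = True
        return self

    def predict(self, fh):
        if not self._is_fitted:
            raise RuntimeError("Call fit before predict.")
        return self._predict(list(fh))

    @staticmethod
    def _check_y(y):
        y = [float(value) for value in y]
        if not y:
            raise ValueError("y must contain at least one observation.")
        return y

    def _fit(self, y):
        raise NotImplementedError("Extenders implement _fit.")

    def _predict(self, fh):
        raise NotImplementedError("Extenders implement _predict.")


class LastValueForecaster(BaseForecasterSketch):
    """An extender only writes the core logic."""

    def _fit(self, y):
        self._last = y[-1]

    def _predict(self, fh):
        return [self._last for _ in fh]


forecaster = LastValueForecaster().fit([1, 2, 3])
predictions = forecaster.predict([1, 2])
```

The extender never touches input validation or the fitted-state bookkeeping, which is the reduction in burden the message describes.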

* Properly process random_state when fitting Time Series Forest ensemble in parallel (#819)

* Properly process random_state when fitting ensemble in parallel

* Fix test for a new random_state initialization

* Suggested changes

Co-authored-by: Oleksii Kachaiev <okachaiev@riotgames.com>
Co-authored-by: Markus Löning <markus.loning@gmail.com>
Co-authored-by: mloning <markus.loning.17@ucl.ac.uk>
Co-authored-by: Matthew Middlehurst <pfm15hbu@uea.ac.uk>

* Exemplary concrete estimator refactor post interface refactor, of NaiveForecaster (#953)

This is an exemplary refactor of a concrete estimator class, `NaiveForecaster`, to explore how general concrete forecaster refactors would work, along the lines discussed in #912.

This PR changes:

* `_BaseWindowForecaster` still inherits from `BaseForecaster` directly, and already looked extension spec compliant (no overrides, no tags)
* `NaiveForecaster` inherits from `_BaseWindowForecaster`, and has been made extension spec compliant by adding the `requires-fh-in-fit` tag, and moving core logic from `fit` to `_fit`, while avoiding overriding `fit`
* it adds one line in `BaseForecaster.fit`, as a general change: the `cutoff` is set to the latest `y` index using `_set_cutoff`. This is done pre-`_fit`, which could override the set cut-off.

* catch22 Remake (#864)

* catch22 remake

* catch22 numba imports

* c22pr formatting 1

* c22pr formatting 2

* c22pr formatting 3

* c22pr formatting 4

* catch22 example import fix

* Lower n_estimators for interval example

* almost equal for c22 test

* faster tests and examples

* faster tests and examples 2

* c22 n_jobs and cif alternative classifiers

* code quality

* outlier normalisation

* 1 line

* inclusion in init

* cacheing for numba, pickle fix for cit

* c22f test update

* cif comments

* base estimator change

* update cif tests

* rounding catch22 in cif

* stray prints

* os diff fix attempt

* cq

* test updates

* cq2

* docs and estimators update

* c22f outlier norm false

* suggested pr changes

* move tsf CIT to contrib

* cq1

* c22 private functions

Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* Fix fh in imputer method based on in-sample forecasts (#861)

* Changed fh_ins selection.

* Created a function for univariate imputation based on the forecaster.

* Turned _univariate_forecaster() into a function rather than a method. Renamed it and corrected the loop and if statements locations.

* Review, suggestions

* Fix the docstrings by running pydocstyle.

* minor changes

Co-authored-by: mloning <markus.loning.17@ucl.ac.uk>
Co-authored-by: Markus Löning <markus.loning@gmail.com>
Co-authored-by: Martin Walter <mf-walter@web.de>

* Update plot_series to handle pd.Int64 and pd.Range index uniformly (#892)

* check for indices added

* typecasting modified

* check for indices added

* check for indices added

* removed union operation from check_consistent_index_types

* Updated docs to comply with pydocstyle

* comments modified

* Trigger CI Checks

* Add unit test

* Minor fixes after self-review

* Fix tests

Co-authored-by: Sakshi Bhasin <sbhasin@sbhasin-ltmo42k.internal.salesforce.com>
Co-authored-by: mloning <markus.loning.17@ucl.ac.uk>

* Update contributors

* adding fkiraly as codeowner for forecasting base classes (#989)

adding fkiraly as additional codeowner for `forecasting.base` directory

* added Martina's homepage/contact link

* Bump nbqa version (#998)

* Feature/information criteria get_fitted_params (#942)

Adding the available information criteria (AIC, AICc, BIC, HQIC) for various forecasting algorithms.
TBATS and BATS did not have a get_fitted_params method so I added it in the adapter there.

* refactors HCrystalBall Forecaster (#1004)

refactor of HCrystalBallForecaster, see #955

* Refactor Prophet (#1005)

refactor of Prophet forecaster, see #955

* Forecasting tutorial rework (#972)

This pull request contains a re-work of the forecasting tutorial which should make it more structured, complete, and accessible - from what right now looks like a patchwork piece that has organically proliferated.

Main changes:

* re-structuring in chapters - basic workflows, selected estimators, composition (reduction, pipelines)
* expanded explanation of what happens in sections/chapters, added some code for object inspection
* factored out the "pitfalls" in a separate notebook, also expanded - focus is now on use of `sktime` primarily

* refactors PmdArimaAdapter (#1016)

to new forecaster interface, see #955

* refactors tbatAdapter (#1017)

to new forecaster interface, see #955

* Added two new related software packages (#1019)

* Fixing soft dependencies link (#1035)

* Refactors ensembler, online-ensembler-forecaster and descendants (#1015)

as per #955

* Refactor reducer (#1031)

according to #955

* refactors statsmodels, theta forecaster (#1029)

according to #955

* Refactoring Stacking, Multiplexer, Ensembler and TransformedTarget Forecasters (#977)

as per #955

* Refactors polynomial trend forecaster (#1003)

according to #955

* Forecasting tutorial rework - with cell content (#1037)

This is the forecasting tutorial with all cell contents, see #972.

No other changes except cell content in the main forecasting tutorial added.

* Revert "Forecasting tutorial rework - with cell content (#1037)" (#1051)

This reverts commit 19e86b8fcf3b6d96c967e91657f63281f40907aa.

* Multivariate Detrending (#1042)

Fixes #1030 . See also #1038 .

This would be a quick fix to #1030 as it adds functionality for multivariate series to the detrender. As discussed with @mloning I took inspiration from how this was done for the imputer. But I also like the idea to have compositions for multivariate series as it is described in #1038

* Update contributors

* Added tuning tutorial to forecasting example notebook - fkiraly suggestions on top of #1047 (#1053)

Changes to forecasting tutorial notebook by @aiwalter, with contributions by @fkiraly:
new section 3.4.2 on advanced pipeline building, introducing `OptionalPassthrough` for autoML

* default tags in BaseForecaster; added some new tags (#1013)

Adds some default tag values in `BaseForecaster` - these are automatically inherited via `_all_tags`.

Also adds two new tags that I anticipate we will need:
* `"X-y-must-have-same-index"` - do `X` and `y` need to have the same index?
* `"enforce-index-type"` argument to `check_X`/`check_y` if needed - this has to be a tag now, see the problem in #1005 .

I've also added these new tags to the extension template.

This PR also:
* corrects a minor bug: mis-spelled tags in the extension template
* adds the new tags to the allowed tags in tests
* removes the requirement for tags to be boolean

* Clustering (#1049)

* basic functionality

* basic kmeans implementation

* Improved the base code for k means

* Improved the base code for k means

* Renamed mislabeled mixins

* Removed accidental edit

* Delete pyvenv.cfg

* Delete python3.9

* Delete easy_install

* Delete easy_install-3.9

* Delete python3

* Delete python

* Delete pip

* Delete pip3

* Delete pip3.9

* Fixed linting

* Barycenter averaging implemented

* Update .all-contributorsrc

* Update .all-contributorsrc

* Fixed tests

* Fixed warning from sklearn

* Added k medoids and reformatted structure

* basic functionality

* basic kmeans implementation

* Improved the base code for k means

* Improved the base code for k means

* Renamed mislabeled mixins

* Removed accidental edit

* Delete pyvenv.cfg

* Delete python3.9

* Delete easy_install

* Delete easy_install-3.9

* Delete python3

* Delete python

* Delete pip

* Delete pip3

* Delete pip3.9

* Fixed linting

* Barycenter averaging implemented

* Fixed tests

* Update .all-contributorsrc

* Update .all-contributorsrc

* Fixed warning from sklearn

* Added k medoids and reformatted structure

* rebased from main

* first refactor of experiments

* some doc strings added

* removed dependency and reverted accidental delete

* fixed base estimator

* adding all to clustering directory

* clustering examples

* Update clustering_examples.py

* updated defaults

* Updated docstrings

* clustering experiments changes

* add check_X to partition  clusterer

* formatting

* formatting 2

* remove mystery whitespace comment

* Final cluster update before Main (#979)

* Followed feedback given on pr. Additionally added the unit test for BaseCluster as a new estimator

* fixed error in merge

* Suggested changes and improved unit tests (#1000)

* Followed feedback given on pr. Additionally added the unit test for BaseCluster as a new estimator

* fixed error in merge

* Made Markus changes and improved unit tests

* fixed softdepend error

* removed extra

* reverted accidental change

* fixed bug

* Removed softdependency

* Clustering (#1045)

* Reverted everything and added only the files I've changed back in to hopefully fix the pr

* Reverted everything and added only the files I've changed back in to hopefully fix the pr

* Changed name back

* Update .all-contributorsrc

* Clustering small changes (#1054)

* Reverted everything and added only the files I've changed back in to hopefully fix the pr

* Reverted everything and added only the files I've changed back in to hopefully fix the pr

* Changed name back

* Final Markus changes and updated some docs

* fixed linting

* removed upsupport annotations

* softdependencies updated

* softdependencies updated

Co-authored-by: Chris Holder <chrisholder987@hotmail.com>

* Add ThetaLines transformer (#923)

* added empty theta.py file

* Added transform and _theta_transform methods

* Added input checks and examples to the docstr of the class

* added simple tests

* Updated _check_theta function

* Updated _check_theta function and test_theta.py

* Added test and example theta_transform.ipynb

* Updated tests, theta.py and example notebook

* updated _check_theta to return list

* Updated example notebook and transform method

* Fixed kernelspec in ipynb

* Updated _tags and example docstr

* Updated example docstr

* minor clarifications in forecasting extension template preamble (#1069)

* BaseCluster class issues resolved (#1075)

* renamed `BaseCluster` to `BaseClusterer`
* `fit_predict` is now in `BaseClusterer`, deprecated mixin

* sktime-registry - without moving all_estimators (#1067)

This PR moves the following things to a central `sktime` registry:

* scitype strings, estimator base class references and look-up
* tag names, the scitype which the tag is for, expected tag types

The registry also records plain English explanations, so it's runtime inspectable without having to dive through code files and comments in the code.

This should simplify test workflows (e.g., in `_config`), and retrieval workflows such as `all_estimators` or the automated overview of learning algorithm #704, filtered, say, by scitype or tag.

* Forecasters re-factoring: BaseGridSearch, ForecastingGridSearchCV, ForecastingRandomizedSearchCV (#1034)

Part of refactor #955

#### What does this implement/fix? Explain your changes.
- ❗ **added `self.check_is_fitted()` to `BaseForecaster`'s `update_predict` fn**
- changed `fit` to private `_fit`
- made sure `\forecasting\model_selection\tests\test_tune.py` pass
- added `_tags`: used the (conservative) values from the extension template

* Fix side effect on input for Imputer and HampelFilter (#1089)

Fixes side effect on input for Imputer and HampelFilter

* Implementation of signature based methods (#714)

* Base signature commit.

* Added signature classes to soft dependencies + contrib update

* ran black on the signature files

* Added prefer binary flag to esig

* Was missing window + esig tosig error

* Flake8 error

* Updated to esig 0.9.1

* Updated esig version to 0.9.4

* Bump esig

* Base signature commit.

* Bumped esig, rebased

* ran black on the signature files

* Was missing window + esig tosig error

* Flake8 error

* Bumped esig, rebased

* Bumped esig, rebased

* esig modification

* esig 0.9.6 wont install, trying 0.95

* Bump esig

* update jupyter for black

* change docstring test

* change docstring test

* Updated to new branch

* Newline in ipynb that is needed for some reason

* Update sktime/transformations/panel/signature_based/_signature_method.py

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Update sktime/classification/signature_based/_signature_classifier.py

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Franz and markus comments

* contrib

* Updated docs

* Fixed bug in signature testing

* Fixed bug in signature testing

* Fixed bug in signature testing

* Oops

* Updated example to new naming

* Fixed case where aug_list is given a string

* Update build_tools/run_examples.sh

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Fixed sig classifier name

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Create add_dataset.rst (#970)

* Create add_dataset.rst

Add a section in the developers' guide for contributing new datasets to sktime

* add to developer's guide

* Cleanup metric docstrings and fix bug in _RelativeLossMixin (#999)

* Fix scaled metric docs

* cleanup metric function docstrings

* cleanup metric class docstrings

* Additional metric fun docstring fix

* Make test case use pd.Series

* mean_asymmetric_error examples

* relative_loss examples

* Add metric class examples

* Add Croston's method (#730)

* Add Croston's method

* Implement interface for Croston's Method

* Inherit from _SktimeForecaster

* Shift training loop to fit()

* Adds alpha parameter to predict()

* bug fix

* 1. Add unit test to check against R-package, 2. Add PBS dataset

* remove redundant file

* remove smoothing parameter argument from predict() and change default value to 0.1

* tests smoothing parameters in test_croston.py

* add dataset name to setup.py

* add dataset name to setup.py

* updates docstrings

* add suggested changes

* adds reviewed changes

* bugfix: side effects in Multivariate-Detrending (#1077)

Fixes #1042, removes side effects

* Clustering extension templates, docstrings & get_fitted_params (#1100)

* cleaned up the docstrings in clusterer base class
* extension template for clusterers
* added abstract `get_fitted_params` in the clusterer base class

* added mloning and aiwalter as forecasting/base code owners (#1108)

* Forecasting: base/template docstring fixes, added fit_predict method (#1109)

Some small changes to the forecasting base class and template:

* updates to docstrings, clarifications
* added `fit_predict` method which does what one would "obviously" expect, fitting & forecast in one step.

* Add series annotation and PyOD adapter (#1021)

* Added Annotation Base Class

* Added tests for Base Annotator.

* Added Annotation test utils and MockAnnotator

* Added pytest parametrize for mockannotation test.

* Fixed *args error for _make_args.

* Added super constructor call from MockAnnotator

* raw notes from Franz

* annotation base class updated

* some checks; internal fit/predict assume list

* mock annotator updated to deal with lists

* removed funny printout since it will appear in testing

* removed requirement that annotation dtype is bool, string or int - could be float or anything

* linting

* mock doc

* split annotator in panel and stream

* two versions of mock, panel and stream

* module exports

* tests

* linting

* checks

* BaseStreamAnnotator input checks

* fixed broken export

* allow non-bool tags

* added annotator tags to valid tags

* removed accidental self in self.method calls

* removed panel annotator, renamed stream to series

* linting & stream->series in test string

* renamed one forgotten stream->series

* transformer alias; commented out tags

* added setting for pyod adaptor

* reformatting _config imports

* added pyod soft dependency

* add pyod in build reqs

* bug in init check

* pyod adapter: conversion in the "one feature" case

* typo

* condition should be for dim 1 arrays, actually

* reshape was wrong way round (1 sample not 1 dim)

* renamed pyodannotator

* pyod adaptor estimator is now cloned

* added aliasing for X/Z in annotators for transformer use

* removed Y from transform and aliasing from predict

* fixed docstrings

* fixed eof

* changed ref args to kwargs

* decorator

* init change was not saved

* removed transformer stuff, was not a good idea

* fixed call to unfitted estimator

* added annotation to registry, added tags (commented out)

* Suggestions for PR #1021 (#1093)

* finishing base class

* remove mock annotator

* Fix docstrings

* Fix imports

* Fix updating

* Fix imports

* Fix tags

Co-authored-by: Satya Pattnaik <satyapattnaik76@gmail.com>
Co-authored-by: satya-pattnaik <satyapattnaik76>
Co-authored-by: Satya Pattnaik <satyapattnaik@Satyas-MacBook-Air.local>
Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Add seeding to Minirocket classifier's _fit_multi (#1094)

Co-authored-by: Franz Király <f.kiraly@ucl.ac.uk>
Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* Make OptionalPassthrough support multivariate (#1112)

Co-authored-by: Walter <walmar2@emea.corpdir.net>
Co-authored-by: Franz Király <f.kiraly@ucl.ac.uk>

* using proper interface point _all_tags for self-inspection (#1068)

Replaces external interface point `_has_tags` in forecasters inspecting their own tags with the already existing internal interface point `_all_tags`, in two places: the `_set_fh` in `BaseForecaster` and the `TransformedTargetForecaster`.

* ForecastingPipeline for pipelining with exog data (#967)

* Added ForecastingPipeline

* Added ForecastingPipeline

* Fix inverse_transform

* Added tests for ForecastingPipeline

* Removed transform and inverse_transform in ForecastingPipeline

* Added transform and inverse_transform to ForecastingPipeline, but allow only DataFrame

* Fix test, fix docstring example

* Fix test

* Update to new forecasting interface

* Update to new forecasting interface

* Update to new forecasting interface

* Making X mandatory arg for ForecastingPipeline

* Minor fix

* Fix all forecaster tests

* Revert changes in test_all_forecasters.py

* Added NotImplementedError for missing X

* Added NotImplementedError to _predict and _update

* black reformat

* Added passthrough if X is None for ForecastingPipeline

* Added Xt and yt variables

* Fix Xt used before defined issue

* Added _copy function to copy data

* Added _copy function to copy data

* Revert changes in test_pipeline.py

* Added codeownership

* Added ForecastingPipeline to docs

* Removed .copy() of input

* Removed unused function

Co-authored-by: Walter <walmar2@emea.corpdir.net>

* Fix broken _has_tag import in ForecastingPipeline from #967

* forecasting refactor: removing _SktimeForecaster and horizon mixins (#1088)

* removed _sktimeforecaster and horizon mixins

* removing _SktimeForecaster import from _HeterogenousEnsembleForecaster

* needs baseforecaster instead

* replaced mixins and sktimeforecaster in tests

* linting

* changed mloning to github alias

* changing grid search to safe value

* Suggestions for PR #1088 (#1092)

* add attribute delegation

* delegate to internal methods

* Fix Croston forecaster

Co-authored-by: Markus Löning <markus.loning@gmail.com>
Co-authored-by: mloning <markus.loning.17@ucl.ac.uk>

* Fix use of seasonal periodicity in naive model with mean strategy (from PR #917) (#1124)

* BUG GH907: Fix by prepending rather than appending NaN values

* TST GH907: Add tests

* DOC GH907: Add contributor

Co-authored-by: F.N. Claessen <felix@seita.nl>

* TSC base template refactor (#1026)

Refactors the time series classification base template according to the `fit`/`_fit` design in #993.
Adds an extension template for time series classifiers.

* Added orbit as related software (#1128)

* Pairwise transformers, kernels/distances on tabular data and panel data - base class, examples, extension templates (#1071)

This PR contains:

* base classes for a new scitype: pairwise transformers, i.e., distance matrix makers and kernel matrix makers; there are two base classes for two different inputs, tabular (classical distances/kernels) and time series panels (e.g., time warping distances)
* an example tabular distance `ScipyDist` that interfaces the `scipy` distances
* an example panel distance `AggrDist` that uses any tabular distance to create an aggregate sample distance between time series, e.g., the mean Euclidean distance
* extension templates for both scitypes
* basic tests for the general scitype and the concrete estimators added
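
The idea behind an aggregate panel distance can be sketched in plain Python. This is one reading of the description above, not the `AggrDist` implementation: a series is assumed to be a list of time points (tuples of variables), and the function names are hypothetical.

```python
import math
from statistics import mean


def euclidean(a, b):
    """Tabular distance between two time points (tuples of variables)."""
    return math.dist(a, b)


def aggregate_distance(series_a, series_b, tabular_dist=euclidean, aggfunc=mean):
    """Aggregate a tabular distance over all pairs of time points of two series."""
    return aggfunc(
        [tabular_dist(row_a, row_b) for row_a in series_a for row_b in series_b]
    )


def distance_matrix(panel_a, panel_b):
    """Pairwise distance matrix between two panels (lists of series)."""
    return [[aggregate_distance(a, b) for b in panel_b] for a in panel_a]


series = [(0.0,), (1.0,)]
matrix = distance_matrix([series], [series])
```

Note that with the mean as aggregation function, the distance of a series to itself is generally not zero, since all cross pairs of time points contribute.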

* release 0.7.0 version up and changelog update

* Enhancement on RISE (#975)

* trial experiment with my settings

* Added writing dataframe as ts file functionality
Added unit testing for writing dataframe

* Address pull request #438 problems

* Follow up to issue #447

add Bibtex reference to the missing classifiers
fix some minor syntax issues in the existing Bibtex
fix indent of some existing references
add link under the references

* add power of 2 padding and slice on ps
allow to control pad function
add sign to allow for inverse fft

* trial for experiment
change interval selection method and implementation
ps transform now round to the nearest power of 2

* correct version for experiments running

* fixed interval length calculation
added 2d version of interval generation
added checks for min and max interval values
remove the use of entire interval at first
make ps use n as the fft length with n = ps_len*2
change order of concatenation of transformed_x
added numba to round and acf functions
added vectorized acf and simplified original acf

* added 2d version of acf and turn ps to 2d

* add fft padding in ps function and changed interval selection method to RISE

* pr changes

* pr changes correct format

* pr changes code quality

* more pr changes code quality

* tidy up experiments.py

* add compatibility for older version of numpy

* update rise unit test expected probability distributions

* add private variables for min and max interval

Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* Uploading new Proximity forest version (#733)

* Uploading my Proximity forest version

* Fixing code quality errors

* Moving my implementation to contrib directory

As crew members suggested, I fixed some errors and put the suggested classifier in the contrib/distance_based file. 

My research suggests that the main reason my implementation tends to take less time than the original one is its use of the 'dtaidistance' package, which makes the distance calculations faster.

* Reverting to original Proximity Forest 

My implementation has been moved to `contrib/distance_based` directory

* fixing some code quality errors

* fixing some code quality

* fixing code quality errors

* Updating Proximity Forest Alternative

* Fixing code quality errors

Co-authored-by: Matthew Middlehurst <pfm15hbu@uea.ac.uk>
Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* Dictionary based improvements (#1084)

* dictionary refactor start

* both igb for now

* repeat word fix

* tde uses my igb and new gp

* define bits used in init

* trying old dft. this is going to be slower

* typed dict stuff for if future reference

* typed dict and test updates

* linting and tde oob

* code quality

* stray print

* mrseql test

* mrseql test

* tde train estimate revert and prediction saving

* notebook run

* fix long wordlength and alphabet sizes. reformat sfa

* automatic switch of DFT type in sfa and cboss/tde cleanup

* pr suggestions

* sfa tests

* SFA. to self.

Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* Update release script (#1135)

* rework of installation guidelines (#1103)

[DOC] rework of installation guidelines

* Update README  (#1024)

Updates to the GitHub `README` landing page:

* updated abstract; refers to community activities too
* index at the start of the page
* tabular overview of supported learning tasks and non-estimator objects; links to relevant artefacts such as tutorials, documentation, etc
* clearer vision statement
* extended new contributors section to be more clear on how to contact us or to contribute

On the technical side, this changes the `README` from `rst` to `md` (for full emoji support 😃 )

* added link to distances extension template on README

* revert change in distances -> kernels

* fixed distance/kernels link on landing page

* coherent interface for data_processing (panel conversions) module (#1061)

This PR introduces a clean interface for the panel data types conversions module.

This is fully downwards compatible, but adds the `convert(from, to, scitype)(what)` syntax on top of the `utils.data_processing` module.

This should provide users with an easy way to convert their data into the `sktime` required format for TSC without having to sift through the code to find the right converter utility.

A number of conversions are missing, and some are lossy, so this may be a good start for future "good first issues".

* Add roadmap from dev days workshop (#1145)

Updates the roadmap based on the discussion from the 2021 sktime dev days.

* Fix pydocstyle config (#1149)

* adding example in docstring of KNeighborsTimeSeriesClassifier (#1155)

In the docstring of `KNeighborsTimeSeriesClassifier`: added example and modified description of `weights` in `Parameters`.

* BaseObject and rework of tags system, including dynamic tags (#1134)

Rework of PR #1099 which is in turn a rework of PR #891 (by @RNKuhns), based on discussion recorded in #981 on tags.

Adds a `BaseObject` class which adds dynamic tag functionality. `BaseObject` contains `BaseEstimator` functionality sans `fit` state change related interfaces.
The new functionality replaces the inconsistent mixture of old interface points (`all_tags`, `has_tag`) in all downstream dependencies.

post-PR, tags work like this:
* each class will have "class tags", stored in the content of the class variable `_tags: dict`. Individual class tags are considered to override each other as per normal nested inheritance. E.g., if a parent class and the class have different values for a class tag, the younger class' class tag value counts.
* each object can, in addition, have dynamic tags, in the object variable `_tags_dynamic: dict`. These are expected to be initialized in the constructor, for instance in cases where behaviour described by tags depends on components or hyper-parameters. Dynamic tags are always considered to override the class tags - e.g., if the object's class or a parent class have a static tag of the same name, the dynamic tag's value overrides any class tag's value.

The new tag-related methods, which make use of this inheritance logic, are as follows.

class methods:
* `get_class_tags()` collects all class tags and their values and returns them as a `dict` (keys = tag names, values = tag values). It overrides values precisely as described above, in the return dictionary. This replaces the old `_all_tags`.
* `get_class_tag(tag_name, tag_value_default=None)` retrieves the value of an individual tag with name `tag_name`, and optionally allows to specify a default value if the tag is not found. The inheritance/override logic is the same as in `get_class_tags`.

object methods:
* `get_tags()` collects all class tags and dynamic tags and their values and returns them as a `dict` (keys = tag names, values = tag values). It overrides values precisely as described above, in the return dictionary.
* `get_tag(tag_name, tag_value_default=None)` retrieves the value of an individual tag with name `tag_name`, and optionally allows to specify a default value if the tag is not found. The inheritance/override logic is the same as in `get_tags`.
* `set_tags(**tag_dict)` sets dynamic tags according to keys (=tag names) and values (= tag values) in the argument dictionary. It is expected to be used in constructors typically.
* `mirror_tags(estimator, tag_set=None)` is a shorthand version of `set_tags`, where it sets tags according to their `get_tag` values in `estimator`. The tags that are mirrored/copied in this way can be restricted by specifying a list of tag names in `tag_set`, otherwise all tags are copied over. The mirroring is by-reference and not by-value (if the type allows this), but tags are not expected to be mutated.
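
The override logic described above can be illustrated with a small stand-alone sketch. It is not the `BaseObject` code: the class `TaggedBase`, the subclasses, and the tag values are hypothetical, and only the method names follow the description.

```python
class TaggedBase:
    """Toy stand-in for a base class with class tags and dynamic tags."""

    _tags = {}

    def __init__(self):
        self._tags_dynamic = {}

    @classmethod
    def get_class_tags(cls):
        collected = {}
        # walk from the oldest ancestor to the class itself,
        # so that younger classes override older ones
        for parent in reversed(cls.__mro__):
            collected.update(vars(parent).get("_tags", {}))
        return collected

    @classmethod
    def get_class_tag(cls, tag_name, tag_value_default=None):
        return cls.get_class_tags().get(tag_name, tag_value_default)

    def get_tags(self):
        tags = self.get_class_tags()
        # dynamic tags always override class tags
        tags.update(self._tags_dynamic)
        return tags

    def get_tag(self, tag_name, tag_value_default=None):
        return self.get_tags().get(tag_name, tag_value_default)

    def set_tags(self, **tag_dict):
        self._tags_dynamic.update(tag_dict)
        return self


class Parent(TaggedBase):
    _tags = {"univariate-only": True, "requires-fh-in-fit": True}


class Child(Parent):
    _tags = {"univariate-only": False}


child = Child().set_tags(**{"requires-fh-in-fit": False})
```

Here `Child.get_class_tags()` combines the tags of `Parent` and `Child`, while `child.get_tag("requires-fh-in-fit")` returns the dynamic value set on the instance.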

* add examples in docstrings in classification module (#1164)

* Refine the Docstrings for BOSS Classifiers (#1166)

* Update docstring style

* Refined docstring

* Update See Also and Examples

* Fix examples and add missing docstring info

* Push changes to cBOSS docstrings

* Add annotation ext template (#1151)

* Add annotation ext template

* Minor fixes

* Fix docstrings

* unit test for absence of side effects in estimator methods (#1078)

This implements #1073, a test for estimator methods not having side effects.

New method is `check_methods_have_no_side_effects` in `estimator_checks`.

It also adds a utility function `deep_equals` for tests, which tests for equality of deep list/tuple constructs containing pandas, numpy, and/or primitives.
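
A reduced sketch of such a deep equality check, assuming only nested lists/tuples and primitives (the pandas and numpy cases mentioned above are deliberately left out, and `deep_equals_sketch` is a hypothetical name):

```python
def deep_equals_sketch(x, y):
    """Return True if x and y have the same nested structure and equal leaves."""
    # a list never equals a tuple, even with the same content
    if type(x) is not type(y):
        return False
    if isinstance(x, (list, tuple)):
        return len(x) == len(y) and all(
            deep_equals_sketch(a, b) for a, b in zip(x, y)
        )
    # primitives: fall back to ordinary equality
    return x == y


same = deep_equals_sketch([1, (2, "a")], [1, (2, "a")])
different = deep_equals_sketch([1, (2,)], [1, [2]])
```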

* ForecastingHorizon is_relative detection on construction from index type (#1169)

This PR makes minor changes to `ForecastingHorizon` construction behaviour, to address the user frustration highlighted by @Flix6x in #1167. It also fixes all the docstrings.

Previously, regarding `ForecastingHorizon` construction behaviour:
* the default `is_relative` in construction of `ForecastingHorizon` was `True`, which caused an error to be thrown when constructing a `ForecastingHorizon` with supported absolute index types that were not also relative types, like `pd.DatetimeIndex`.
* when an index of such a type was passed to the `fh` argument of `fit` or `predict`, the same error would be thrown, since all indices were converted to `ForecastingHorizon` with the default setting `is_relative=True`.

The new construction behaviour is as follows:
* the constructor allows the `is_relative` argument to be `None` - in this case the value of the `is_relative` argument is inferred: if the index is a supported relative index, it's `is_relative=True`. Otherwise, if it's a supported absolute index, it's `is_relative=False`.
* This covers the case of `pd.DatetimeIndex`, which now can be passed to `fit` and `predict` without the estimator complaining.
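
The inference rule can be sketched without pandas. The helper below is hypothetical and assumes integers and `timedelta` values stand in for relative index types, and `datetime` values for absolute ones:

```python
from datetime import datetime, timedelta


def infer_is_relative(values, is_relative=None):
    """Infer is_relative from the value types unless it is given explicitly."""
    if is_relative is not None:
        return is_relative
    # integer steps or time offsets are treated as relative to the cutoff
    if all(
        isinstance(v, (int, timedelta)) and not isinstance(v, bool) for v in values
    ):
        return True
    # concrete time stamps are treated as absolute
    if all(isinstance(v, datetime) for v in values):
        return False
    raise TypeError("unsupported forecasting horizon values")


relative = infer_is_relative([1, 2, 3])
absolute = infer_is_relative([datetime(2021, 1, 1), datetime(2021, 1, 2)])
```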

* bugfix set_tags (#1179)

Bugfix to `BaseObject.clone_tags`: `set_tags` was called incorrectly from `clone_tags` without the kwargs symbol.

* Add estimator overview to docs (#1138)

* Reformatted done

* Added learning algorithm creating dynamic estimator overview with sphinx generated docs

* Added html files

* fixed minor changes

* Updated algo_list.py

* Transferred to docs file and updated them

* Refactoring files

* Refactoring files

* updated the extension to estimator_overview_table.html

* possible solution

* update dependencies

* update dependencies

* update dependencies

* update dependencies

* more changes

* Add JavaScript and HTML to filter estimator overview (#1153)

* adds a searchable estimator overview table in markdown 
* the table is generated automatically every time the docs are built based on the source code
* minor updates to the docs deployment

* Fix docstrings

Co-authored-by: afzal442 <afzal442@gmail.com>
Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>
Co-authored-by: Juan Luis Cano Rodríguez <hello@juanlu.space>

* Add ExponentTransformer and SqrtTransformer (#1127)

* Added SqrtTransformer to docs

* Added sqrt.py to CODEOWNERS

* Refactored to PowerTransformer

* Fixed SqrtTransformer __init__

* Unit tests for raised errors

* Fixed SqrtTransformer docstring

* Renamed PowerTransformer to ExponentTransformer

* Add funding (#1173)

* add funding

* Change wording

* remove mloning

* fix wording

* Forecasting support for multivariate y and multiple input/output types - working prototype (#980)

This PR introduces multiple input/output type support for `X` and `y` in forecasters, including generic support for multivariate `y`, i.e., `pd.DataFrame`, and the possibility to pass `np.ndarray` and `pd.Series` to either argument. This is loosely based on the design in [STEP no.5](https://github.com/sktime/enhancement-proposals/tree/main/steps/05_scitype_based_IO_checks) and subsequent discussions with @mloning around #1074.

The key ingredients are:

* converters parameterized by from, to, and as - in the `convertIO` module. Besides the obvious conversion functionality, the converters can be given access to a dictionary via reference in the `store` argument, where information for inverting lossy conversions (like from `pd.DataFrame` to `np.array`) can be stored
* a new tag `y:scitype` which can be `"univariate"`, `"multivariate"`, or `"both"`, indicating what type of `y` are supported (multivariate here means 2 dims or more)
* new tags which encode the type of `y` and `X` that the private `_fit`, `_predict`, and `_update` assume internally - for now, it's just one type and not a list of compatible types
* some logic in the public "plumbing" area of `fit`, `predict`, `update`, which converts inputs to the public layer to the desired input of the logic layer and back
* expanding tests and checks that ensure that errors are raised when the wrong types are passed, and changes that ensure the newly allowed inputs such as `pd.DataFrame` are allowed rather than blocked by the checks
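
The role of the `store` argument can be illustrated with a plain-Python sketch. The function `convert_sketch` and the type names `"dict_of_columns"` and `"list_of_rows"` are assumptions for illustration and are not the `convertIO` API:

```python
def convert_sketch(obj, from_type, to_type, store=None):
    """Convert between two toy table formats, using store to invert lossy steps."""
    if from_type == to_type:
        return obj
    if (from_type, to_type) == ("dict_of_columns", "list_of_rows"):
        names = list(obj)
        # column names are lost in the row format, so remember them
        if store is not None:
            store["columns"] = names
        return [list(row) for row in zip(*(obj[name] for name in names))]
    if (from_type, to_type) == ("list_of_rows", "dict_of_columns"):
        n_cols = len(obj[0])
        # restore remembered names if available, else use integer positions
        if store and "columns" in store:
            names = store["columns"]
        else:
            names = list(range(n_cols))
        return {name: [row[i] for row in obj] for i, name in enumerate(names)}
    raise NotImplementedError(f"no converter from {from_type} to {to_type}")


store = {}
table = {"a": [1, 2], "b": [3, 4]}
rows = convert_sketch(table, "dict_of_columns", "list_of_rows", store=store)
restored = convert_sketch(rows, "list_of_rows", "dict_of_columns", store=store)
```

Without the shared `store`, the round trip would return integer column names instead of the original ones.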

* forgot X_inner_mtype in extension template

* bugfix - pd.DataFrame->pd.Series conversion missing (#1187)

The pd.DataFrame->pd.Series conversion was incorrectly missing from `_convertIO` (from #980), this is fixed now.

Merging quickly since urgent bugfix

* Minor update to See Also of BOSS Docstrings (#1172)

* Update docstring style

* Refined docstring

* Update See Also and Examples

* Fix examples and add missing docstring info

* Push changes to cBOSS docstrings

* Fix See Also links in docstring

Co-authored-by: Matthew Middlehurst <pfm15hbu@uea.ac.uk>

* STSF test fix (#1170)

* stsf fix

* docs

Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* change to classification experiments (#1137)

remove the need for cython to run classification_experiments, set up example problem from datasets/data/UnitTest

* Add annotation to docs (#1156)

* Complete docstring of EnsembleForecaster  (#1165)

* Refactored class docstr, added example

* Deleted blank lines after fn docstrings

* Fixed docstring example

* bugfix np.ndarray type typo in convertio (#1191)

* Registry lookup - all_estimators refactor, and new all_tags (#1196)

This PR consolidates registry lookup in the `registry` module:

* `all_estimators` is moved to the `registry`, with all dependencies referencing the new location
* `all_estimators` has a new argument `filter_tags`, which allows to filter by tags (incl scitypes, capabilities, etc); order of arguments has been changed so filtering is first, output format args last
* new lookup function `all_tags`, which allows to lookup tags, possibly filtered by estimator type
* both lookups have a new argument `as_dataframe`, which allows for pretty printing in jupyter notebooks, useful for the standard user archetype
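
A rough sketch of filtering a registry by tags, in the spirit of the `filter_tags` argument. The registry content, the estimator names, the tag `capability:multivariate`, and the function `all_estimators_sketch` are all hypothetical; the real `all_estimators` signature may differ:

```python
# Hypothetical registry: estimator name -> (estimator type, tags).
REGISTRY_SKETCH = {
    "NaiveSketch": ("forecaster", {"capability:pred_int": False}),
    "ThetaSketch": ("forecaster", {"capability:pred_int": True}),
    "KnnSketch": ("classifier", {"capability:multivariate": False}),
}


def all_estimators_sketch(estimator_types=None, filter_tags=None):
    """Return sorted estimator names matching the type and all tag filters."""
    if isinstance(estimator_types, str):
        estimator_types = [estimator_types]
    filter_tags = filter_tags or {}
    names = []
    for name, (est_type, tags) in sorted(REGISTRY_SKETCH.items()):
        if estimator_types is not None and est_type not in estimator_types:
            continue
        # keep only estimators whose tags match every requested value
        if all(tags.get(tag) == value for tag, value in filter_tags.items()):
            names.append(name)
    return names


with_intervals = all_estimators_sketch(
    "forecaster", filter_tags={"capability:pred_int": True}
)
```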

* Transform classifiers (#1180)

* transform classifiers p1

* transform classifiers p2

* transform classifiers notebooks

* signature config fix

* dependencies, api and test docs

* feature based docs p1

* mp fix

* tsfresh docs

* docs cont.

* mp test config fix

* more doc fixes

* it has to be one 1 line apparently?

* pycharm editing some random imports

* reintroduce deprecated files. matrix profile estimator change

* soft dependencies and matrix profile config

* explicit deprecation comments

* default estimator fix

* cant even get 0.9

* [DOC] Adding table of content in forecasting tutorial (#1200)

* Data types module, collating conversions, mtype inference, checks, register (#1201)

This PR collates data type and conversion related concerns in its own module, `datatypes`.

The functionality moved to `datatypes`:

* series conversion functionality formerly in `forecasting.base.convertIO`
* panel conversion functionality formerly in `utils.data_processing._panel`

All references to the old locations have been replaced by references to the new locations.

New features enabled by this move:

* there is just a single `convert` function, for both `Series` and `Panel` scitype
* nested, extensible module structure which is by scitype, e.g., `panel` and `series`

New functionality, partially implemented for extension:

* register of mtypes with explanation, to be complemented by docs
* central fixture generation function `get_example` which produces fixtures of the same "scientific content" but a specific mtype, for external tests, and bulk testing of conversion or mtype inference functionality

* Format setup files (#1236)

* Set up fails the tests

* Change strings from '' to ""

* setup docstring, since Im here

* docs

* god I hate docstrings

* Update setup.py

* removed format check from index test (#1193)

This addresses problem no.3 in #1192 - removing a data frame format check from the test on "indices are equal" which is off-topic in the test, and causes all combined uni/multivariate forecasters (with `scitype:y = both`) to fail.

* prediction interval NotImplementedError moved to BaseForecaster (#1195)

This addresses problem no.1 in #1192, by:

* moving `NotImplementedError` raised for prediction intervals to `BaseForecaster`, dependent on a new boolean capability tag `"capability:pred_int"`
* adding the tag with value `True` to all forecasters which have the capability to return prediction intervals; added it in `clone_tags` of tuning
* updating the test `test_predict_pred_interval` so it checks both for whether prediction intervals are returned, or `NotImplementedError` is raised, explicitly dependent on `"capability:pred_int"`
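
The pattern of raising in the base class, dependent on a capability tag, might look roughly like this. The classes are hypothetical toys and the `predict` signature is an assumption, not the `BaseForecaster` code:

```python
class BaseForecasterSketch:
    """Toy base class: the public predict checks the capability tag."""

    _tags = {"capability:pred_int": False}

    def predict(self, fh, return_pred_int=False):
        # the check lives in the base class, not in each concrete forecaster
        if return_pred_int and not self._tags["capability:pred_int"]:
            raise NotImplementedError(
                f"{type(self).__name__} does not support prediction intervals"
            )
        return self._predict(fh, return_pred_int=return_pred_int)


class PointOnly(BaseForecasterSketch):
    def _predict(self, fh, return_pred_int=False):
        return [0.0 for _ in fh]


class WithIntervals(BaseForecasterSketch):
    _tags = {"capability:pred_int": True}

    def _predict(self, fh, return_pred_int=False):
        y_pred = [0.0 for _ in fh]
        if return_pred_int:
            # dummy intervals of fixed width around the point forecast
            return y_pred, [(y - 1.0, y + 1.0) for y in y_pred]
        return y_pred


point_forecast = PointOnly().predict([1, 2])
forecast, intervals = WithIntervals().predict([1], return_pred_int=True)
```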

* Clustering experiments (#1221)

* minor change to tidy up file writing format

* format  1

* format 2

* format 3

* format 4

* more on write_results_to_uea_format

* format 4

* experiments redesign

* moving clustering experiments out of contrib

* formatting 1

* format 2

* format 3

* format 4

* format 5

feel my pain

* Format 5

* docstrings 1

* docstring 2

* docstring 3

* docstring 5-ish

* docstring 5

* tidy up load_and_run docstring

* import error in elastic_ensembl_from_file

* try a rebuild

* make some methods "private"

* Update basic_benchmarking.py

* move stratified sampling into new file "sampling"

* docstrings

* benchmarking docstring

* adding unit tests for experiments

* docstrings

* Update test_experiments.py

* update

* Update test_experiments.py

* docstrings, always docstrings

* docstrings

* remove test harness, add Example in docstring

* refactor resampleID

* fix tests

* formatting

* Examples formatting

* remove Experiments to see if it passes

* added a root directory to setup.py to help make loading simpler

* style update for setup.py to pass tests

* Update setup.py

* Update setup.py

* Update experiments.py

* Update test_experiments.py

* Update setup.py

* give up on Experiments for now

* will it ever end? :)

* seems not

* spaces, too many spaces

* Update experiments.py

* Update experiments.py

* correct import set_classifier

* Update experiments.py

* back to docstrings

* docstrings, gotta love em

* Update test_experiments.py

* Update test_experiments.py

* Update test_experiments.py

* leave classification for now, to sort out soft dependencies

* formatting 5 billion

* my IDE disagrees, but nevermind

* It has me doing docstrings for setup.py now ....

* Another can of worms accidentally opened

* docstringsdocstringsdocstrings

* why does pycharm auto indent incorrectly?

* I'm really confident this time ....

* dashed on the cliffs of unused import

* I am merging this bugger as soon as it passes!

* so the linting makes changes that break the setup. Marvellous. Reverting to the original

* Update test_experiments.py

* Update test_experiments.py

* more setup woes

* revert to using data loading functions

relative paths do not work with test functions, nor does the global root

* Update base.py

* Update test_experiments.py

* Update test_experiments.py

* Update test_experiments.py

* Update test_experiments.py

* Update test_experiments.py

* Update test_experiments.py

* Update base.py

* Update base.py

* debug test to see if baked in are being loaded

* Update test_experiments.py

* Update base.py

* Update base.py

* Update base.py

* debug push 2, with ItalyPowerDemand

* Update base.py

* Update base.py

* Update base.py

* Update test_experiments.py

* Update test_experiments.py

* make sure it works without unit tests

* Update test_experiments.py

* Update test_experiments.py

* Update feature_request.md (#1242)

* Update feature_request.md

* Update feature_request.md

* Update feature_request.md

* Multivariate moving cutoff formatting (#1213)

This PR upgrades `update_predict` to handle multivariate predictions by allowing `_format_moving_cutoff_predictions` to format lists of data frames.

The return in the multivariate case is a 2D dataframe with column multi-index (column, variable), otherwise in analogy to the univariate case.

This also addresses one of the post-multivariate PR issues from #1192.

* Fix appveyor CI (#1253)

* Data io (#1248)

* remove direct reference to datasets/base

* refactor datasets/base.py -> datasets/data_io.py

* Update add_dataset.rst

* Add UnitTest to the included datasets

* format and docs

* Update test_muse.py

* tedium

* direct distances.base reference in notebook

* pedantry

* no idea on this one

* Update dictionary_based_classification.ipynb

* getting a bit surreal now

* forgot the init for load_unit_test

* correct notebook, make data_io private

* end of file weirdness

* Update interval_based_classification.ipynb

* Update minirocket.ipynb

* Update rocket.ipynb

* end of file nonsense

* Update test_time_series_neighbors.py

* Update test_time_series_neighbors.py

* [DOC] add conda-forge maxdep recipe to installation docs and readme (#1226)

* Update contributors (#1243)

* bug fix in tutorial documentation of univariate time series classification. (#1140)

* Update 02_classification_univariate.ipynb

* Update 02_classification_univariate.ipynb

* Update 02_classification_univariate.ipynb

* quick fix

changing  #renaming _slope to slope. -> # renaming _slope to slope.

#replacing ' from "

Co-authored-by: Matthew Middlehurst <pfm15hbu@uea.ac.uk>

* removing tests for data downloader dependent on third party website, change in test dataset for test_time_series_neighbors (#1258)

* tests

* Update test_data_loaders.py

* Update test_data_loaders.py

* purely to rerun tests

* Added n_best_forecasters to grid searches (#1139)

* Added n_best_forecasters to grid searches

* Fix docstring example

* Fix cv_results

* Added n_best_scores to grid search

* Added n_best_scores to grid search

* Added n_best_scores to grid search

* Fix pydocstyle

* Fix pydocstyle.

* Fix docstring

Co-authored-by: Walter <walmar2@emea.corpdir.net>
Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Classification experiments (#1260)

* refactor classification experiments

* remove testing files

* Update test_experiments.py

* docs

* docs

* Update classification_experiments.py

* Update experiments.py

* Update experiments.py

* Delete basic_benchmarking.py

* Update classification_experiments.py

* classification experiments fixes and documentation

* text fix

Co-authored-by: Matthew Middlehurst <pfm15hbu@uea.ac.uk>

* [DOC] Troubleshooting for C compiler failure on pytest (#1262)

* Adding TrendForecaster in forecasting/trend.py (#1209)

introduces a class named `TrendForecaster` that doesn't have the polynomial features in the pipeline, while leaving the `PolynomialTrendForecaster` unchanged.

* Fixed Binder Dockerfile (#1266)

* [DOC] adding binder link to readme (landing page) (#1282)

* docstring fix for distances/series extension templates (#1256)

* Add binder badge to README.md (#1285)

* Add ColumnEnsemble Forecaster (#1082)

* Created empty forecasting/column_ensemble.py file

* Added first draft and output shape tests

* Added aggfunc parameter, _aggregate function

* Added input data and parameter checks

* Added input checks, estimator test parameters to _config.py

* Fixed typos in docstrings

* Fixed docstrings after doc-quality check

* Added negative tests and different aggfunc test

* Fixed docstr example

* Deleted overridden  and aggregation functionality

* Updated y_pred.index, deleted aggregation functions tests

* Reverted files from merge conflict

* Added internal _forecasters list

* Updated the docstring for _forecasters

* Removed get and set_params, updated docstr

* Updated example dataset

* Added back get_params and set_params

* Update docs, codeowners; returned df has the same column names

* Move index checks to _check_forecasters

* Updated tag and author strings

* set dynamic requires-fh-in-fit tag

* Added slack and google calendar to README (#1283)

* Added slack and google calendar

* Fix merge conflict in README

* Update README.md

Co-authored-by: Walter <walmar2@emea.corpdir.net>

* Fix minor silent issues in TransformedTargetForecaster (#845)

* Fix minor silent issues

* Apply inverse transform to pred_int

* Fix pydocstyle

Co-authored-by: Walter <walmar2@emea.corpdir.net>

* Add ColumnwiseTransformer (multivariate compositor for series-to-series transformer) (#1044)

* first draft of multivariate compositor

* change syntax

* added transformer to test_config; introduced multivariate-only-tag; adjusted tests_all_transformers.py to handle multivariate-only data; minor improvements to multivariate_compositor.py

* remove apply function for now; next step: add functionality for transformer.update()

* Update sktime/transformations/series/multivariate_compositor.py

Co-authored-by: Martin Walter <mf-walter@web.de>

* Update sktime/transformations/series/multivariate_compositor.py

Co-authored-by: Martin Walter <mf-walter@web.de>

* Update sktime/transformations/series/multivariate_compositor.py

Co-authored-by: Martin Walter <mf-walter@web.de>

* revert tests, as we also accept univariate series

* make transformer accept pd.Series; implement suggestions from aiwalter, add myself to list of contributors

* reformat test_config.py

* Update sktime/transformations/series/compose.py

Co-authored-by: Martin Walter <mf-walter@web.de>

* improve docstrings

* Update compose.py

* add example and decorator

* reformat config.py

* Update sktime/transformations/series/compose.py

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Update sktime/transformations/series/compose.py

Co-authored-by: Markus Löning <markus.loning@gmail.com>

* make additional _check_columns and _revert_to_series function; get rid of if/else for multivariate/univariate; add check for whether list is passed; rename to ColumnwiseTransformer

* add transformer to api_reference

* remove space

Co-authored-by: Martin Walter <mf-walter@web.de>

* directly import load_longley

Co-authored-by: Martin Walter <mf-walter@web.de>

* remove empty line

Co-authored-by: Martin Walter <mf-walter@web.de>

* minor change to OptionalPassthrough

Co-authored-by: Martin Walter <mf-walter@web.de>

* directly import load_longley

Co-authored-by: Martin Walter <mf-walter@web.de>

* make docstrings pydocstyle compliant, move _attributes to fit

* z_name cannot be an attribute of the transformer as it changes in transform

* make functions standalone, change z[0] to z.squeeze(columns), add helper function to test whether it's a univariate series

Co-authored-by: Martin Walter <mf-walter@web.de>
Co-authored-by: Markus Löning <markus.loning@gmail.com>

* Update documentation backend and reduce warnings in doc creation (#1199) (#1205)

Update in documentation back-end, all docstrings fixed for pydocstyle.

#### Reference issues/PR
Fixes #1181 
Fixes #1230 
Partly addresses #1152 

#### What does this implement/fix? Explain your changes.
* replaces `readthedocs-theme` with `pydata-sphinx-theme`
* restructures content in line with new theme
* adds new getting started page
* fixes autodoc display of summaries of classes and functions
* fixes sphinx build and warnings
* fixes docstrings and numpydoc warnings
* adds tutorial notebook thumbnail gallery
* adds links to Twitter, Discord and GitHub
* fixes wrong auto-generation directory in sphinx build
* updates version of sphinx and related packages
* integrates `myst-parser` replacing `m2r2`
* updates `.gitignore` to exclude auto-generated documentation files
* adds glossary
* removes full module path from page titles in API reference
* removes module names from table of contents in API reference using more readable names (e.g. instead of "sktime.forecasting" we now simply use "Forecasting")
* adds page for adding estimators using extension templates
* adds page for enhancement proposals

#### Does your contribution introduce a new dependency? If yes, which one?
For generating docs only: 
* `pydata-sphinx-theme` (replaces `readthedocs` theme)
* `myst-parser` (replaces `m2r2`)
* `sphinx-gallery`
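
For orientation, the theme and parser swap described above could look roughly like the following `conf.py` fragment. This is a minimal sketch, not the project's actual configuration: the extension list and the `source_suffix` mapping are assumptions chosen for illustration, while `pydata_sphinx_theme` and `myst_parser` are the module names those two packages register with Sphinx.

```python
# Illustrative Sphinx conf.py fragment (assumed layout, not sktime's real file).

# Extensions loaded by Sphinx. "myst_parser" takes over Markdown parsing
# from m2r2; the remaining entries are assumed for this sketch.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autosummary",
    "numpydoc",
    "myst_parser",
]

# pydata-sphinx-theme replaces the readthedocs theme.
html_theme = "pydata_sphinx_theme"

# Let Sphinx pick the parser by file extension.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```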

* Time series classifiers refactor: rocket classifier (#1239)

#1146 TSC refactor for `ROCKETEstimator`

Refactor `ROCKETEstimator` class according to the [new extension template](https://github.com/alan-turing-institute/sktime/blob/main/extension_templates/classification.py).
- fit and predict turned private
- `capabilities` class attribute renamed to `_tags` class attribute
- capabilities names prefixed with `capability:` prefix
- new tags added to ESTIMATOR_TAG_REGISTER
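
The refactor pattern listed above can be sketched as follows. All class and function names here (`BaseClassifierSketch`, `MajorityClassifierSketch`, `prefix_capabilities`) are hypothetical and are not sktime's API; the tag names are illustrative only. The sketch shows a base class whose public `fit`/`predict` delegate to private `_fit`/`_predict`, and a `_tags` dictionary whose keys carry the `capability:` prefix.

```python
def prefix_capabilities(capabilities, prefix="capability:"):
    """Rename legacy capability keys so each carries the prefix (hypothetical helper)."""
    return {
        key if key.startswith(prefix) else prefix + key: value
        for key, value in capabilities.items()
    }


class BaseClassifierSketch:
    """Hypothetical base class: public methods wrap the private ones."""

    _tags = {"capability:multivariate": False}

    def __init__(self):
        self._is_fitted = False

    def fit(self, X, y):
        # Shared bookkeeping lives here, so subclasses only implement _fit.
        self._fit(X, y)
        self._is_fitted = True
        return self

    def predict(self, X):
        if not self._is_fitted:
            raise RuntimeError("call fit before predict")
        return self._predict(X)


class MajorityClassifierSketch(BaseClassifierSketch):
    """Toy estimator that always predicts the most frequent training label."""

    _tags = prefix_capabilities({"multivariate": True, "missing_values": False})

    def _fit(self, X, y):
        labels = list(y)
        self._majority = max(set(labels), key=labels.count)

    def _predict(self, X):
        return [self._majority for _ in X]
```

Because the base class sets the fitted flag in the public `fit`, a subclass's `_fit` does not need to touch it.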

* Add content to documentation guide for use in docsprint (#1297)

* Tweak doc section in reviewer guide

* Add content to documentation guide

* forecasting: fixing test_y_invalid_type_raises_error for strictly multivariate forecasters (#1286)

* Extend aggregation functionality in EnsembleForecaster (#1190)

* Added _aggregate and _check_aggfunc functions

* Made _check_aggfunc standalone, moved index setting to _predict

* Moved _update to _HeterogenousEnsemble, added weights for average aggfunc

* Added weights to other functions

* Fixed y_pred index

* Added weighted_geometric_mean to aggfuncs

* Added wrapper fns, refactored _aggregate and _check_aggfunc

* Change wrapper fns' signature to resemble numpy's

* Add a simple test

* Added tests for unweighted functions

* Added weighted aggfunc tests, updated wrapper fns

* Updated module docstr, commented out _update in _HeterogenousEnsemble

* _update method back in EnsembleForecaster

* Roll back forecasters parameter in unit tests

* Changed back forecasters parameter in unit tests, moved _update to HeterogenousEnsemble

* Added pytest skip for weighted gmean if python version is <3.7

* Update _ensemble.py

* Updated _update fn

Co-authored-by: Martin Walter <mf-walter@web.de>
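
As a rough illustration of the weighted aggregation mentioned in the commits above, a weighted geometric mean can be computed in log space. This is a standalone sketch using only the standard library; the function name matches the commit message, but the signature and validation rules here are assumptions, not the code merged into `EnsembleForecaster`.

```python
import math


def weighted_geometric_mean(values, weights=None):
    """Weighted geometric mean of positive values (illustrative sketch).

    Computes exp(sum(w_i * log(x_i)) / sum(w_i)). With weights=None every
    value gets equal weight, which gives the ordinary geometric mean.
    """
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    if weights is None:
        weights = [1.0] * len(values)
    weights = list(weights)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total)
```

For example, combining two forecasts of 1 and 16 with weights 3 and 1 gives 16 raised to the power 1/4, which is 2.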

* Critical Difference Diagrams (#1277)

* comment out debugging example

* fix docstrings

Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* Update _data_io.py (#1308)

* Update _data_io.py

* Update _data_io.py

* Update _data_io.py

* Update _data_io.py

* Update _data_io.py

* Tsc_Refactoring _cboss (#1295)

* Converting all public methods in _cboss to private

* Refactoring in base, _cboss, and tags files

* Removed self._is_fitted = True from the private fit method, comma added in _tags file

* Removing duplicate tags

* removing extra space

Co-authored-by: Franz Király <f.kiraly@ucl.ac.uk>
Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>

* Fixed typo in API Reference section header (#1324)

* Adds showcase page for community handlers

* Added the reference into about.rst

Signed-off-by: afzal442 <afzal442@gmail.com>

* Adds sphinx-issues extension to conf.py and modifies showcase file

* modifies conf.py

Signed-off-by: afzal442 <afzal442@gmail.com>

* refactors the files

* updates the branch

* Modifies the link

* Adds short desc and minor changes

Co-authored-by: Matthew Middlehurst <pfm15hbu@uea.ac.uk>
Co-authored-by: Tony Bagnall <ajb@uea.ac.uk>
Co-authored-by: Markus Löning <markus.loning@gmail.com>
Co-authored-by: RNKuhns <RNKuhns@gmail.com>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Martin Walter <mf-walter@web.de>
Co-authored-by: Franz Király <f.kiraly@ucl.ac.uk>
Co-authored-by: Oleksii Kachaiev <kachayev@gmail.com>
Co-authored-by: Oleksii Kachaiev <okachaiev@riotgames.com>
Co-authored-by: mloning <markus.loning.17@ucl.ac.uk>
Co-authored-by: Juliana <jul.ramoos@gmail.com>
Co-authored-by: Drishti Bhasin <56479884+Dbhasin1@users.noreply.github.com>
Co-authored-by: Sakshi Bhasin <sbhasin@sbhasin-ltmo42k.internal.salesforce.com>
Co-authored-by: Marco Edward Gorelli <marcogorelli@protonmail.com>
Co-authored-by: Leonidas Tsaprounis <64217214+ltsaprounis@users.noreply.github.com>
Co-authored-by: Taiwo Owoseni <thayeylolu@users.noreply.github.com>
Co-authored-by: Lovkush <lovkush@gmail.com>
Co-authored-by: Svea Marie Meyer <46671894+SveaMeyer13@users.noreply.github.com>
Co-authored-by: Chris Holder <chrisholder987@hotmail.com>
Co-authored-by: Guzal Bulatova <73598322+GuzalBulatova@users.noreply.github.com>
Co-authored-by: James Morrill <32545677+jambo6@users.noreply.github.com>
Co-authored-by: Riya Elizabeth John <55790848+Riyabelle25@users.noreply.github.com>
Co-authored-by: Satya Pattnaik <satyapattnaik76@gmail.com>
Co-authored-by: Satya Pattnaik <satyapattnaik@Satyas-MacBook-Air.local>
Co-authored-by: Walter <walmar2@emea.corpdir.net>
Co-authored-by: F.N. Claessen <felix@seita.nl>
Co-authored-by: Jason Pong <33785383+whackteachers@users.noreply.github.com>
Co-authored-by: Morad :) <moradabou1996@gmail.com>
Co-authored-by: ltoniazzi <61414566+ltoniazzi@users.noreply.github.com>
Co-authored-by: Juan Luis Cano Rodríguez <hello@juanlu.space>
Co-authored-by: Ahmed Bilal <74570044+bilal-196@users.noreply.github.com>
Co-authored-by: BINAYKUMAR943 <38756834+BINAYKUMAR943@users.noreply.github.com>
Co-authored-by: tensorflow-as-tf <51345718+tensorflow-as-tf@users.noreply.github.com>
Co-authored-by: Corvin Paul <corvin.paul@outlook.com>
Co-authored-by: Viktor Dremov <32140716+victordremov@users.noreply.github.com>
Co-authored-by: AreloTanoh <87912196+AreloTanoh@users.noreply.github.com>