
Rename CVSplit to ValidationSplit #752

Closed
timokau opened this issue Apr 6, 2021 · 5 comments

@timokau
Contributor

timokau commented Apr 6, 2021

Hi! I was a bit surprised by the name of CVSplit. The name suggests that it is responsible for cross-validation, but the documentation reveals that it only trains and validates on a single split. That doesn't match my understanding of cross-validation. It could technically be seen as a special case of cross-validation, but I think it would be easier to understand if it were just called "validation".

This was previously discussed in #539. There are good reasons for the current behavior, and I am not proposing to change it. I do think it would be an improvement to rename the class from CVSplit to ValidationSplit, though. That name would likely lead to fewer surprises.

@BenjaminBossan
Collaborator

BenjaminBossan commented Apr 6, 2021

Hi Timo, I agree that the "CV" part is used in a very loose sense. IIRC, the main reason for the name was that it is compatible with sklearn cross validators. But I agree that the name can be confusing; I would actually go with ValidSplit (or TrainSplit, matching the argument name). As they say, naming things is hard.

If you want to, you could provide a PR to implement this. It would require deprecating the old CVSplit.

@thecaffeinedev
Contributor

I can make a PR changing CVSplit to ValidSplit. We need to add a deprecation warning as well, right?

@BenjaminBossan
Collaborator

Thanks.

Yes, the idea would be to replace CVSplit everywhere in the code base but to leave a dummy CVSplit behind that just returns a ValidSplit. That dummy should emit a deprecation warning, telling the user what to change and when CVSplit will be removed for good (say, 2 versions in the future).
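
Roughly, the dummy could look something like the sketch below (just a sketch; the exact import path, warning text, and removal version are for the PR to decide):

```python
import warnings

from skorch.dataset import ValidSplit  # the new name, same behavior


class CVSplit(ValidSplit):
    """Deprecated alias for ValidSplit, kept for backwards compatibility."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "CVSplit is deprecated and will be removed in a future release; "
            "use ValidSplit instead.",
            DeprecationWarning,
        )
        super().__init__(*args, **kwargs)
```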

@thecaffeinedev
Contributor

Thanks. I have made a PR. Please check it out and let me know if anything is needed, @BenjaminBossan.

@timokau
Contributor Author

timokau commented Oct 23, 2021

Sorry for not getting back to this. Thanks @thecaffeinedev :)

BenjaminBossan added a commit that referenced this issue Oct 31, 2021

We are happy to announce the new skorch 0.11 release:

Two basic but very useful features have been added to our collection of callbacks. First, by setting `load_best=True` on the [`Checkpoint` callback](https://skorch.readthedocs.io/en/latest/callbacks.html#skorch.callbacks.Checkpoint), the snapshot of the network with the best score will be loaded automatically when training ends. Second, we added a callback [`InputShapeSetter`](https://skorch.readthedocs.io/en/latest/callbacks.html#skorch.callbacks.InputShapeSetter) that automatically adjusts your input layer to have the size of your input data (useful e.g. when that size is not known beforehand).
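
Used together, these could look roughly like this (a sketch, assuming `InputShapeSetter`'s default parameter name `input_dim` and a module that accepts it):

```python
import torch
from skorch import NeuralNetClassifier
from skorch.callbacks import Checkpoint, InputShapeSetter


class MyModule(torch.nn.Module):
    """Toy classifier whose input size is filled in by InputShapeSetter."""

    def __init__(self, input_dim=10):
        super().__init__()
        self.dense = torch.nn.Linear(input_dim, 2)
        self.softmax = torch.nn.Softmax(dim=-1)

    def forward(self, X):
        return self.softmax(self.dense(X))


net = NeuralNetClassifier(
    MyModule,
    max_epochs=10,
    callbacks=[
        Checkpoint(load_best=True),  # reload the best snapshot once training ends
        InputShapeSetter(),          # set module__input_dim from the training data
    ],
)
# net.fit(X_train, y_train)  # X_train/y_train: your training data
```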

When it comes to integrations, the [`MlflowLogger`](https://skorch.readthedocs.io/en/latest/callbacks.html#skorch.callbacks.MlflowLogger) now makes it possible to automatically log to [MLflow](https://mlflow.org/). Thanks to a contributor, some regressions in `net.history` have been fixed, and it even runs faster now.
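
Wiring it up could look roughly like this (a sketch with the default `MlflowLogger` arguments, assuming `mlflow` is installed and reusing the `MyModule` from the snippet above):

```python
import mlflow
from skorch import NeuralNetClassifier
from skorch.callbacks import MlflowLogger

# MyModule as defined in the previous snippet
net = NeuralNetClassifier(MyModule, callbacks=[MlflowLogger()])

with mlflow.start_run():       # log epoch metrics to the active MLflow run
    net.fit(X_train, y_train)  # X_train/y_train: your training data
```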

On top of that, skorch now offers a new module, `skorch.probabilistic`. It contains new classes to work with **Gaussian Processes** using the familiar skorch API. This is made possible by the fantastic [GPyTorch](https://github.com/cornellius-gp/gpytorch) library, which skorch uses for this. So if you want to get started with Gaussian Processes in skorch, check out the [documentation](https://skorch.readthedocs.io/en/latest/user/probabilistic.html) and this [notebook](https://nbviewer.org/github/skorch-dev/skorch/blob/master/notebooks/Gaussian_Processes.ipynb). Since we're still learning, it's possible that we will change the API in the future, so please be aware of that.
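
To give a flavor, an exact GP regression model could look roughly like the sketch below; this follows the linked documentation from memory, so please check the GP docs and notebook for the exact, current API:

```python
import gpytorch
from skorch.probabilistic import ExactGPRegressor


class RbfModule(gpytorch.models.ExactGP):
    """Plain GPyTorch module: constant mean plus an RBF kernel."""

    def __init__(self, likelihood):
        # skorch sets the training data later, hence the None placeholders
        super().__init__(train_inputs=None, train_targets=None, likelihood=likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)


gpr = ExactGPRegressor(RbfModule, optimizer__lr=0.1, max_epochs=20)
# gpr.fit(X_train, y_train); y_pred = gpr.predict(X_test)
```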

Moreover, we introduced some changes to make skorch more customizable. First of all, we changed the signature of some methods so that they no longer assume that the dataset always returns exactly 2 values. This way, it's easier to work with custom datasets that return e.g. 3 values. Normal users should not notice any difference, but if you often create custom nets, take a look at the [migration guide](https://skorch.readthedocs.io/en/latest/user/FAQ.html#migration-from-0-10-to-0-11).
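
Concretely, methods such as `validation_step` now receive the whole batch instead of separate `X` and `y` arguments, so overriding code unpacks the batch itself (a sketch, assuming a dataset that still yields `(X, y)` pairs):

```python
from skorch import NeuralNetClassifier


class MyNet(NeuralNetClassifier):
    # since 0.11, the step methods receive the whole batch instead of X and y
    def validation_step(self, batch, **fit_params):
        X, y = batch  # unpack yourself; custom datasets may return more values
        # ... inspect or modify X and y here ...
        return super().validation_step(batch, **fit_params)
```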

And finally, we made a change to how custom modules, criteria, and optimizers are handled. They are now "first class citizens" in skorch land, which means: if you add a second module to your custom net, it is treated exactly the same as the normal module. E.g., skorch takes care of moving it to CUDA if needed and of switching it to train or eval mode. This way, customizing your network architectures with skorch is easier than ever. Check the [docs](https://skorch.readthedocs.io/en/latest/user/customization.html#initialization-and-custom-modules) for more details.
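
A sketch of what this enables is shown below; it assumes the mechanism described in the linked docs, where additional modules set during initialization under attribute names ending in an underscore are picked up by skorch:

```python
import torch
from skorch import NeuralNet


class TwoModuleNet(NeuralNet):
    """Hypothetical net with a second module next to the usual `module_`."""

    def initialize_module(self):
        super().initialize_module()
        # skorch treats this extra module like the main one: it is moved to the
        # configured device, switched between train/eval mode, and its parameters
        # are included among the learnable parameters during training
        self.decoder_ = torch.nn.Linear(16, 784)
        return self
```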

Since these are some big changes, it's possible that you encounter issues. If that's the case, please check our [issues](https://github.com/skorch-dev/skorch/issues) page or open a new one.

As always, this release was made possible by outside contributors. Many thanks to:

- Autumnii
- Cebtenzzre
- Charles Cabergs
- Immanuel Bayer
- Jake Gardner
- Matthias Pfenninger
- Prabhat Kumar Sahu 

Find below the list of all changes:

Added

- Added `load_best` attribute to `Checkpoint` callback to automatically load the state of the best result at the end of training
- Added a `get_all_learnable_params` method to retrieve the named parameters of all PyTorch modules defined on the net, including those of the criteria, if applicable
- Added `MlflowLogger` callback for logging to MLflow (#769)
- Added `InputShapeSetter` callback for automatically setting the input dimension of the PyTorch module
- Added a new module to support Gaussian Processes through [GPyTorch](https://gpytorch.ai/). To learn more about it, read the [GP documentation](https://skorch.readthedocs.io/en/latest/user/probabilistic.html) or take a look at the [GP notebook](https://nbviewer.jupyter.org/github/skorch-dev/skorch/blob/master/notebooks/Gaussian_Processes.ipynb). This feature is experimental, i.e. the API could be changed in the future in a backwards incompatible way (#782)

Changed

- Changed the signature of `validation_step`, `train_step_single`, `train_step`, `evaluation_step`, `on_batch_begin`, and `on_batch_end` such that instead of receiving `X` and `y`, they receive the whole batch; this makes it easier to deal with datasets that don't strictly return an `(X, y)` tuple, which is true for quite a few PyTorch datasets; please refer to the [migration guide](https://skorch.readthedocs.io/en/latest/user/FAQ.html#migration-from-0-10-to-0-11) if you encounter problems (#699)
- Checking of arguments to `NeuralNet` now happens during `.initialize()`, not during `__init__`, to avoid raising false positives for module or optimizer attributes that are not yet known
- Modules, criteria, and optimizers that are added to a net by the user are now first class: skorch takes care of setting train/eval mode, moving to the indicated device, and updating all learnable parameters during training (check the [docs](https://skorch.readthedocs.io/en/latest/user/customization.html#initialization-and-custom-modules) for more details, #751)
- `CVSplit` is renamed to `ValidSplit` to avoid confusion (#752)

Fixed

- Fixed a few bugs in the `net.history` implementation (#776)
- Fixed a bug in `TrainEndCheckpoint` that prevented it from being unpickled (#773)