Description
To keep xeofs organized as we add more models in the future, I suggest we adjust the API with the next major release as follows:
- Single-set models in
xeofs.single
:EOF
,OPA
,GWPCA
,EOFRotator
,HilbertEOF
, etc. - Cross-set models in
xeofs.cross
:CCA
(for 2 datasets),MCA
,RDA
etc., along with all the corresponding rotators, complex, and Hilbert models. - Multi-set models in
xeofs.multi
:CCA
(for more than two datasets) and possibly other models in the future, e.g. Common EOF analysis.
These models would each be based on different base classes, with nothing in common except the _BaseModel
class for serialization and computing. This structure is already reflected in our documentation.
One specific scenario where this would help is with CCA
. Right now, xeofs has a multi-set CCA
implementation. With the current CPCCA
class, it would be easy to offer a cross-set version of CCA with a more extensive API. The algorithm for cross-CCA is much simpler than for multi-CCA, where I'm still struggling to apply xr.apply_ufunc
properly due to the need to explicitly know the number of input_core_dims
and output_core_dims
- which we generally don't know in advance for multi-set models. I'm sure there's a solution, but it might take time to find one. By splitting the namespaces, we could simply offer two versions: cross.CCA
and multi.CCA
, with different APIs depending on what we can ultimately implement for the multi-version.
Does this make sense to you @slevang ? Any concerns or thoughts on this?