Feature/two way scaling #104
base: develop
Changes from 1 commit
@@ -88,8 +88,8 @@ def twoway_standardize(X, axis=0, with_mean=True, with_std=True, copy=True, max_

 class TwoWayStandardScaler(BaseEstimator, TransformerMixin):
     """Standardize features by removing the mean and scaling to unit variance
-    in both row and column dimensions.
-    This is modeled after StandardScaler in scikit-learn.
+    in both row and column dimensions.
+    This class is modeled after StandardScaler in scikit-learn.
     Read more in the :ref:`User Guide <preprocessing_scaler>`.
     Parameters
     ----------
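For readers unfamiliar with the technique the docstring describes, two-way standardization can be illustrated by alternately standardizing columns and rows until the matrix stabilizes (sometimes called successive normalization). The sketch below is a minimal numpy illustration of that idea only; it is not the package's actual `twoway_standardize` implementation, and the function name is hypothetical:

```python
import numpy as np

def twoway_standardize_sketch(X, max_iter=500, tol=1e-6):
    """Alternately standardize columns then rows until convergence.

    A rough illustration of two-way standardization (successive
    normalization), NOT the package's actual algorithm.
    """
    X = np.asarray(X, dtype=float).copy()
    for _ in range(max_iter):
        X_prev = X.copy()
        # Column step: zero mean, unit variance per column
        X = (X - X.mean(axis=0)) / X.std(axis=0)
        # Row step: zero mean, unit variance per row
        X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
        # Stop once an iteration barely changes the matrix
        if np.max(np.abs(X - X_prev)) < tol:
            break
    return X
```

At convergence the result is (approximately) standardized in both directions at once, which is why a single pass of `StandardScaler` in each direction is not sufficient.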
@@ -123,24 +123,22 @@ class TwoWayStandardScaler(BaseEstimator, TransformerMixin):
     new calls to fit, but increments across ``partial_fit`` calls.
     Examples
     --------
-    >>> from sklearn.preprocessing import StandardScaler
+    >>> from inverse_covariance.clean import TwoWayStandardScaler
     >>>
-    >>> data = [[0, 0], [0, 0], [1, 1], [1, 1]]
+    >>> data = [[1, 0], [1, 0], [2, 1], [2, 1]]
     >>> scaler = StandardScaler()
     >>> print(scaler.fit(data))
     StandardScaler(copy=True, with_mean=True, with_std=True)
     >>> print(scaler.mean_)
-    [ 0.5  0.5]
+    [ 3.0  0.5]
     >>> print(scaler.transform(data))
     [[-1. -1.]
      [-1. -1.]
      [ 1.  1.]
      [ 1.  1.]]
-    >>> print(scaler.transform([[2, 2]]))
-    [[ 3.  3.]]
     See also
     --------
-    scale: Equivalent function without the estimator API.
+    twoway_standardize: Equivalent function without the estimator API.
     :class:`sklearn.preprocessing.StandardScaler`
     :class:`sklearn.decomposition.PCA`
         Further removes the linear correlation across features with 'whiten=True'.
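A reviewer-style note on the doctest above: the import was changed to `TwoWayStandardScaler`, but the example still instantiates plain `StandardScaler`, and the printed mean `[ 3.0 0.5]` does not appear to match the column means of the new data. Replicating `StandardScaler`'s column statistics with plain numpy (to avoid depending on sklearn here) gives `[1.5 0.5]`:

```python
import numpy as np

# The example data from the updated doctest
data = np.array([[1, 0], [1, 0], [2, 1], [2, 1]], dtype=float)

# StandardScaler-style column statistics (population std, ddof=0)
mean_ = data.mean(axis=0)   # -> [1.5 0.5], not [3.0 0.5]
scale_ = data.std(axis=0)   # -> [0.5 0.5]

transformed = (data - mean_) / scale_
print(mean_)
print(transformed)
```

The transformed output `[[-1, -1], [-1, -1], [1, 1], [1, 1]]` does match the doctest, so only the `mean_` line looks inconsistent.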
@@ -151,42 +149,31 @@ class TwoWayStandardScaler(BaseEstimator, TransformerMixin):
     """  # noqa

     def __init__(self, copy=True, with_mean=True, with_std=True):
-        self.with_mean = with_mean
+        """Unlike StandardScaler, with_mean is always set to True, to ensure
+        that two-way standardization is always performed with centering. The
+        argument `with_mean` is retained for the sake of model API compatibility.
+        """
+        self.with_mean = True
         self.with_std = with_std
         self.copy = copy

-    def _reset(self):
-        """Reset internal data-dependent state of the scaler, if necessary.
-        __init__ parameters are not touched.
-        """
-        # Checking one attribute is enough, becase they are all set together
-        # in partial_fit
-        if hasattr(self, 'scale_'):
-            del self.scale_
-            del self.n_samples_seen_
-            del self.mean_
-            del self.var_
-
     def fit(self, X, y=None):
-        """Compute the mean and std to be used for later scaling.
+        """Compute the mean and std for both row and column dimensions.
         Parameters
         ----------
-        X : {array-like, sparse matrix}, shape [n_samples, n_features]
+        X : {array-like}, shape [n_rows, n_cols]
             The data used to compute the mean and standard deviation
-            used for later scaling along the features axis.
-        y : Passthrough for ``Pipeline`` compatibility.
+            along both row and column axes
+        y : Passthrough for ``Pipeline`` compatibility. Input is ignored.
         """
-        # Reset internal state before fitting
-        self._reset()
         return self.partial_fit(X, y)
Review comment: any reason that we can't just take the guts of partial_fit and put it in fit? Since this function is just an interface wrapper over the other with no changes.

Reply: addressing this by removing partial_fit.
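For illustration, the reviewer's suggestion of inlining the work into `fit` could look roughly like the following. This is a hypothetical sketch with assumed attribute names (`col_mean_`, `row_mean_`, etc.); the actual `partial_fit` body in this PR is not shown here, so this only demonstrates the shape of a standalone `fit` that computes statistics in both directions:

```python
import numpy as np

class TwoWayStatsSketch:
    """Hypothetical standalone fit(): compute row and column statistics
    directly, with no delegation to a partial_fit method."""

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Column statistics (the usual StandardScaler quantities)
        self.col_mean_ = X.mean(axis=0)
        self.col_scale_ = X.std(axis=0)
        # Row statistics, needed for the second scaling direction
        self.row_mean_ = X.mean(axis=1)
        self.row_scale_ = X.std(axis=1)
        self.n_samples_seen_ = X.shape[0]
        return self
```

Since two-way standardization has no obvious incremental (mini-batch) formulation, dropping `partial_fit` entirely, as the reply proposes, also removes the need for the `_reset` bookkeeping.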
     def transform(self, X, y='deprecated', copy=None):
         """Perform standardization by centering and scaling
         Parameters
         ----------
-        X : array-like, shape [n_samples, n_features]
+        X : array-like, shape [n_rows, n_cols]
             The data used to scale along the features axis.
         y : (ignored)
             .. deprecated:: 0.19
Review comment: noqa shouldn't be needed for the docstring; please format to 80 chars if possible.