Parallelization of SHAP fails on certain models #357

Open
hensontauro opened this issue Mar 3, 2021 · 1 comment
Labels
Blocked (issue is blocked by some bigger issue) · internal-mle · Priority: Medium · Type: Bug (something isn't working) · Type: HPC (high performance extensions: distributed, out-of-core computing etc.)

Comments


hensontauro commented Mar 3, 2021

I have trained an AutoEncoder classifier and I am trying to get the SHAP computations to run on multiple CPUs. However, I get the following exception:

TypeError: can't pickle _thread.RLock objects

Here is the full, reproducible sample code for reference:

import numpy as np
import pandas as pd
import shap
from pyod.models.auto_encoder import AutoEncoder
from alibi.explainers import KernelShap

random_state = np.random.RandomState(42)



# Generating fictional data
N_TRAIN_OBS = 19248
N_TEST_OBS = 60
N_FEATURES = 12
SUMMARY_SIZE = 15
SUMMARY_TYPE = 'KMEANS'

def generate_fake_data(N_TRAIN_OBS, N_TEST_OBS, N_FEATURES,
                      SUMMARY_SIZE, SUMMARY_TYPE):
    feature_dct = {}
    for i in range(N_FEATURES):
        if i == 5:
            feature_dct["F{}".format(i)] = feature_dct["F1"] + random_state.normal(0, 0.1, N_TRAIN_OBS)
        elif i == 10:
            feature_dct["F{}".format(i)] = feature_dct["F8"] + random_state.normal(0, 0.1, N_TRAIN_OBS)
        else:
            feature_dct["F{}".format(i)] = list(random_state.normal(0, 0.2, N_TRAIN_OBS))

    df_train = pd.DataFrame.from_dict(feature_dct)

    feature_dct = {}
    for i in range(N_FEATURES):
        if i == 0 or i == 7:
            print("{} is different".format(i))
            feature_dct["F{}".format(i)] = list(random_state.normal(1, 0.2, N_TEST_OBS))
        elif i == 5:
            feature_dct["F{}".format(i)] = feature_dct["F1"] + random_state.normal(0, 0.1, N_TEST_OBS)
        elif i == 10:
            feature_dct["F{}".format(i)] = feature_dct["F8"] + random_state.normal(0, 0.1, N_TEST_OBS)
        else:
            feature_dct["F{}".format(i)] = list(random_state.normal(0, 0.2, N_TEST_OBS))

    df_test = pd.DataFrame.from_dict(feature_dct)
    
    return df_train, df_test

df_train, df_test = generate_fake_data(N_TRAIN_OBS, N_TEST_OBS, N_FEATURES, SUMMARY_SIZE, SUMMARY_TYPE)

outliers_fraction = 0.001
hidden_neurons=[len(df_train.columns)-1]


# Training the classifier
def train_ANN(df_train, outliers_fraction, hidden_neurons):
    cls_ann = AutoEncoder(epochs=30, hidden_neurons=hidden_neurons, contamination=outliers_fraction, verbose = 0)
    cls_ann.fit(df_train)
    return cls_ann

cls_ann = train_ANN(df_train, outliers_fraction, hidden_neurons)

# Summarise SHAP for faster computation
if SUMMARY_TYPE == 'KMEANS':
    df_train_summary = shap.kmeans(df_train, SUMMARY_SIZE)
else:
    df_train_summary = shap.sample(df_train, nsamples=SUMMARY_SIZE, random_state=random_state)



# Compute SHAP values
explainer_ann = KernelShap(cls_ann.predict_proba, df_train, distributed_opts={'n_cpus': 6}) 
explainer_ann.fit(df_train_summary)
shapvals_ann = explainer_ann.explain(df_test.values)

This code works for other models but fails for AutoEncoder. After some digging, the error seems to originate in the Ray library (which alibi uses for parallelization), and ultimately in the way TensorFlow/Keras models are serialized.
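
As a sanity check, pickling a small compiled Keras model directly (independent of alibi and Ray) reproduces the same error on my setup; the exact message may vary with the Python/TensorFlow version:

import pickle
import tensorflow as tf

# Tiny stand-in for the Keras network that pyod's AutoEncoder builds internally
model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(12,))])
model.compile(optimizer="adam", loss="mse")

try:
    pickle.dumps(model)
except TypeError as e:
    print(e)  # on my setup: can't pickle _thread.RLock objects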

Any ideas on how to solve this properly would be appreciated; in the meantime, the workaround I am considering is sketched below.
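
The idea is to avoid shipping the live Keras model to the workers at all and instead pass a small function that loads a saved copy lazily inside each worker process. This is an untested sketch: the path "ae_model.h5" is hypothetical, it assumes the fitted Keras model is exposed as cls_ann.model_ (and has been saved beforehand with cls_ann.model_.save("ae_model.h5")), and the mapping from reconstruction error to pyod's predict_proba would still have to be re-implemented.

import numpy as np
import tensorflow as tf

MODEL_PATH = "ae_model.h5"  # hypothetical path, model saved beforehand
_model = None               # loaded lazily, once per worker process

def lazy_predict(X):
    """Prediction function that loads the saved model on first use, so only this
    small function (not the live Keras model) has to be serialised by Ray."""
    global _model
    if _model is None:
        _model = tf.keras.models.load_model(MODEL_PATH)
    # NOTE: this returns raw reconstructions; reproducing cls_ann.predict_proba
    # would additionally require mapping reconstruction error to probabilities.
    return _model.predict(np.asarray(X))

# then pass lazy_predict to KernelShap in place of cls_ann.predict_proba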
Thanks

Contributor

jklaise commented Mar 3, 2021

Thanks for highlighting this @hensontauro. We would have to dive deeper into the limitations of pickling tensorflow models and also see if the Ray serialisation approach can be extended to handle them. I have a feeling this may not be a quick fix.
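
For example, one direction might be registering a custom Ray serializer that round-trips Keras models through their config and weights rather than pickle. Very rough sketch, assuming the installed Ray version exposes ray.util.register_serializer and that the predictor wraps a standard sequential/functional Keras model:

import ray
import tensorflow as tf

def serialize_keras_model(model):
    # Represent the model as (architecture JSON, weight arrays) instead of pickling it
    return model.to_json(), model.get_weights()

def deserialize_keras_model(payload):
    config, weights = payload
    model = tf.keras.models.model_from_json(config)
    model.set_weights(weights)
    return model  # compile state is not restored, which is fine for prediction only

# Hypothetical wiring; subclasses such as Sequential may need registering separately
# depending on how Ray matches types, and whether this covers everything alibi ships
# to the workers (e.g. bound methods holding the model) is what we'd need to investigate.
ray.util.register_serializer(
    tf.keras.Model,
    serializer=serialize_keras_model,
    deserializer=deserialize_keras_model,
)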

@jklaise added the Type: Bug and Type: HPC labels on Mar 3, 2021
@jklaise added the Blocked label on Jul 26, 2022