Parallelization of SHAP fails on certain models #357

Open
hensontauro opened this issue Mar 3, 2021 · 1 comment
Labels
Blocked (issue is blocked by some bigger issue) · internal-mle · Priority: Medium · Type: Bug (something isn't working) · Type: HPC (high performance extensions: distributed, out-of-core computing etc.)

Comments


hensontauro commented Mar 3, 2021

I have trained an AutoEncoder classifier and I am trying to get the SHAP computations to run on multiple CPUs. However, I get the following exception:

TypeError: can't pickle _thread.RLock objects

Here is the full, reproducible sample code for reference:

import numpy as np
import pandas as pd
import shap
from pyod.models.auto_encoder import AutoEncoder
from alibi.explainers import KernelShap

random_state = np.random.RandomState(42)



# Generating fictional data
N_TRAIN_OBS = 19248
N_TEST_OBS = 60
N_FEATURES = 12
SUMMARY_SIZE = 15
SUMMARY_TYPE = 'KMEANS'

def generate_fake_data(N_TRAIN_OBS, N_TEST_OBS, N_FEATURES,
                      SUMMARY_SIZE, SUMMARY_TYPE):
    feature_dct = {}
    for i in range(N_FEATURES):
        if i == 5:
            feature_dct["F{}".format(i)] = feature_dct["F1"] + random_state.normal(0, 0.1, N_TRAIN_OBS)
        elif i == 10:
            feature_dct["F{}".format(i)] = feature_dct["F8"] + random_state.normal(0, 0.1, N_TRAIN_OBS)
        else:
            feature_dct["F{}".format(i)] = list(random_state.normal(0, 0.2, N_TRAIN_OBS))

    df_train = pd.DataFrame.from_dict(feature_dct)

    feature_dct = {}
    for i in range(N_FEATURES):
        if i == 0 or i == 7:
            print("{} is different".format(i))
            feature_dct["F{}".format(i)] = list(random_state.normal(1, 0.2, N_TEST_OBS))
        elif i == 5:
            feature_dct["F{}".format(i)] = feature_dct["F1"] + random_state.normal(0, 0.1, N_TEST_OBS)
        elif i == 10:
            feature_dct["F{}".format(i)] = feature_dct["F8"] + random_state.normal(0, 0.1, N_TEST_OBS)
        else:
            feature_dct["F{}".format(i)] = list(random_state.normal(0, 0.2, N_TEST_OBS))

    df_test = pd.DataFrame.from_dict(feature_dct)
    
    return df_train, df_test

df_train, df_test = generate_fake_data(N_TRAIN_OBS, N_TEST_OBS, N_FEATURES, SUMMARY_SIZE, SUMMARY_TYPE)

outliers_fraction = 0.001
hidden_neurons=[len(df_train.columns)-1]


# Training the classifier
def train_ANN(df_train, outliers_fraction, hidden_neurons):
    cls_ann = AutoEncoder(epochs=30, hidden_neurons=hidden_neurons, contamination=outliers_fraction, verbose = 0)
    cls_ann.fit(df_train)
    return cls_ann

cls_ann = train_ANN(df_train, outliers_fraction, hidden_neurons)

# Summarise SHAP for faster computation
if SUMMARY_TYPE == 'KMEANS':
    df_train_summary = shap.kmeans(df_train, SUMMARY_SIZE)
else:
    df_train_summary = shap.sample(df_train, nsamples=SUMMARY_SIZE, random_state=random_state)



# Compute SHAP values
explainer_ann = KernelShap(cls_ann.predict_proba, df_train, distributed_opts={'n_cpus': 6}) 
explainer_ann.fit(df_train_summary)
shapvals_ann = explainer_ann.explain(df_test.values)

This code works for other models but fails for AutoEncoder. After some digging, the error seems to originate in the Ray library (which alibi uses for parallelization), and ultimately in the way TensorFlow/Keras models are serialized.
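
As a sanity check, pickling a small compiled Keras model directly (independent of alibi and Ray) reproduces the same error on my setup; the exact message may vary with the Python/TensorFlow version:

import pickle
import tensorflow as tf

# Tiny stand-in for the Keras network that pyod's AutoEncoder builds internally
model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(12,))])
model.compile(optimizer="adam", loss="mse")

try:
    pickle.dumps(model)
except TypeError as e:
    print(e)  # on my setup: can't pickle _thread.RLock objects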

Any ideas on how to solve this properly would be appreciated; in the meantime, the workaround I am considering is sketched below.
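
The idea is to avoid shipping the live Keras model to the workers at all and instead pass a small function that loads a saved copy lazily inside each worker process. This is an untested sketch: the path "ae_model.h5" is hypothetical, it assumes the fitted Keras model is exposed as cls_ann.model_ (and has been saved beforehand with cls_ann.model_.save("ae_model.h5")), and the mapping from reconstruction error to pyod's predict_proba would still have to be re-implemented.

import numpy as np
import tensorflow as tf

MODEL_PATH = "ae_model.h5"  # hypothetical path, model saved beforehand
_model = None               # loaded lazily, once per worker process

def lazy_predict(X):
    """Prediction function that loads the saved model on first use, so only this
    small function (not the live Keras model) has to be serialised by Ray."""
    global _model
    if _model is None:
        _model = tf.keras.models.load_model(MODEL_PATH)
    # NOTE: this returns raw reconstructions; reproducing cls_ann.predict_proba
    # would additionally require mapping reconstruction error to probabilities.
    return _model.predict(np.asarray(X))

# then pass lazy_predict to KernelShap in place of cls_ann.predict_proba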
Thanks

Contributor

jklaise commented Mar 3, 2021

Thanks for highlighting this @hensontauro. We would have to dive deeper into the limitations of pickling tensorflow models and also see if the Ray serialisation approach can be extended to handle them. I have a feeling this may not be a quick fix.
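
For example, one direction might be registering a custom Ray serializer that round-trips Keras models through their config and weights rather than pickle. Very rough sketch, assuming the installed Ray version exposes ray.util.register_serializer and that the predictor wraps a standard sequential/functional Keras model:

import ray
import tensorflow as tf

def serialize_keras_model(model):
    # Represent the model as (architecture JSON, weight arrays) instead of pickling it
    return model.to_json(), model.get_weights()

def deserialize_keras_model(payload):
    config, weights = payload
    model = tf.keras.models.model_from_json(config)
    model.set_weights(weights)
    return model  # compile state is not restored, which is fine for prediction only

# Hypothetical wiring; subclasses such as Sequential may need registering separately
# depending on how Ray matches types, and whether this covers everything alibi ships
# to the workers (e.g. bound methods holding the model) is what we'd need to investigate.
ray.util.register_serializer(
    tf.keras.Model,
    serializer=serialize_keras_model,
    deserializer=deserialize_keras_model,
)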

@jklaise added the Type: Bug and Type: HPC labels on Mar 3, 2021
@jklaise added the Blocked label on Jul 26, 2022