The LOCO-MP Python package provides a mostly model-agnostic, distribution-free inference framework for feature importance. It applies to both regression and classification tasks and is computationally efficient and statistically powerful. The motivation for this method is described in Gan, L., Zheng, L., & Allen, G. I. (2022), arXiv:2206.02088. Please visit https://github.com/DataSlingers/LOCOMP_paper for code to reproduce the results from our paper.
LOCOMP 0.1 and later require Python 3.7 or Python 3.8.
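To confirm that the active interpreter matches the versions stated above before installing, a small check along these lines may help. This is only a sketch: the helper name `is_supported_python` and the `SUPPORTED` set are illustrative and are not part of the LOCOMP package.

```python
import sys

# Versions named in this README; hypothetical constant, not part of LOCOMP.
SUPPORTED = {(3, 7), (3, 8)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is one this README lists."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED


if not is_supported_python():
    # Only a warning: the README does not say how other versions behave.
    print("Warning: this README lists Python 3.7 or 3.8; found %d.%d"
          % tuple(sys.version_info[:2]))
```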
Clone this repo and run
python setup.py install
The dependencies of this package will be automatically installed into your environment.
The LOCO-MP function requires a base estimator function. For regression, the base estimator takes (training data matrix X, training response Y, testing data matrix X1) as input and returns the predictions for X1. For classification, it takes (training data matrix X, training response Y) as input and returns the fitted model. Users can import base estimators, including ridge, random forest, and support vector machine, from ML_models. Examples:
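from sklearn.linear_model import LogisticRegression, Ridge

def ridge2(X, Y, X1):
    # Regression: fit on (X, Y) and return predictions for X1.
    clf = Ridge(fit_intercept=False, alpha=0.001).fit(X, Y)
    return clf.predict(X1)

def logitridge(X, Y):
    # Classification: fit on (X, Y) and return the fitted model.
    fit = LogisticRegression(penalty='l2', solver='saga', max_iter=10, C=1000).fit(X, Y)
    return fit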
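The same two calling conventions can be followed when writing a base estimator for another model. The sketch below uses scikit-learn's `RandomForestRegressor` and `SVC`. The function names `rf_regressor` and `svm_classifier`, the hyperparameters, and the synthetic data are illustrative assumptions and are not taken from ML_models. It only checks that each function returns what the convention asks for, and it does not call LOCO-MP itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVC


def rf_regressor(X, Y, X1):
    """Regression convention: fit on (X, Y), return predictions for X1."""
    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, Y)
    return model.predict(X1)


def svm_classifier(X, Y):
    """Classification convention: fit on (X, Y), return the fitted model."""
    return SVC(kernel="rbf").fit(X, Y)


# Synthetic data, used only to check shapes and return types.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 5))
X1 = rng.normal(size=(20, 5))
Y_reg = X[:, 0] + 0.1 * rng.normal(size=80)
Y_clf = (X[:, 0] > 0).astype(int)

pred = rf_regressor(X, Y_reg, X1)   # array with one prediction per row of X1
fitted = svm_classifier(X, Y_clf)   # fitted estimator object

print(pred.shape)
print(hasattr(fitted, "predict"))
```

Whether the fitted classifier also needs to expose class probabilities is not stated here, so check the API documentation before relying on a model that provides only `predict`.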
The detailed API documentation for this package can be found at https://DataSlingers.github.io/LOCOMP
Gan, L., Zheng, L., & Allen, G. I. (2022). Inference for Interpretable Machine Learning: Fast, Model-Agnostic Confidence Intervals for Feature Importance. arXiv preprint arXiv:2206.02088.