This is a holistic approach to implementing fair outputs at the individual and group level.
FairPut is a lightweight open framework that describes a preferred process at the end of the machine learning pipeline to enhance model fairness. Developers and researchers first follow the usual steps of table processing, table exploration, feature processing, feature extraction, and model validation to obtain the best possible model for a target metric such as sales or profit. The FairPut methodology follows on from this initial process. The aim is to simultaneously enhance model interpretability, robustness, and fairness while maintaining a reasonable level of accuracy. FairPut unifies various recent machine learning constructs in a practical manner. The method is model agnostic, but this particular development instance uses LightGBM.
- Model Respecification
- Protected Values Prediction
- Model Constraints
- Hyperparameter Modelling
- Interpretable Model
- Global Explanations
- Monotonicity Feature Explanations
- Quantitative Validation
- Level Two Monotonicity
- Relationship Analysis
- Partial Dependence (LV1) Monotonicity
- Feature Interactions
- Metrics and Cut-off
- Residual Deviation
- Residual Explanations
- Benchmark Competition
- Adversarial Attack
- Group
- Disparate Error Analysis
- Parity Indicators
- Fair Lending Measures
- Model Agnostic Processing
- Reweighing Preprocessing
- Disparate Impact Preprocessing
- Calibrate Equalized Odds
- Feature Decomposition
- Disparate Error Analysis
- Individual
- Reasoning
- Individual Disparity
- Reasoning Codes
- Example Base
- Prototypical
- Counterfactual
- Contrastive
- Reasoning
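
Two of the group-level steps in the outline above, Parity Indicators and Reweighing Preprocessing, reduce to simple counting. The following is a minimal from-scratch sketch, not the FairPut or AIF360 implementation: it computes a disparate impact ratio (the favourable-outcome rate of one group divided by that of another) and the classic reweighing weights `P(group) * P(outcome) / P(group, outcome)`. The function names and the toy data are hypothetical and exist only for illustration.

```python
from collections import Counter


def disparate_impact(groups, outcomes, unprivileged, privileged, weights=None):
    """Ratio of (weighted) favourable-outcome rates: unprivileged / privileged."""
    if weights is None:
        weights = [1.0] * len(groups)

    def rate(group):
        # Weighted share of favourable outcomes (y == 1) within one group.
        total = sum(w for g, w in zip(groups, weights) if g == group)
        favourable = sum(
            w for g, y, w in zip(groups, outcomes, weights) if g == group and y == 1
        )
        return favourable / total

    return rate(unprivileged) / rate(privileged)


def reweighing_weights(groups, outcomes):
    """One weight per row: P(group) * P(outcome) / P(group, outcome)."""
    n = len(groups)
    group_counts = Counter(groups)
    outcome_counts = Counter(outcomes)
    joint_counts = Counter(zip(groups, outcomes))
    return [
        (group_counts[g] * outcome_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, outcomes)
    ]


# Toy data invented for illustration: group "a" receives the favourable
# outcome (1) three times out of four, group "b" only once out of four.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]

di_before = disparate_impact(groups, outcomes, unprivileged="b", privileged="a")
weights = reweighing_weights(groups, outcomes)
di_after = disparate_impact(groups, outcomes, "b", "a", weights)

print(f"disparate impact before reweighing: {di_before:.3f}")  # 0.333
print(f"disparate impact after reweighing:  {di_after:.3f}")   # 1.000
```

The weights only balance the label rates in the training data; whether a model trained with them also produces balanced predictions still has to be checked with the same indicators on its outputs.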
If you end up using any of the novel techniques, or the framework as a whole, you can cite the following.
BibTeX entry:
@software{fairput,
title = {{FairPut}: Fair Machine Learning Framework},
author = {Snow, Derek},
url = {https://github.com/firmai/fairput/},
version = {1.15},
date = {2020-03-31},
}
Stack: Alibi, AIF360, AIX360, SHAP, PDPbox
- Can the model predict the outcome using just protected values? (Protected Values Prediction)
- Is the model monotonic and are variables randomly selected? (Model Constraints, LV1 & LV2 Monotonicity)
- Is the model explainable? (Model Selection, Feature Interactions)
- Can you explain the predictions globally and locally? (SHAP)
- Does the model perform well? (Metrics)
- What individuals have received the most and least accurate predictions? (Residual Deviation)
- Can you point to the feature responsible for large individual residuals? (Residual Explanations)
- What feature values could potentially be outliers due to their misprediction? (Residual Explanations)
- Do some models perform better at predicting the outcomes for a certain type of individual? (Benchmark Competition)
- Can the model outcome be changed by artificially perturbing certain values of interest? (Adversarial Attack)
- Do certain groups suffer relative to others as measured through group statistics? (Parity Indicators, Fair Lending Measures)
- Can various data and prediction processing techniques improve these group statistics? (Model Agnostic Processing)
- What features are driving the structural differences between groups controlling for demographic factors? (Feature Decomposition)
- What individuals have received the most unfair prediction or treatment by the model? (Individual Disparity)
- Why did the model decide to predict a specific outcome for a particular individual or sub-group of individuals? (Reasoning Codes)
- What individuals are most similar to those receiving unfair treatment and were these individuals treated similarly? (Prototypical)
- What individual is the closest related instance to a sample individual but has a different predicted outcome? (Counterfactual)
- What is the minimal feature perturbation necessary to switch an individual's prediction to another category? (Contrastive)
- What is the maximum perturbation possible while the model prediction remains the same? (Contrastive)