8.5. Fairness Simulation Study 2¶
Click the ipynb links to run examples in Google Colab.
8.5.1. Data Description¶
This example demonstrates the use of PiML for fairness testing. We first simulate a credit decisioning dataset with hypothesized features Mortgage, Balance, Amount Past Due, Delinquency Status, Credit Inquiry, Open Trade, and Utilization, as well as demographic features Gender and Race. The response Status is a binary indicator, making this a classification problem.
Sample size: 20,000; number of columns: 10.
Features:
Mortgage (Numerical): Applicant's mortgage size.
Balance (Numerical): Average credit card balance over the last 12 months.
Amount Past Due (Numerical): The minimum required payment that was not applied to the account as of the last payment due date.
Delinquency Status (Ordinal): 0: current, 1: less than 30 days delinquent, 2: 30-60 days delinquent, 3: 60-90 days delinquent, and so on.
Credit Inquiry (Ordinal): Number of credit inquiries in the last 12 months.
Open Trade (Ordinal): Number of open credit accounts.
Utilization (Numerical): Credit utilization in percent, i.e., the sum of all balances divided by the sum of the credit limits.
Demographic Features:
Demographic features in the credit data cannot be used for modeling.
Gender (Categorical): Two gender groups, encoded as 0.0 and 1.0.
Race (Categorical): Two race groups, encoded as 0.0 and 1.0.
Target Response:
Status (Categorical): 0: default (should not be approved); 1: non-default (should be approved). The 0/1 ratio is nearly 1:5.
8.5.2. Load and Prepare data¶
[2]:
from piml import Experiment
exp = Experiment()
[3]:
# Choose the SimuCredit dataset in the data loader panel
exp.data_loader()
[4]:
# Exclude the demographic variables "Gender" and "Race" one by one;
# excluded features are shown in grey in the table.
exp.data_summary()
[5]:
# Prepare dataset with Test Ratio = 0.2
exp.data_prepare()
[6]:
# Exploratory Data Analysis
exp.eda()
8.5.3. Train ML Model(s)¶
[7]:
# Train and register GLM and XGB-depth2 models
exp.model_train()
[8]:
# Manually train and register XGB-depth7
from xgboost import XGBClassifier
exp.model_train(XGBClassifier(max_depth=7), name='XGB7')
8.5.4. Fairness Testing¶
First, select a registered model (in this case, XGB2).
Group Setting:
Set Add Category = “Gender”, select “1.0” as reference, select “0.0” as protected, then click “Add”.
Set Add Category = “Race”, select “1.0” as reference, select “0.0” as protected, then click “Add”.
Distribution Shift:
Set the distance metric to "PSI" to evaluate the distribution shift between the samples of the reference groups and the protected groups; a minimal sketch of the PSI formula follows below.
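For reference, PSI compares the binned distribution of a feature between the two groups. Below is a minimal sketch of the standard PSI formula, assuming ref and prot are arrays holding one feature's values for the reference and protected samples (hypothetical inputs; the panel above computes this for you).

import numpy as np

def psi(ref, prot, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of a single feature."""
    # Shared equal-width bins derived from the pooled sample
    edges = np.histogram_bin_edges(np.concatenate([ref, prot]), bins=n_bins)
    p = np.histogram(ref, bins=edges)[0] / len(ref)      # reference bin proportions
    q = np.histogram(prot, bins=edges)[0] / len(prot)    # protected bin proportions
    p, q = np.clip(p, eps, None), np.clip(q, eps, None)  # guard against log(0)
    return float(np.sum((p - q) * np.log(p / q)))

A common rule of thumb reads PSI below 0.1 as a negligible shift and above 0.25 as a substantial one.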
[9]:
exp.model_fairness()
Metrics Tab:
Select a metric (AIR by default) and set its threshold (e.g., 0.8).
Set the favorable threshold (0.5 by default) and the favorable class (1 or 0); a sketch of how AIR is computed follows below.
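AIR (adverse impact ratio) is the acceptance rate of the protected group divided by that of the reference group, where a case counts as accepted when the predicted probability of the favorable class exceeds the favorable threshold. A minimal sketch, assuming proba holds the model's predicted probabilities of the favorable class and group holds a demographic column (hypothetical variable names):

import numpy as np

def air(proba, group, protected=0.0, reference=1.0, favorable_threshold=0.5):
    """Adverse impact ratio: protected acceptance rate / reference acceptance rate."""
    accepted = np.asarray(proba) >= favorable_threshold   # favorable (approve) decisions
    group = np.asarray(group)
    rate_protected = accepted[group == protected].mean()  # acceptance rate, protected group
    rate_reference = accepted[group == reference].mean()  # acceptance rate, reference group
    return rate_protected / rate_reference

The 0.8 threshold corresponds to the four-fifths rule: an AIR below 0.8 is flagged as potential disparate impact.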
[10]:
exp.model_fairness()
Segmented Metrics:
Select Balance as the segment feature and AIR as the metric, and set the metric threshold.
If the segment feature is numerical, set the number of bins (5 by default).
We can see that the higher the Balance segment, the lower the AIR for both Gender and Race; a sketch of this segmented calculation follows below.
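The segmented view simply repeats the AIR calculation within bins of the segment feature. A minimal pandas sketch, reusing the hypothetical air helper defined above and assuming df is a test frame that contains a pred_proba column (hypothetical name):

import pandas as pd

def segmented_air(df, segment_feature="Balance", group_col="Gender",
                  protected=0.0, reference=1.0, n_bins=5, threshold=0.5):
    """AIR within each quantile bin of a numerical segment feature."""
    bins = pd.qcut(df[segment_feature], q=n_bins, duplicates="drop")
    return df.groupby(bins, observed=True).apply(
        lambda seg: air(seg["pred_proba"], seg[group_col],
                        protected, reference, threshold)
    )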
[11]:
exp.model_fairness()
Debiasing/unfairness mitigation by Feature Binning:
Select a fairness metric (AIR by default) and a performance metric (e.g., F1).
Select an attribute (Balance), a binning method (Quantile by default), and the number of bins (5 by default).
Click the "ADD" button to apply the binning setting to the data.
The last two steps can be repeated for different attributes.
Clicking the "CLEAR ALL" button removes all binning records.
A rough sketch of what binning does follows below.
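Conceptually, binning coarsens the chosen attribute so that the model can no longer exploit its fine-grained variation. Below is a rough illustration under assumptions: Balance is quantile-binned into 5 levels and the model is retrained; train_df is a hypothetical training frame that already excludes Gender and Race, and the panel performs the equivalent operation for you.

import pandas as pd
from xgboost import XGBClassifier

# Replace the continuous Balance with its quantile-bin index (0-4)
train_binned = train_df.copy()
train_binned["Balance"] = pd.qcut(train_df["Balance"], q=5,
                                  labels=False, duplicates="drop")

# Retrain on the coarsened feature, then re-check AIR and F1
# on an identically binned test set
debiased_model = XGBClassifier(max_depth=2)
debiased_model.fit(train_binned.drop(columns=["Status"]), train_binned["Status"])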
[12]:
exp.model_fairness()
Debiasing/unfairness mitigation by Threshold Adjustment:
Select a fairness metric (AIR by default) and a performance metric (ACC by default).
Set the favorable threshold and the favorable class.
The number of candidate threshold values is 20 (the low-code default).
Check the fairness and performance metrics across the varying thresholds, as sketched below.
For this model, choosing a threshold of 0.37 yields both good fairness and good performance.
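The threshold adjustment search is a plain sweep: for each candidate favorable threshold, recompute the fairness and performance metrics and look for a point where both are acceptable. A minimal sketch over 20 candidate thresholds, reusing the hypothetical air helper and assuming proba, y_true, and group are test-set arrays (hypothetical names):

import numpy as np
from sklearn.metrics import accuracy_score

# Scan 20 candidate favorable thresholds and report AIR versus accuracy
for t in np.linspace(0.05, 0.95, 20):
    pred = (proba >= t).astype(int)   # favorable class = 1
    print(f"threshold={t:.2f}  "
          f"AIR={air(proba, group, favorable_threshold=t):.3f}  "
          f"ACC={accuracy_score(y_true, pred):.3f}")

Reading this grid amounts to picking a threshold where AIR clears its limit with little loss of accuracy.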
[13]:
exp.model_fairness()
8.5.5. Fairness Testing Comparison¶
First, select the registered models to compare (in this case, GLM, XGB2, and XGB7).
Group Setting:
Set Add Category = “Gender”, select “1.0” as reference, select “0.0” as protected, then click “Add”.
Set Add Category = “Race”, select “1.0” as reference, select “0.0” as protected, then click “Add”.
Metrics Tab:
Select a metric (AIR by default) and set its threshold (e.g., 0.8).
Set the favorable threshold (0.5 by default) and the favorable class (1 or 0); a sketch of a side-by-side AIR comparison follows below.
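The comparison panel reports the same metrics for several registered models side by side. A rough sketch of the idea, reusing the hypothetical air helper and assuming fitted classifiers glm_model, xgb2_model, and xgb7_model plus test arrays test_x and group (all hypothetical names):

# Compare AIR across models at the same favorable threshold (0.5)
models = {"GLM": glm_model, "XGB2": xgb2_model, "XGB7": xgb7_model}

for name, model in models.items():
    proba = model.predict_proba(test_x)[:, 1]   # probability of the favorable class (1)
    print(f"{name}: AIR (Gender) = {air(proba, group):.3f}")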
[14]:
exp.model_fairness_compare()
Segmented Metrics:
Select the segment feature and the metric, and set the metric threshold.
If the segment feature is numerical, set the number of bins (5 by default).
[15]:
exp.model_fairness_compare()