8.3. TaiwanCredit Data

This example notebook demonstrates how to use PiML in its low-code mode to develop machine learning models for the TaiwanCredit data from the UCI repository, which consists of 30,000 credit card clients in Taiwan from April 2005 to September 2005; see details here. The data can be loaded directly from PiML and has been slightly preprocessed.

The response `FlagDefault` is binary, so this is a classification problem.

Click the ipynb links to run examples in Google Colab.

8.3.1. Load and Prepare Data

[1]:
from piml import Experiment
exp = Experiment()
[2]:
# Choose TaiwanCredit
exp.data_loader()
[3]:
# Use only the payment history attributes: PAY_1~6, BILL_AMT1~6 and PAY_AMT1~6
# Keep the response `FlagDefault`, while excluding all other variables
exp.data_summary()
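The feature selection above is done interactively in the `data_summary()` panel. As a minimal sketch of the same column filtering with plain pandas, the frame below is a hypothetical stand-in whose column names mirror the real dataset (the values and the extra `LIMIT_BAL` column are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the preprocessed TaiwanCredit data;
# only the column names mirror the real dataset.
df = pd.DataFrame({
    **{f"PAY_{i}": [0, 1, -1] for i in range(1, 7)},
    **{f"BILL_AMT{i}": [3913.0, 2682.0, 29239.0] for i in range(1, 7)},
    **{f"PAY_AMT{i}": [0.0, 1000.0, 1518.0] for i in range(1, 7)},
    "LIMIT_BAL": [20000.0, 120000.0, 90000.0],  # to be excluded
    "FlagDefault": [1, 1, 0],
})

# Keep only the 18 payment-history attributes plus the response.
keep = ([f"PAY_{i}" for i in range(1, 7)]
        + [f"BILL_AMT{i}" for i in range(1, 7)]
        + [f"PAY_AMT{i}" for i in range(1, 7)]
        + ["FlagDefault"])
df = df[keep]
print(df.shape)  # (3, 19)
```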
[4]:
# Prepare dataset with default settings
exp.data_prepare()
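The "default settings" of `data_prepare()` cover, among other things, the train/test split. The exact defaults (split ratio, stratification, seed) are PiML internals not restated here; the sketch below illustrates the idea with scikit-learn on synthetic data of the same shape (18 features, binary response):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 rows, 18 payment-history features, binary response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 18))
y = rng.integers(0, 2, size=100)

# An 80/20 stratified split; treating this as the default is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
print(X_train.shape, X_test.shape)  # (80, 18) (20, 18)
```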
[5]:
exp.eda()
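`exp.eda()` surfaces univariate and bivariate views interactively. A rough non-interactive analogue of the univariate part, on a toy frame (values invented; only the column names echo the dataset), is:

```python
import pandas as pd

# Toy frame; the real EDA runs on the full TaiwanCredit features.
df = pd.DataFrame({
    "PAY_1": [0, 1, -1, 2, 0],
    "BILL_AMT1": [3913.0, 2682.0, 29239.0, 46990.0, 8617.0],
    "FlagDefault": [1, 1, 0, 0, 0],
})

# Univariate summaries and the overall default rate -- the kind of
# information the EDA panel displays graphically.
print(df.describe())
print("default rate:", df["FlagDefault"].mean())  # 0.4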

8.3.2. Train Interpretable Models

[6]:
# Choose EBM and ReLU-DNN; Customize ReLU-DNN with L1_regularization = 0.0008; then register each trained model.
exp.model_train()
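The L1 regularization configured for the ReLU-DNN encourages sparse weights, pruning uninformative connections. As a minimal illustration of that sparsity effect (using scikit-learn's L1-penalized logistic regression as a stand-in, not PiML's ReLU-DNN, and with an invented penalty strength):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data where only the first 3 of 18 features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 18))
y = (X[:, :3].sum(axis=1) > 0).astype(int)

# The L1 penalty drives uninformative coefficients to exactly zero --
# the same sparsity idea behind regularizing the ReLU-DNN, shown here
# with a linear model instead of a network.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.05).fit(X, y)
print("zeroed coefficients:", (model.coef_ == 0).sum())
```

Sweeping the penalty strength (here `C`, inversely proportional to the penalty) trades sparsity against fit, which is exactly the tuning exercise behind choosing `L1_regularization = 0.0008` above.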

8.3.3. Interpretability and Explainability

[7]:
# Model-specific inherent interpretation including feature importance, main effects and pairwise interactions.
exp.model_interpret()
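"Inherent" interpretation means the fitted model itself exposes its structure, rather than requiring a post-hoc surrogate. A minimal analogue on synthetic data, using a shallow decision tree's split-based importances in place of the EBM/ReLU-DNN views shown by the panel:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data where feature 2 fully determines the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 2] > 0).astype(int)

# An inherently interpretable model exposes importance directly from
# its own structure -- here, the tree's split-based importances.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print("most important feature:", tree.feature_importances_.argmax())  # 2
```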
[8]:
# Model-agnostic post-hoc explanation by Permutation Feature Importance, PDP (1D and 2D) vs. ALE (1D and 2D), LIME vs. SHAP
exp.model_explain()
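Permutation Feature Importance, the first view listed above, is model-agnostic: shuffle one column at a time and measure the drop in score. A self-contained sketch on synthetic data (a random forest stands in for the registered models):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Toy stand-in for a registered model: feature 0 drives the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Permute one feature at a time and record the score drop; the feature
# whose permutation hurts most is the most important.
result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
print("most important feature:", result.importances_mean.argmax())  # 0
```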

8.3.4. Model Diagnostics and Outcome Testing

[9]:
exp.model_diagnose()
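Among other tests, `model_diagnose()` reports holdout performance metrics for a registered model. A minimal sketch of that accuracy/AUC computation with scikit-learn on synthetic data (the model and data here are stand-ins, not the registered EBM or ReLU-DNN):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for TaiwanCredit.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)

# Holdout accuracy and AUC -- two of the headline diagnostics.
acc = accuracy_score(y_te, model.predict(X_te))
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print("ACC:", round(acc, 2), "AUC:", round(auc, 2))
```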

8.3.5. Model Comparison and Benchmarking

[10]:
exp.model_compare()
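Benchmarking boils down to evaluating the registered models side by side on the same holdout split. A stripped-down sketch of that comparison, with two scikit-learn models standing in for the registered EBM and ReLU-DNN:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data shared by both candidate models.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit each model on the same training set, score on the same holdout.
aucs = {}
for name, m in [("logistic", LogisticRegression()),
                ("tree", DecisionTreeClassifier(max_depth=3, random_state=0))]:
    m.fit(X_tr, y_tr)
    aucs[name] = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
    print(name, "AUC:", round(aucs[name], 2))
```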