8.3. TaiwanCredit Data

This example notebook demonstrates how to use PiML in its low-code mode to develop machine learning models for the TaiwanCredit data from the UCI repository, which consists of 30,000 credit card clients in Taiwan from April 2005 to September 2005; see details here. The data can be loaded directly from PiML and has been slightly preprocessed.

The response `FlagDefault` is binary, so this is a classification problem.

Click the ipynb links to run examples in Google Colab.

8.3.1. Load and Prepare Data

[1]:
from piml import Experiment
exp = Experiment()
[2]:
# Choose TaiwanCredit
exp.data_loader()
[3]:
# Use only the payment history attributes: PAY_1~6, BILL_AMT1~6 and PAY_AMT1~6
# Keep the response `FlagDefault`, while excluding all other variables
exp.data_summary()
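The feature selection above is done interactively in the `data_summary()` panel. As a minimal sketch of the same column filtering with plain pandas, the frame below is a hypothetical stand-in whose column names mirror the real dataset (the values and the extra `LIMIT_BAL` column are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the preprocessed TaiwanCredit data;
# only the column names mirror the real dataset.
df = pd.DataFrame({
    **{f"PAY_{i}": [0, 1, -1] for i in range(1, 7)},
    **{f"BILL_AMT{i}": [3913.0, 2682.0, 29239.0] for i in range(1, 7)},
    **{f"PAY_AMT{i}": [0.0, 1000.0, 1518.0] for i in range(1, 7)},
    "LIMIT_BAL": [20000.0, 120000.0, 90000.0],  # to be excluded
    "FlagDefault": [1, 1, 0],
})

# Keep only the 18 payment-history attributes plus the response.
keep = ([f"PAY_{i}" for i in range(1, 7)]
        + [f"BILL_AMT{i}" for i in range(1, 7)]
        + [f"PAY_AMT{i}" for i in range(1, 7)]
        + ["FlagDefault"])
df = df[keep]
print(df.shape)  # (3, 19)
```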
[4]:
# Prepare dataset with default settings
exp.data_prepare()
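The "default settings" of `data_prepare()` cover, among other things, the train/test split. The exact defaults (split ratio, stratification, seed) are PiML internals not restated here; the sketch below illustrates the idea with scikit-learn on synthetic data of the same shape (18 features, binary response):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 rows, 18 payment-history features, binary response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 18))
y = rng.integers(0, 2, size=100)

# An 80/20 stratified split; treating this as the default is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
print(X_train.shape, X_test.shape)  # (80, 18) (20, 18)
```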
[5]:
exp.eda()
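`exp.eda()` surfaces univariate and bivariate views interactively. A rough non-interactive analogue of the univariate part, on a toy frame (values invented; only the column names echo the dataset), is:

```python
import pandas as pd

# Toy frame; the real EDA runs on the full TaiwanCredit features.
df = pd.DataFrame({
    "PAY_1": [0, 1, -1, 2, 0],
    "BILL_AMT1": [3913.0, 2682.0, 29239.0, 46990.0, 8617.0],
    "FlagDefault": [1, 1, 0, 0, 0],
})

# Univariate summaries and the overall default rate -- the kind of
# information the EDA panel displays graphically.
print(df.describe())
print("default rate:", df["FlagDefault"].mean())  # 0.4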

8.3.2. Train Interpretable Models

[6]:
# Choose EBM and ReLU-DNN; Customize ReLU-DNN with L1_regularization = 0.0008; then register each trained model.
exp.model_train()
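The L1 regularization configured for the ReLU-DNN encourages sparse weights, pruning uninformative connections. As a minimal illustration of that sparsity effect (using scikit-learn's L1-penalized logistic regression as a stand-in, not PiML's ReLU-DNN, and with an invented penalty strength):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data where only the first 3 of 18 features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 18))
y = (X[:, :3].sum(axis=1) > 0).astype(int)

# The L1 penalty drives uninformative coefficients to exactly zero --
# the same sparsity idea behind regularizing the ReLU-DNN, shown here
# with a linear model instead of a network.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.05).fit(X, y)
print("zeroed coefficients:", (model.coef_ == 0).sum())
```

Sweeping the penalty strength (here `C`, inversely proportional to the penalty) trades sparsity against fit, which is exactly the tuning exercise behind choosing `L1_regularization = 0.0008` above.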

8.3.3. Interpretability and Explainability

[7]:
# Model-specific inherent interpretation including feature importance, main effects and pairwise interactions.
exp.model_interpret()
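"Inherent" interpretation means the fitted model itself exposes its structure, rather than requiring a post-hoc surrogate. A minimal analogue on synthetic data, using a shallow decision tree's split-based importances in place of the EBM/ReLU-DNN views shown by the panel:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic data where feature 2 fully determines the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 2] > 0).astype(int)

# An inherently interpretable model exposes importance directly from
# its own structure -- here, the tree's split-based importances.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print("most important feature:", tree.feature_importances_.argmax())  # 2
```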
[8]:
# Model-agnostic post-hoc explanation by Permutation Feature Importance, PDP (1D and 2D) vs. ALE (1D and 2D), LIME vs. SHAP
exp.model_explain()
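Permutation Feature Importance, the first view listed above, is model-agnostic: shuffle one column at a time and measure the drop in score. A self-contained sketch on synthetic data (a random forest stands in for the registered models):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Toy stand-in for a registered model: feature 0 drives the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] > 0).astype(int)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Permute one feature at a time and record the score drop; the feature
# whose permutation hurts most is the most important.
result = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
print("most important feature:", result.importances_mean.argmax())  # 0
```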

8.3.4. Model Diagnostics and Outcome Testing

[9]:
exp.model_diagnose()
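Among other tests, `model_diagnose()` reports holdout performance metrics for a registered model. A minimal sketch of that accuracy/AUC computation with scikit-learn on synthetic data (the model and data here are stand-ins, not the registered EBM or ReLU-DNN):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for TaiwanCredit.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)

# Holdout accuracy and AUC -- two of the headline diagnostics.
acc = accuracy_score(y_te, model.predict(X_te))
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print("ACC:", round(acc, 2), "AUC:", round(auc, 2))
```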

8.3.5. Model Comparison and Benchmarking

[10]:
exp.model_compare()
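Benchmarking boils down to evaluating the registered models side by side on the same holdout split. A stripped-down sketch of that comparison, with two scikit-learn models standing in for the registered EBM and ReLU-DNN:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data shared by both candidate models.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit each model on the same training set, score on the same holdout.
aucs = {}
for name, m in [("logistic", LogisticRegression()),
                ("tree", DecisionTreeClassifier(max_depth=3, random_state=0))]:
    m.fit(X_tr, y_tr)
    aucs[name] = roc_auc_score(y_te, m.predict_proba(X_te)[:, 1])
    print(name, "AUC:", round(aucs[name], 2))
```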