8.1. BikeSharing Data

This notebook demonstrates how to use PiML in its low-code mode to develop machine learning models for the BikeSharing data from the UCI repository, which consists of 17,379 samples of hourly rental-bike counts in the Capital Bikeshare system; see the UCI repository page for details.

The response cnt (the hourly bike rental count) is continuous, so this is a regression problem.

Click the ipynb links to run examples in Google Colab.

8.1.1. Load and Prepare Data

[1]:

from piml import Experiment
exp = Experiment()

[2]:

# Choose BikeSharing
exp.data_loader()

[3]:

# Exclude these features one-by-one: "yr", "mnth", "temp"
exp.data_summary()

[4]:

# Prepare dataset with default settings
exp.data_prepare()
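
Preparing the data includes holding out part of the samples for testing. The following is a minimal sketch of a seeded random split in plain Python, not PiML's implementation; the function name train_test_split_indices, the 20% test ratio, and the seed are illustrative assumptions rather than PiML defaults.

```python
import random


def train_test_split_indices(n_samples, test_ratio=0.2, seed=0):
    """Shuffle row indices with a fixed seed and split them into train/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed makes the split reproducible
    n_test = int(round(n_samples * test_ratio))
    return indices[n_test:], indices[:n_test]


# Hypothetical sample count, chosen only to keep the arithmetic easy to follow.
train_idx, test_idx = train_test_split_indices(n_samples=1000, test_ratio=0.2, seed=0)
print(len(train_idx), len(test_idx))  # 800 200
```

Fixing the seed keeps the held-out set identical across runs, so metrics for different models are computed on the same samples.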

[5]:

# Feature selection: choose the features to use for modeling
exp.feature_select()

[6]:

# Exploratory data analysis, check distribution and correlation
exp.eda()
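
As a rough illustration of the correlation check, the sketch below computes a Pearson correlation in plain Python. It does not reproduce the exp.eda() panel; the helper name pearson_corr and the toy values (made up, not taken from the BikeSharing data) are assumptions.

```python
import math


def pearson_corr(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up toy values: the response rises with the first feature and falls with the second.
temp = [0.2, 0.4, 0.6, 0.8]
other_feature = [0.9, 0.7, 0.5, 0.3]  # hypothetical second feature
cnt = [30.0, 50.0, 70.0, 90.0]

corr_temp = pearson_corr(temp, cnt)
corr_other = pearson_corr(other_feature, cnt)
print(round(corr_temp, 3), round(corr_other, 3))
```

A coefficient near +1 or -1 between two features signals redundancy, which is one reason to exclude a feature before modeling.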

8.1.2. Train Interpretable Models

[7]:

# Choose model(s), customize if needed, click Run to train, then register the trained models one by one.
exp.model_train()

8.1.3. Interpretability and Explainability

[8]:

# Model-specific inherent interpretability: feature importance, main effects, and pairwise interactions.
exp.model_interpret()

[9]:

# Model-agnostic post-hoc explainability: global methods (PFI, PDP, ALE) and local methods (LIME, SHAP)
exp.model_explain()
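
To make one of the global methods concrete, here is a minimal sketch of permutation feature importance (PFI): shuffle one feature column at a time and record how much the error grows. This is a plain-Python illustration under stated assumptions, not the code behind exp.model_explain(); toy_model, the synthetic data, and the helper names are hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the response
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: depends on feature 0 only, ignores feature 1."""
    return 3.0 * row[0]


data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
print(scores)  # feature 0 gets a positive score, feature 1 scores zero
```

Because the toy model ignores the second feature, shuffling it leaves the predictions unchanged and its importance is zero, which is the behavior PFI is meant to expose.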

8.1.4. Model Diagnostics and Outcome Testing

[10]:

# Model diagnostics and outcome testing for the registered models
exp.model_diagnose()
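
Accuracy is typically the first thing to check for a regression model. The sketch below computes MSE, MAE, and R2 by hand; the helper regression_metrics and the toy numbers are assumptions for illustration, and the diagnostics panel itself may report these quantities differently.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    return {"MSE": mse, "MAE": mae, "R2": 1.0 - ss_res / ss_tot}


# Made-up hourly counts and predictions, not taken from the BikeSharing data.
y_true = [16.0, 40.0, 32.0, 13.0]
y_pred = [20.0, 36.0, 30.0, 15.0]

metrics = regression_metrics(y_true, y_pred)
print(metrics["MSE"], metrics["MAE"])  # 10.0 3.0
```

Computing the same metrics on training and test samples separately gives a quick read on overfitting: a large gap suggests the model does not generalize well.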

8.1.5. Model Comparison and Benchmarking

[11]:

# Compare and benchmark the registered models
exp.model_compare()
