8.1. BikeSharing Data

This notebook demonstrates how to use PiML in its low-code mode to develop machine learning models for the BikeSharing data from the UCI Machine Learning Repository. The dataset consists of 17,389 samples of hourly rental bike counts in the Capital Bikeshare system.

The response cnt (hourly bike rental count) is continuous, so this is a regression problem.

Click the ipynb links to run examples in Google Colab.

8.1.1. Load and Prepare Data

[1]:
from piml import Experiment
exp = Experiment()
[2]:
# Choose BikeSharing
exp.data_loader()
[3]:
# Exclude these features one-by-one: "yr", "mnth", "temp"
exp.data_summary()
[4]:
# Prepare dataset with default settings
exp.data_prepare()
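
The preparation step above splits the data into training and test sets before modeling. As a rough illustration of what such a split involves, here is a minimal standard-library sketch. It is not PiML's implementation; the helper name `train_test_split_indices`, the 0.2 test ratio, and the seed are all assumptions chosen for the example, not confirmed PiML defaults.

```python
import random

def train_test_split_indices(n_samples, test_ratio=0.2, seed=0):
    """Shuffle row indices and split them into train and test index lists.

    Hypothetical helper for illustration only; test_ratio and seed are
    assumed values, not settings read from PiML.
    """
    indices = list(range(n_samples))
    rng = random.Random(seed)      # local RNG so the split is reproducible
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_ratio))
    # The first n_test shuffled indices form the test set, the rest the training set
    return indices[n_test:], indices[:n_test]

# Toy usage with 10 rows
train_idx, test_idx = train_test_split_indices(10, test_ratio=0.2, seed=0)
```

Fixing the seed keeps the split identical across runs, which makes later model comparisons reproducible.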
[5]:
exp.feature_select()
[6]:
# Exploratory data analysis, check distribution and correlation
exp.eda()
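
The correlation check mentioned above can be illustrated with a small standalone sketch. It assumes the Pearson coefficient is the statistic of interest (the low-code panel may offer other options), and the toy values are hypothetical, not taken from the BikeSharing data.

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Unnormalized covariance and variances (the 1/n factors cancel in the ratio)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy values: a perfectly linear relationship
temp = [0.2, 0.4, 0.6, 0.8]
cnt = [10, 20, 30, 40]
r = pearson_correlation(temp, cnt)
```

A coefficient near 1 or -1 between two predictors signals redundancy, which is one reason to exclude features before training.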

8.1.2. Train Interpretable Models

[7]:
# Choose model(s), customize if needed, click Run to train, then register the trained models one by one.
exp.model_train()

8.1.3. Interpretability and Explainability

[8]:
# Model-specific inherent interpretability: feature importance, main effects, and pairwise interactions.
exp.model_interpret()
[9]:
# Model-agnostic post-hoc explainability: global methods (PFI, PDP, ALE) and local methods (LIME, SHAP)
exp.model_explain()
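
Of the global methods listed above, permutation feature importance (PFI) is the simplest to sketch: shuffle one feature column at a time and measure how much the prediction error grows. The following is a minimal standard-library illustration, not PiML's implementation. The function names, the toy model, and the choice of mean squared error as the metric are all assumptions made for the example.

```python
import random

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MSE when each feature column is shuffled.

    predict: callable mapping one row (list of floats) to a prediction.
    X: list of rows, y: list of targets. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy model that depends only on feature 0 and ignores feature 1
def toy_predict(row):
    return 3.0 * row[0]

X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [toy_predict(row) for row in X]
pfi = permutation_importance(toy_predict, X, y)
```

In this toy setup, shuffling the ignored feature leaves the error unchanged, so its importance is zero, while shuffling the feature the model relies on increases the error.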

8.1.4. Model Diagnostics and Outcome Testing

[10]:
exp.model_diagnose()

8.1.5. Model Comparison and Benchmarking

[11]:
exp.model_compare()