8.1. BikeSharing Data
This notebook demonstrates how to use PiML in its low-code mode to develop machine learning models for the BikeSharing data from the UCI repository, which consists of 17,389 samples of hourly counts of rental bikes in the Capital Bikeshare system; see the UCI repository page for details.
The response cnt (hourly bike rental counts) is continuous, so this is a regression problem.
Click the ipynb links to run examples in Google Colab.
8.1.1. Load and Prepare Data
[1]:
from piml import Experiment
exp = Experiment()
[2]:
# Choose BikeSharing
exp.data_loader()
[3]:
# Exclude these features one-by-one: "yr", "mnth", "temp"
exp.data_summary()
[4]:
# Prepare dataset with default settings
exp.data_prepare()
[5]:
# Select features for modeling
exp.feature_select()
[6]:
# Exploratory data analysis, check distribution and correlation
exp.eda()
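The correlation check mentioned above can be illustrated outside the PiML widget. The following is a minimal sketch using only the Python standard library and synthetic data, not PiML's own implementation; the names `temp_noisy` and `unrelated` are hypothetical stand-ins and are not columns taken from the dataset.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic data (assumption): one feature, a noisy copy of it, and an
# independent feature. None of these values come from the BikeSharing data.
rng = random.Random(0)
temp = [rng.uniform(0.0, 1.0) for _ in range(500)]
temp_noisy = [t + rng.gauss(0.0, 0.05) for t in temp]
unrelated = [rng.uniform(0.0, 1.0) for _ in range(500)]

corr_strong = pearson(temp, temp_noisy)
corr_weak = pearson(temp, unrelated)
print(f"temp vs temp_noisy: {corr_strong:.3f}")
print(f"temp vs unrelated:  {corr_weak:.3f}")
```

A pair of features with a correlation close to 1 carries largely redundant information, which is one common reason to exclude a feature before training.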
8.1.2. Train Interpretable Models
[7]:
# Choose model(s), customize if needed, click Run to train, then register the trained models one by one.
exp.model_train()
8.1.3. Interpretability and Explainability
[8]:
# Model-specific inherent interpretability: feature importance, main effects and pairwise interactions.
exp.model_interpret()
[9]:
# Model-agnostic post-hoc explainability: global methods (PFI, PDP, ALE) and local methods (LIME, SHAP)
exp.model_explain()
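To make the idea behind permutation feature importance (PFI) concrete, here is a minimal sketch in plain Python. It assumes mean squared error as the score and uses a hypothetical `toy_model` on synthetic data; it illustrates the general technique and is not how PiML computes PFI internally.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Average increase in MSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other columns untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: strong on feature 0, weak on 1, ignores 2."""
    return 3.0 * row[0] + 0.5 * row[1]


# Synthetic data (assumption): three uniform features and a noisy response.
rng = random.Random(1)
X = [[rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(300)]
y = [toy_model(row) + rng.gauss(0.0, 0.1) for row in X]

pfi = permutation_importance(toy_model, X, y)
print([round(v, 4) for v in pfi])
```

Shuffling a feature the model ignores leaves the predictions unchanged, so its importance is zero, while shuffling a feature the model relies on heavily raises the error the most.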
8.1.4. Model Diagnostics and Outcome Testing
[10]:
# Model diagnostics and outcome testing for the registered models
exp.model_diagnose()
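Accuracy metrics are a common starting point when diagnosing a regression model. The sketch below computes MSE, MAE, and R² by hand on a few made-up values. The helper `regression_metrics` is hypothetical and not part of PiML, and the numbers are illustrative rather than results from the BikeSharing data.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE and R^2 for paired true and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    sst = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"mse": sse / n, "mae": mae, "r2": 1.0 - sse / sst}


# Made-up values for illustration only.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]

metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Computing the same metrics on both the training and test splits gives a quick view of the generalization gap.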
8.1.5. Model Comparison and Benchmarking
[11]:
# Compare and benchmark the registered models
exp.model_compare()