WeakSpot: Regression

Experiment initialization and data preparation

from piml import Experiment
from piml.models import XGB2Regressor

exp = Experiment()
exp.data_loader(data="BikeSharing", silent=True)
exp.data_summary(feature_exclude=["yr", "mnth", "temp"], silent=True)
exp.data_prepare(target="cnt", task_type="regression", silent=True)

Train Model

exp.model_train(model=XGB2Regressor(), name="XGB2")

Histogram-based weakspot for a single feature

results = exp.model_diagnose(model="XGB2", show="weakspot", slice_method="histogram",
                             slice_features=["hr"], threshold=1.1, min_samples=100,
                             return_data=True, figsize=(5, 4))
results.data
Weak Regions
   [hr  hr)  #Test  #Train  test_MSE  train_MSE     Gap
0  0.3  0.4    445    1736    0.0219     0.0197  0.0022
1  0.7  0.8    290    1168    0.0307     0.0304  0.0003
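
To make the idea behind histogram slicing concrete, here is a minimal pure-Python sketch. It is not PiML's implementation. It assumes that a bin counts as weak when its mean loss exceeds `threshold` times the overall mean loss and it holds at least `min_samples` points. The function name `histogram_weakspots`, the `n_bins` parameter, and the synthetic data are all hypothetical.

```python
def histogram_weakspots(x, losses, n_bins=10, threshold=1.1, min_samples=100):
    """Return (lower, upper, count, mean_loss) for each weak equal-width bin.

    Assumption (not taken from PiML): a bin is weak when its mean loss is
    greater than threshold * overall mean loss and it has enough samples.
    """
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("feature is constant; nothing to slice")
    width = (hi - lo) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for xi, li in zip(x, losses):
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((xi - lo) / width), n_bins - 1)
        sums[idx] += li
        counts[idx] += 1
    overall = sum(losses) / len(losses)
    weak = []
    for b in range(n_bins):
        if counts[b] < min_samples:
            continue  # too few samples to trust the bin's metric
        mean_loss = sums[b] / counts[b]
        if mean_loss > threshold * overall:
            weak.append((lo + b * width, lo + (b + 1) * width, counts[b], mean_loss))
    return weak


# Synthetic data: a feature scaled to [0, 1] whose per-sample squared error
# is elevated for 0.32 <= x < 0.38.
x = [i / 1000 for i in range(1001)]
losses = [0.05 if 0.32 <= xi < 0.38 else 0.01 for xi in x]

weak = histogram_weakspots(x, losses, n_bins=10, threshold=1.1, min_samples=50)
for lower, upper, count, mean_loss in weak:
    print(f"[{lower:.1f}, {upper:.1f})  n={count}  mean loss={mean_loss:.4f}")
```

The sketch reports each weak bin separately. Whether and how the library merges neighbouring bins is not covered here.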


Histogram-based weakspot for two features

results = exp.model_diagnose(model="XGB2", show="weakspot", slice_method="histogram",
                             slice_features=["hr", "workingday"], threshold=1.1, min_samples=100,
                             return_data=True, figsize=(5, 4))
results.data
Weak Regions
   [hr  hr)  [workingday  workingday)  #Test  #Train  test_MSE  train_MSE          Gap
0  0.3  0.4          0.0          1.0    445    1736    0.0219     0.0197   2.2042e-03
1  0.5  0.6          0.0          0.5     85     377    0.0219     0.0207   1.1186e-03
2  0.7  0.8          0.0          1.0    290    1168    0.0307     0.0304   3.0630e-04
3  0.6  0.7          0.0          0.5    155     538    0.0159     0.0158   9.2337e-05
4  0.4  0.5          0.0          0.5     97     365    0.0103     0.0110  -6.7680e-04
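
The Gap column appears to be the test metric minus the train metric, with rows ordered by decreasing gap. The short check below recomputes the gap from the rounded values printed in the table above. The tuple layout and variable names are illustrative only. Small differences from the reported gap are expected because the table shows the metrics rounded to four decimals.

```python
# Rows copied from the two-feature table above:
# (hr interval, test_MSE, train_MSE, reported Gap)
rows = [
    ("[0.3, 0.4)", 0.0219, 0.0197, 2.2042e-03),
    ("[0.5, 0.6)", 0.0219, 0.0207, 1.1186e-03),
    ("[0.7, 0.8)", 0.0307, 0.0304, 3.0630e-04),
    ("[0.6, 0.7)", 0.0159, 0.0158, 9.2337e-05),
    ("[0.4, 0.5)", 0.0103, 0.0110, -6.7680e-04),
]

# Recompute the gap from the rounded metrics.
recomputed = [test - train for _, test, train, _ in rows]
reported = [gap for _, _, _, gap in rows]

for (interval, _, _, gap), value in zip(rows, recomputed):
    print(f"{interval}: recomputed {value:+.4f}, reported {gap:+.4e}")

# The reported gaps are already in descending order.
is_sorted_desc = reported == sorted(reported, reverse=True)
print("sorted by decreasing gap:", is_sorted_desc)
```

A negative gap, as in the last row, means the region's test error is lower than its train error.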


Histogram-based weakspot for a single feature on test set

results = exp.model_diagnose(model="XGB2", show="weakspot", slice_method="histogram",
                             slice_features=["hr"], threshold=1.1, min_samples=100,
                             use_test=True, return_data=True, figsize=(5, 4))
results.data
Weak Regions
   [hr  hr)  #Test  #Train  test_MSE  train_MSE     Gap
0  0.3  0.4    445    1736    0.0219     0.0197  0.0022
1  0.7  0.8    290    1168    0.0307     0.0304  0.0003


Histogram-based weakspot for a single feature using MAE metric

results = exp.model_diagnose(model="XGB2", show="weakspot", slice_method="histogram",
                             slice_features=["hr"], threshold=1.1, min_samples=100,
                             metric="MAE", return_data=True, figsize=(5, 4))
results.data
Weak Regions
   [hr  hr)  #Test  #Train  test_MAE  train_MAE     Gap
0  0.3  0.4    445    1736    0.1163     0.1106  0.0058
1  0.6  0.8    735    2911    0.1046     0.1031  0.0015
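
Switching the metric can change which regions are flagged, because MSE weights large errors more heavily than MAE does. The toy example below illustrates this on made-up residuals. It assumes, as an illustration rather than a statement about PiML internals, that a region is compared against `threshold` times the overall metric.

```python
def mse(errors):
    """Mean squared error of a list of residuals."""
    return sum(e * e for e in errors) / len(errors)


def mae(errors):
    """Mean absolute error of a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical residuals: one region with mostly small errors plus a single
# large outlier, and the remaining samples with moderate, uniform errors.
region = [0.1] * 9 + [1.0]
rest = [0.2] * 90
everything = region + rest

threshold = 1.1
mse_ratio = mse(region) / mse(everything)  # roughly 2.3
mae_ratio = mae(region) / mae(everything)  # roughly 0.95

print(f"MSE ratio {mse_ratio:.2f} -> weak: {mse_ratio > threshold}")
print(f"MAE ratio {mae_ratio:.2f} -> weak: {mae_ratio > threshold}")
```

Here the single outlier is enough to flag the region under MSE, while under MAE the same region looks slightly better than average.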


Tree-based weakspot for a single feature using MAE metric

results = exp.model_diagnose(model="XGB2", show="weakspot", slice_method="tree",
                             slice_features=["hr"], threshold=1.1, min_samples=100,
                             metric="MAE", return_data=True, figsize=(5, 4))
results.data
Weak Regions
      [hr     hr)  #Test  #Train  test_MAE  train_MAE     Gap
0  0.2826  0.3696    283    1171    0.1495     0.1352  0.0143
1  0.3696  1.0000   2214    8710    0.0749     0.0747  0.0002
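
Unlike equal-width bins, tree-based slicing lets the data choose the region boundaries, which is why the intervals above are not multiples of 0.1. The sketch below shows the general idea with a small regression tree fitted to per-sample losses. It is a generic illustration, not PiML's algorithm. The function names, the `max_depth` parameter, the weak-region rule (mean loss above `threshold` times the overall mean), and the synthetic data are all assumptions.

```python
def best_split(points, min_samples):
    """points: list of (x, loss) sorted by x.

    Return the index i that splits points into points[:i] and points[i:]
    with the lowest total squared error, or None if no valid split helps.
    """
    n = len(points)
    total = sum(l for _, l in points)
    total_sq = sum(l * l for _, l in points)
    best_i, best_sse = None, total_sq - total * total / n  # SSE without a split
    left_sum = left_sq = 0.0
    for i in range(1, n):
        loss = points[i - 1][1]
        left_sum += loss
        left_sq += loss * loss
        if i < min_samples or n - i < min_samples:
            continue  # each leaf must keep at least min_samples points
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical feature values
        right_sum, right_sq = total - left_sum, total_sq - left_sq
        sse = (left_sq - left_sum ** 2 / i) + (right_sq - right_sum ** 2 / (n - i))
        if sse < best_sse - 1e-12:
            best_i, best_sse = i, sse
    return best_i


def tree_regions(points, min_samples, max_depth):
    """Return leaves as (min_x, max_x, count, mean_loss), ordered by x."""
    i = best_split(points, min_samples) if max_depth > 0 else None
    if i is None:
        mean_loss = sum(l for _, l in points) / len(points)
        return [(points[0][0], points[-1][0], len(points), mean_loss)]
    return (tree_regions(points[:i], min_samples, max_depth - 1)
            + tree_regions(points[i:], min_samples, max_depth - 1))


# Synthetic data: absolute error is elevated for 0.3 <= x < 0.4.
data = sorted((i / 1000, 0.05 if 300 <= i < 400 else 0.01) for i in range(1000))
leaves = tree_regions(data, min_samples=50, max_depth=2)

threshold = 1.1
overall = sum(l for _, l in data) / len(data)
weak_leaves = [leaf for leaf in leaves if leaf[3] > threshold * overall]
for lower, upper, count, mean_loss in weak_leaves:
    print(f"x in [{lower:.3f}, {upper:.3f}]  n={count}  mean loss={mean_loss:.4f}")
```

The leaves here are reported with their observed minimum and maximum feature values rather than half-open intervals.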


