Register sklearn Style Models

Assume we have sklearn-style models that were fitted outside the PiML workflow.
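Here "sklearn style" is taken to mean an estimator object that follows scikit-learn's convention: a `fit(X, y)` method that returns the estimator itself and a `predict(X)` method. The sketch below is a dependency-free illustration of that convention only. `MeanRegressor` is a hypothetical name invented for this example, and whether PiML requires anything beyond these two methods is an assumption not confirmed here.

```python
class MeanRegressor:
    """Hypothetical toy estimator that always predicts the training mean."""

    def fit(self, X, y):
        # Learn the single parameter: the mean of the training targets.
        self.mean_ = sum(y) / len(y)
        return self  # scikit-learn convention: fit returns the estimator

    def predict(self, X):
        # Return one prediction per input row.
        return [self.mean_ for _ in X]


model = MeanRegressor().fit([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0])
predictions = model.predict([[5.0], [6.0]])
print(predictions)
```

XGBoost's `XGBRegressor`, used in the rest of this example, exposes the same `fit` and `predict` methods.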

For demonstration, we fit two models using XGBoost’s sklearn API: a shallow one (max_depth=2) and a deeper one (max_depth=7).

from xgboost import XGBRegressor
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load the California housing data and hold out 20% of it for testing.
data = fetch_california_housing()
train_x, test_x, train_y, test_y = train_test_split(data.data, data.target, test_size=0.2)
feature_names = data.feature_names
target_name = data.target_names[0]

# Shallow model: trees of depth 2.
xgb2 = XGBRegressor(max_depth=2, n_estimators=100)
xgb2.fit(train_x, train_y)

# Deeper model: trees of depth 7.
xgb7 = XGBRegressor(max_depth=7, n_estimators=100)
xgb7.fit(train_x, train_y)

XGBRegressor(base_score=None, booster=None, callbacks=None,
             colsample_bylevel=None, colsample_bynode=None,
             colsample_bytree=None, early_stopping_rounds=None,
             enable_categorical=False, eval_metric=None, feature_types=None,
             gamma=None, gpu_id=None, grow_policy=None, importance_type=None,
             interaction_constraints=None, learning_rate=None, max_bin=None,
             max_cat_threshold=None, max_cat_to_onehot=None,
             max_delta_step=None, max_depth=7, max_leaves=None,
             min_child_weight=None, missing=nan, monotone_constraints=None,
             n_estimators=100, n_jobs=None, num_parallel_tree=None,
             predictor=None, random_state=None, ...)


Load PiML and register the fitted models

from piml import Experiment
exp = Experiment(highcode_only=True)

pipeline_xgb2 = exp.make_pipeline(model=xgb2,
                                  train_x=train_x,
                                  train_y=train_y.ravel(),
                                  test_x=test_x,
                                  test_y=test_y.ravel(),
                                  feature_names=feature_names,
                                  target_name=target_name)
exp.register(pipeline_xgb2, "XGB-External-2")

pipeline_xgb7 = exp.make_pipeline(model=xgb7,
                                  train_x=train_x,
                                  train_y=train_y.ravel(),
                                  test_x=test_x,
                                  test_y=test_y.ravel(),
                                  feature_names=feature_names,
                                  target_name=target_name)
exp.register(pipeline_xgb7, "XGB-External-7")

Check the performance of the depth-2 model

exp.model_diagnose(model="XGB-External-2", show="accuracy_table")

          MSE     MAE       R2

Train  0.2509  0.3531   0.8112
Test   0.2857  0.3691   0.7872
Gap    0.0348  0.0160  -0.0240
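The Gap row appears to be the test value minus the train value for each metric, which is consistent with every number in the table above. A minimal check using the reported values (the dictionary names are illustrative only):

```python
# Values copied from the accuracy table for "XGB-External-2".
train = {"MSE": 0.2509, "MAE": 0.3531, "R2": 0.8112}
test = {"MSE": 0.2857, "MAE": 0.3691, "R2": 0.7872}

# Gap = test metric minus train metric, rounded to four decimals like the table.
gap = {name: round(test[name] - train[name], 4) for name in train}
print(gap)
```

A positive gap on MSE and MAE and a negative gap on R2 both mean the model does worse on the test set than on the training set.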

Check the performance of the depth-7 model

exp.model_diagnose(model="XGB-External-7", show="accuracy_table")

          MSE     MAE       R2

Train  0.0508  0.1594   0.9617
Test   0.2363  0.3158   0.8240
Gap    0.1855  0.1563  -0.1377
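The three columns are the usual regression metrics: mean squared error, mean absolute error, and the coefficient of determination. The sketch below computes them by hand on a small made-up data set to show what each column measures. The function name and the data are hypothetical, and PiML's internal implementation is not shown in this example.

```python
def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # R2 compares the residual sum of squares with the variance of the targets.
    mean_true = sum(y_true) / n
    total_ss = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / total_ss
    return mse, mae, r2


# Made-up targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
mse, mae, r2 = regression_metrics(y_true, y_pred)
print(mse, mae, r2)
```

Lower is better for MSE and MAE, while higher is better for R2, which is why the Gap row has opposite signs for them.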

Compare model robustness

exp.model_compare(models=["XGB-External-2", "XGB-External-7"], show="robustness_perf", figsize=(5, 4))

Figure: Model Performance: Perturb on All Features
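The plot title suggests that robustness is assessed by perturbing all input features and re-measuring performance. The sketch below illustrates that general idea with Gaussian noise and a toy stand-in model. It is an assumption-laden illustration, not PiML's actual perturbation scheme: `predict`, `perturbed_mse`, the noise levels, and the data are all hypothetical.

```python
import random


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def predict(rows):
    # Hypothetical stand-in for a fitted model: y = 2 * x0 + 1.
    return [2.0 * row[0] + 1.0 for row in rows]


def perturbed_mse(rows, y_true, noise_std, seed=0):
    # Reseeding gives every noise level the same underlying random draws,
    # so only the scale of the perturbation changes between calls.
    rng = random.Random(seed)
    noisy = [[x + rng.gauss(0.0, 1.0) * noise_std for x in row] for row in rows]
    return mse(y_true, predict(noisy))


rows = [[i / 10.0] for i in range(50)]
y_true = predict(rows)  # Toy targets that the stand-in model fits exactly.

# Error at increasing perturbation sizes; 0.0 means unperturbed inputs.
results = {std: perturbed_mse(rows, y_true, std) for std in (0.0, 0.1, 0.2)}
for std, value in results.items():
    print(f"noise_std={std}: MSE={value:.4f}")
```

A model whose error grows slowly as the perturbation size increases would be considered more robust under this reading.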

Compare model resilience

exp.model_compare(models=["XGB-External-7", "XGB-External-2"], show="resilience_perf", figsize=(5, 4))
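One common reading of resilience is how much performance degrades on the hardest part of the test data. The sketch below computes MSE on the fraction of samples with the largest errors. This is an assumption about what a resilience comparison might measure, not a description of what `resilience_perf` computes; the function name and data are hypothetical.

```python
def worst_fraction_mse(y_true, y_pred, fraction):
    """MSE over the `fraction` of samples with the largest squared error."""
    squared_errors = sorted(
        ((t - p) ** 2 for t, p in zip(y_true, y_pred)), reverse=True
    )
    # Always keep at least one sample so the mean is defined.
    keep = max(1, int(round(fraction * len(squared_errors))))
    worst = squared_errors[:keep]
    return sum(worst) / len(worst)


# Made-up targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.0, 2.0, 3.0, 5.0, 8.0]

full = worst_fraction_mse(y_true, y_pred, 1.0)      # all samples
worst_40 = worst_fraction_mse(y_true, y_pred, 0.4)  # hardest 40% of samples
print(full, worst_40)
```

Under this reading, a model whose error rises less steeply as the fraction shrinks toward the worst samples would be the more resilient one.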
