piml.scored_test.test_resilience_perf

piml.scored_test.test_resilience_perf(x, y, prediction, prediction_proba=None, feature_names=None, feature_types=None, target_name=None, train_idx=None, test_idx=None, task_type=None, random_state=None, resilience_method=None, immu_feature=None, metric=None, figsize=None)

Get resilience test result in each step.

Parameters:
x: ndarray of shape (n_samples, n_features), default=None

The covariate data.

y: ndarray of shape (n_samples,), default=None

The actual response.

prediction: ndarray of shape (n_samples,), default=None

The model prediction for regression tasks, or the predicted class for binary classification tasks.

prediction_proba: ndarray of shape (n_samples,), default=None

The predicted probability of the positive class in binary classification tasks. Not needed for regression tasks.

task_type: {‘regression’, ‘classification’}, default=None

The task type.

feature_names: list, default=None

Feature names.

feature_types: list, default=None

Feature types. Each entry can be ‘numerical’ or ‘categorical’.

target_name: str, default=None

Target name.

train_idx: array-like of shape (n_samples_train,), default=None

Indices of the training samples. If train_idx and test_idx are both not None, the random train / test split is ignored.

test_idx: array-like of shape (n_samples_test,), default=None

Indices of the test samples. If train_idx and test_idx are both not None, the random train / test split is ignored.

random_state: int, default=None

Random seed for train / test split. If None, it will be 0.
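As an illustration of how a seed makes a train / test split reproducible, here is a minimal sketch in plain Python. It is not PiML's implementation: the helper `split_indices` and its `test_ratio` argument are hypothetical, and only the fallback of a missing seed to 0 is taken from the description above.

```python
import random

def split_indices(n_samples, test_ratio=0.2, random_state=None):
    """Return (train_idx, test_idx) from a seeded shuffle of range(n_samples)."""
    # Mirrors the documented fallback: a missing seed is treated as 0.
    seed = 0 if random_state is None else random_state
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # test_ratio is an assumption of this sketch, not a documented parameter.
    n_test = int(round(n_samples * test_ratio))
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])
    return train_idx, test_idx

# The same seed always yields the same split.
train_idx, test_idx = split_indices(10, test_ratio=0.3, random_state=None)
```

Index lists built this way have the shape expected by train_idx and test_idx.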

resilience_method: {‘worst-sample’, ‘hard-sample’, ‘outer-sample’, ‘worst-cluster’}, default=None

The method used for selecting the worst samples. If None, it will be ‘worst-sample’.

  • ‘worst-sample’: Select the worst samples according to the loss of each sample; the selected samples depend on the model.

  • ‘hard-sample’: Use a deep XGB model to distinguish hard samples from easy ones; the selected samples are the same for different models.

  • ‘outer-sample’: Use the Euclidean distance of each sample to the mean of X as a surrogate for worstness; the selected samples are the same for different models.

  • ‘worst-cluster’: Fit a K-means model on X, then select the worst-performing cluster as the worst samples; the selected samples depend on the model.
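The two simplest selection rules above, ‘worst-sample’ and ‘outer-sample’, can be sketched in plain Python as follows. This is an illustrative approximation under stated assumptions, not PiML's code: the function names, the `ratio` argument, and the use of squared error as the per-sample loss are all hypothetical choices made for the example.

```python
import math

def worst_sample_idx(y, prediction, ratio=0.1):
    """Indices of the highest-loss samples (squared error as per-sample loss)."""
    losses = [(yi - pi) ** 2 for yi, pi in zip(y, prediction)]
    # ratio is an assumption of this sketch: the fraction of samples to keep.
    n_worst = max(1, int(len(losses) * ratio))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:n_worst])

def outer_sample_idx(x, ratio=0.1):
    """Indices of the samples farthest (Euclidean) from the column-wise mean of x."""
    n, d = len(x), len(x[0])
    mean = [sum(row[j] for row in x) / n for j in range(d)]
    dist = [math.sqrt(sum((row[j] - mean[j]) ** 2 for j in range(d))) for row in x]
    n_worst = max(1, int(n * ratio))
    order = sorted(range(n), key=lambda i: dist[i], reverse=True)
    return sorted(order[:n_worst])

x = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [10.0, 10.0], [0.2, 0.8]]
y = [1.0, 2.0, 3.0, 4.0, 5.0]
prediction = [1.1, 2.0, 0.0, 4.2, 5.1]

worst = worst_sample_idx(y, prediction, ratio=0.4)  # needs the model's predictions
outer = outer_sample_idx(x, ratio=0.2)              # needs only the covariates
```

The sketch shows why the first rule depends on the model while the second does not: `worst_sample_idx` needs the model's predictions, whereas `outer_sample_idx` looks only at the covariates.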

immu_feature: str, default=None

The name of the immutable feature. If None, it will be an empty list.

metric: {‘MSE’, ‘MAE’, ‘R2’, ‘ACC’, ‘AUC’, ‘F1’, ‘LogLoss’, ‘Brier’}, default=None

Performance metric.

  • For classification tasks: ‘ACC’, ‘AUC’, ‘F1’, ‘LogLoss’, ‘Brier’.

  • For regression tasks: ‘MSE’, ‘MAE’, ‘R2’.
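For reference, several of the listed metrics can be written out directly from their standard definitions. The sketch below uses plain Python and hypothetical helper names. It illustrates the textbook formulas and is not taken from PiML, which may compute them differently (for example, through scikit-learn).

```python
import math

def mse(y, pred):
    """Mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)

def mae(y, pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, pred)) / len(y)

def r2(y, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def acc(y, pred_class):
    """Fraction of predicted classes that match the labels."""
    return sum(1 for a, b in zip(y, pred_class) if a == b) / len(y)

def brier(y, proba):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - a) ** 2 for a, p in zip(y, proba)) / len(y)

def log_loss(y, proba, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for a, p in zip(y, proba):
        p = min(max(p, eps), 1.0 - eps)
        total -= a * math.log(p) + (1 - a) * math.log(1.0 - p)
    return total / len(y)

# Regression metrics take y and prediction.
y_reg, pred_reg = [1.0, 2.0, 3.0], [1.0, 2.0, 5.0]
# Classification metrics take class labels, or positive-class probabilities.
y_cls, proba = [0, 1, 1, 0], [0.1, 0.9, 0.4, 0.2]
pred_cls = [1 if p >= 0.5 else 0 for p in proba]
```

ACC is computed from predicted classes (prediction), while Brier and LogLoss are computed from prediction_proba.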

figsize: tuple, default=None

Figure size of the plot. If None, it will be (8, 6).