piml.scored_test.test_weakspot
- piml.scored_test.test_weakspot(x, y, prediction, prediction_proba=None, feature_names=None, feature_types=None, target_name=None, train_idx=None, test_idx=None, task_type=None, random_state=None, slice_method=None, slice_features=None, metric=None, bins=None, threshold=None, min_samples=None, use_test=None, figsize=None)
Get the marginal WeakSpot result for the given slicing feature(s).
- Parameters:
- x: ndarray of shape (n_samples, n_features), default=None
The covariate data.
- y: ndarray of shape (n_samples,), default=None
The actual response.
- prediction: ndarray of shape (n_samples,), default=None
The model prediction for regression tasks, or the predicted class for binary classification tasks.
- prediction_proba: ndarray of shape (n_samples,), default=None
The predicted probability of the positive class in binary classification tasks. Not needed for regression tasks.
- task_type: {‘regression’, ‘classification’}, default=None
The task type.
- feature_names: list, default=None
Feature names.
- feature_types: list, default=None
Feature types; each entry is ‘numerical’ or ‘categorical’.
- target_name: str, default=None
Target name.
- train_idx: array-like of shape (n_samples_train,), default=None
Indices of the training samples. If train_idx and test_idx are both not None, the random train / test split is ignored.
- test_idx: array-like of shape (n_samples_test,), default=None
Indices of the test samples. If train_idx and test_idx are both not None, the random train / test split is ignored.
- random_state: int, default=None
Random seed for train / test split. If None, it will be 0.
- slice_features: list, default=None
List of slicing features (at most 2) for the WeakSpot test.
- slice_method: {‘histogram’, ‘tree’}, default=None
The slicing method for WeakSpot and Overfit tests. If None, it will be ‘histogram’.
‘histogram’: default, use equal-width binning;
‘tree’: fit a decision tree to generate regions.
- metric: {‘MSE’, ‘MAE’, ‘R2’, ‘ACC’, ‘AUC’, ‘F1’, ‘LogLoss’, ‘Brier’}, default=None
Performance metric.
For classification tasks: ‘ACC’, ‘AUC’, ‘F1’, ‘LogLoss’, ‘Brier’.
For regression tasks: ‘MSE’, ‘MAE’, ‘R2’.
- bins: int, default=None
The number of bins. If None, it will be 10.
- threshold: float, default=None
The minimal error gap for an overfit region. If None, it will be 1.1.
- min_samples: int, default=None
The minimal sample size for selected regions. If None, it will be 20.
- use_test: bool, default=None
Whether to use test data or not. If None, it will be False.
- figsize: tuple, default=None
Figure size of the plot. If None, it will be (8, 6).
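
To make the roles of bins, threshold and min_samples more concrete, the following is a minimal, dependency-free sketch of the ‘histogram’ slicing idea for a single numerical feature with the ‘MSE’ metric. It is not PiML's implementation: the function name histogram_weakspots and the returned dictionary keys are hypothetical, and it assumes that threshold acts as a ratio between a region's error and the overall error, which this page does not state explicitly.

```python
from statistics import fmean


def histogram_weakspots(feature, y, prediction, bins=10, threshold=1.1, min_samples=20):
    """Flag equal-width bins of one feature whose MSE exceeds threshold * overall MSE.

    Illustrative only; the ratio interpretation of `threshold` is an assumption.
    """
    sq_err = [(yi - pi) ** 2 for yi, pi in zip(y, prediction)]
    overall = fmean(sq_err)

    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    # Assign each sample's squared error to its equal-width bin.
    grouped = [[] for _ in range(bins)]
    for value, err in zip(feature, sq_err):
        idx = min(int((value - lo) / width), bins - 1)  # maximum goes in the last bin
        grouped[idx].append(err)

    weak = []
    for idx, errs in enumerate(grouped):
        if len(errs) < min_samples:
            continue  # region too small to be selected
        bin_mse = fmean(errs)
        if bin_mse > threshold * overall:
            weak.append({
                "lower": lo + idx * width,
                "upper": lo + (idx + 1) * width,
                "n": len(errs),
                "mse": bin_mse,
            })
    return weak


# Synthetic data: the model is accurate except where the feature is >= 0.9.
feature = [i / 100 for i in range(100)]
y = [0.0] * 100
prediction = [1.0 if v >= 0.9 else 0.1 for v in feature]

weak = histogram_weakspots(feature, y, prediction, bins=10, min_samples=5)
for region in weak:
    print(region)
```

On this synthetic data only the last bin is flagged, because its error is far above the overall error while every other bin sits below it. The ‘tree’ slicing method would replace the fixed equal-width bins with regions found by a fitted decision tree.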