API Reference

This is the class and function reference of PiML. Please refer to the user guide for further details, since the raw class and function specifications may not fully explain how to use them.

Data Pipeline

Functions

`Experiment.data_loader`([data, silent, ...])	Load data for experimentation.
`Experiment.data_summary`([feature_type, ...])	Summarize basic data statistics.
`Experiment.data_prepare`([target, ...])	Prepare data for model fitting.
`Experiment.data_quality`([method, dataset, ...])	Check the data quality and remove outliers.
`Experiment.eda`([show, uni_feature, ...])	Run exploratory data analysis.
`Experiment.feature_select`([method, ...])	Select features that are important for modeling.
`Experiment.get_data`([x, y, sample_weight, ...])	Get the preprocessed train-test data.
`Experiment.get_raw_data`()	Get the raw train-test data.
`Experiment.get_feature_names`()	Get the input feature names.
`Experiment.get_feature_types`()	Get the data type of each input feature.
`Experiment.get_target_name`()	Get the target feature name.

Outlier Detection Algorithms

`data.outlier_detection.PCA`	A wrapper of sklearn's PCA for outlier detection.
`data.outlier_detection.CBLOF`	Cluster-based local outlier factor for outlier detection.
`data.outlier_detection.KMeansTree`	Recursive unsupervised splitting tree via KMeans (K=2).
`data.outlier_detection.IsolationForest`	A wrapper of sklearn's Isolation Forest for outlier detection.
`data.outlier_detection.OneClassSVM`	A wrapper of sklearn's OneClassSVM for outlier detection.
`data.outlier_detection.KNN`	A wrapper of sklearn's K-Nearest Neighbors for outlier detection.
`data.outlier_detection.HBOS`	A wrapper of PyOD's Histogram-Based Outlier Score (HBOS) for outlier detection.
`data.outlier_detection.ECOD`	A wrapper of PyOD's Empirical Cumulative Distribution-based Outlier Detection (ECOD) for unsupervised outlier detection.
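
To make the idea behind a nearest-neighbor outlier detector concrete, here is a minimal sketch in plain Python. It is not PiML's or sklearn's implementation: it assumes the common convention of scoring each point by the distance to its k-th nearest neighbor, and the function name `knn_outlier_scores` and the sample points are hypothetical.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Illustrative only: a larger score means the point is farther from its
    neighbors and therefore more likely to be an outlier.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Four clustered points and one distant point (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(points, k=2)

# Index of the point with the largest score, i.e. the most isolated one.
outlier_index = max(range(len(points)), key=scores.__getitem__)
```

In this toy data the distant point at index 4 receives the largest score. A real detector would typically turn the scores into outlier flags with a threshold or a contamination rate.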

Model Training

`Experiment.model_train`([model, name])	Fit interpretable models.
`Experiment.model_tune`([model, method, ...])	Refit a model with new parameters.
`Experiment.model_save`(model[, path])	Save a PiML-trained model as a pickle file.
`Experiment.model_interpret`([model, show, ...])	Interpret inherently interpretable models.
`Experiment.make_pipeline`([model, task_type, ...])	Customize a pipeline.
`Experiment.register`(pipeline, name)	Register a pipeline.
`Experiment.get_model_list`()	Get the list of names of all registered models.
`Experiment.get_interpretable_model_list`()	Get the list of names of all registered interpretable models.
`Experiment.get_model`(model)	Get a registered pipeline.
`Experiment.get_model_config`(model)	Get the configuration of a model.
`Experiment.get_leaderboard`([metric])	Show the performance comparison table of all trained models.
`Experiment.get_leaderboard_registered`([metric])	Show the performance comparison table of all registered models.
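
The functions listed so far are usually called in sequence on a single `Experiment` object. The sketch below shows one plausible order, assuming PiML is installed. The dataset name `"BikeSharing"`, the target `"cnt"`, and the `task_type` and `silent` arguments to `data_prepare` are assumptions drawn from typical PiML usage and are not confirmed by this reference. The wrapper function `run_basic_workflow` is hypothetical.

```python
def run_basic_workflow():
    """Load data, prepare it, fit one model, and list the registered models."""
    # Imported lazily so the sketch can be loaded without PiML installed.
    from piml import Experiment
    from piml.models import GLMRegressor

    exp = Experiment()

    # Assumption: "BikeSharing" is a built-in dataset name.
    exp.data_loader(data="BikeSharing", silent=True)

    # Assumption: "cnt" is the target column and this is a regression task.
    exp.data_prepare(target="cnt", task_type="regression", silent=True)

    # Fit an interpretable model and register it under the name "GLM".
    exp.model_train(model=GLMRegressor(), name="GLM")

    return exp.get_model_list()
```

Calling `run_basic_workflow()` should return the names of the registered models. Check the user guide for the exact arguments your PiML version accepts.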

Post-hoc Explainability

`Experiment.model_explain`([model, show, ...])	Explain an arbitrary fitted model using post-hoc explanation tools.

Interpretable Models

`models.GLMRegressor`	A wrapper of generalized linear model regressor in scikit-learn.
`models.GLMClassifier`	A wrapper of generalized linear model classifier in scikit-learn.
`models.GAMRegressor`	A wrapper of generalized additive model regressor in pygam.
`models.GAMClassifier`	A wrapper of generalized additive model classifier in pygam.
`models.TreeRegressor`	A wrapper of the decision tree regressor in scikit-learn.
`models.TreeClassifier`	A wrapper of the decision tree classifier in scikit-learn.
`models.FIGSRegressor`	Fast interpretable greedy-tree sums regressor.
`models.FIGSClassifier`	Fast interpretable greedy-tree sums classifier.
`models.XGB1Classifier`	Depth-1 XGBoostClassifier with optimal binning.
`models.XGB1Regressor`	Depth-1 XGBoostRegressor with optimal binning.
`models.XGB2Classifier`	Depth-2 XGBoostClassifier.
`models.XGB2Regressor`	Depth-2 XGBoostRegressor.
`models.ExplainableBoostingRegressor`	An Explainable Boosting Regressor based on interpret==0.4.2
`models.ExplainableBoostingClassifier`	An Explainable Boosting Classifier based on interpret==0.4.2
`models.GAMINetRegressor`	Generalized additive model with pairwise interaction regressor.
`models.GAMINetClassifier`	Generalized additive model with pairwise interaction classifier.
`models.ReluDNNRegressor`	Multi-layer perceptron regressor with ReLU activation function.
`models.ReluDNNClassifier`	Multi-layer perceptron classifier with ReLU activation function.

Outcome Testing

Integrated Functions

`Experiment.model_diagnose`([model, show, ...])	Test model performance using various diagnostic tools.
`Experiment.model_compare`([models, show, ...])	Compare the diagnostic results of multiple models.
`Experiment.model_fairness`([model, show, ...])	Test model fairness.
`Experiment.model_fairness_compare`([models, ...])	Compare the fairness results of multiple models.
`Experiment.model_fairness_solas`([model, ...])	Test model fairness based on solas-ai.
`Experiment.segmented_diagnose`([show, model, ...])	Test model performance using various diagnostic tools after bucketing.

Scored Test Functions

`test_accuracy_table`	Get accuracy result.
`test_accuracy_residual`	Get marginal residual plot based on a given feature.
`test_accuracy_plot`	Plot the confusion matrix, ROC curve, and precision-recall curve; supports classifiers only.
`test_weakspot`	Get marginal weakspot result based on a given feature.
`test_overfit`	Get marginal overfit result based on a given feature.
`test_reliability_table`	Get empirical coverage and average bandwidth for regression or Brier Loss for classification.
`test_reliability_distance`	Compare data distance between reliable and unreliable samples.
`test_reliability_marginal`	Get marginal slicing reliability result based on a given feature.
`test_reliability_perf`	Get the reliability diagram; supports classifiers only.
`test_reliability_calibration`	Get the calibrated predicted probability vs.
`test_resilience_perf`	Get resilience test result in each step.
`test_resilience_distance`	Compare data distance between samples in the worst region and the remaining region.
`test_resilience_shift_histogram`	Compare marginal distribution histogram between the worst region and remaining region.
`test_resilience_shift_density`	Compare marginal distribution density between the worst region and remaining region.
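
The `test_reliability_table` entry above mentions empirical coverage and average bandwidth for regression, and Brier loss for classification. The following plain-Python sketch shows how these three quantities are commonly defined. It is not PiML's implementation, and the function names and sample values are hypothetical.

```python
def empirical_coverage(y_true, lower, upper):
    """Fraction of targets that fall inside their prediction interval."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


def average_bandwidth(lower, upper):
    """Mean width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


def brier_loss(y_true, prob):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, prob)) / len(y_true)


# Regression: made-up targets with lower and upper interval bounds.
y = [1.0, 2.0, 3.0, 4.0]
lo = [0.5, 1.5, 3.5, 3.0]
hi = [1.5, 2.5, 4.5, 5.0]
coverage = empirical_coverage(y, lo, hi)  # 3 of 4 targets are covered
bandwidth = average_bandwidth(lo, hi)     # (1 + 1 + 1 + 2) / 4

# Binary classification: made-up labels and predicted probabilities.
labels = [1, 0, 1, 0]
probs = [1.0, 0.0, 0.5, 0.5]
brier = brier_loss(labels, probs)         # (0 + 0 + 0.25 + 0.25) / 4
```

Higher coverage with narrower bandwidth indicates tighter and more reliable intervals, and a lower Brier loss indicates better-calibrated probabilities.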