Feature Selection
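This example demonstrates four built-in feature selection strategies (correlation, distance correlation, permutation feature importance, and randomized conditional independence test), using the BikeSharing dataset.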
Experiment initialization and data preparation
from piml import Experiment
exp = Experiment()
exp.data_loader(data="BikeSharing", silent=True)
exp.data_prepare(target="cnt", task_type="regression", silent=True)
Feature selection using the Pearson correlation strategy
exp.feature_select(method="cor", corr_algorithm="pearson", threshold=0.1, figsize=(5, 4))
Figure: Pearson Correlation (Top10)
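To make the idea behind correlation-based screening concrete, here is a minimal NumPy sketch. It is not PiML's implementation: the helper `pearson_screen`, the feature names `linear` and `even`, and the synthetic data are all hypothetical, and the assumption is that a feature is kept when the absolute value of its Pearson correlation with the target reaches the threshold.

```python
import numpy as np

def pearson_screen(X, y, names, threshold=0.1):
    """Return {name: r} for columns of X with |Pearson r| >= threshold.

    Hypothetical helper for illustration; not part of PiML.
    """
    Xc = X - X.mean(axis=0)          # center each feature column
    yc = y - y.mean()                # center the target
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt(
        (Xc ** 2).sum(axis=0) * (yc ** 2).sum()
    )
    return {name: float(v) for name, v in zip(names, r) if abs(v) >= threshold}

# Synthetic data: the target depends linearly on t, while t**2 is
# uncorrelated with it on a grid that is symmetric around zero.
t = np.linspace(-1.0, 1.0, 201)
X = np.column_stack([t, t ** 2])
y = 3.0 * t

selected = pearson_screen(X, y, ["linear", "even"], threshold=0.1)
print(selected)
```

In this toy setup only `linear` passes the screen, because Pearson correlation measures linear association only.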
Feature selection using the Spearman correlation strategy
exp.feature_select(method="cor", corr_algorithm="spearman", threshold=0.1, figsize=(5, 4))
Figure: Spearman Correlation (Top10)
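Spearman correlation is the Pearson correlation computed on ranks, so it rewards any monotone relationship rather than only a linear one. The sketch below illustrates that difference on synthetic data; the helpers `ranks` and `spearman` are hypothetical, assume there are no tied values, and are not taken from PiML.

```python
import numpy as np

def ranks(v):
    """Ranks 0..n-1 of v. Assumes no ties (ties would need average ranks)."""
    order = np.argsort(v)
    r = np.empty(len(v), dtype=float)
    r[order] = np.arange(len(v))
    return r

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])

# A monotone but strongly nonlinear relationship.
x = np.linspace(0.0, 4.0, 50)
y = np.exp(2.0 * x)

pearson_r = float(np.corrcoef(x, y)[0, 1])
spearman_r = spearman(x, y)
print(f"pearson={pearson_r:.3f}, spearman={spearman_r:.3f}")
```

Because `y` increases whenever `x` does, the rank correlation is perfect even though the Pearson value is noticeably lower.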
Feature selection using the distance correlation strategy
exp.feature_select(method="dcor", threshold=0.1, figsize=(5, 4))
Figure: Distance Correlation (Top10)
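Distance correlation can detect dependence that is neither linear nor monotone. The following is a minimal sketch of the standard sample (V-statistic) estimator for one-dimensional inputs, written with NumPy only. The function name `distance_correlation` and the synthetic data are assumptions for illustration, and the code is not PiML's implementation.

```python
import numpy as np

def _double_centered(v):
    """Pairwise |v_i - v_j| matrix with row, column and grand means removed."""
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (value in [0, 1])."""
    A = _double_centered(np.asarray(x, dtype=float))
    B = _double_centered(np.asarray(y, dtype=float))
    dcov2 = max((A * B).mean(), 0.0)   # guard against tiny negative round-off
    dvar_x = (A * A).mean()
    dvar_y = (B * B).mean()
    return float(np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y)))

# y is fully determined by t, yet their Pearson correlation is zero
# because the relationship is symmetric around t = 0.
t = np.linspace(-1.0, 1.0, 101)
y = t ** 2

pearson_r = float(np.corrcoef(t, y)[0, 1])
dcor = distance_correlation(t, y)
print(f"pearson={pearson_r:.3f}, dcor={dcor:.3f}")
```

With a threshold of 0.1, a Pearson screen would discard `t` here, whereas the distance correlation is clearly above the threshold.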
Feature selection using the permutation feature importance strategy
exp.feature_select(method="pfi", threshold=0.95, figsize=(5, 4))
Figure: XGB-based Feature Importance (Top10)
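The sketch below illustrates the general permutation-importance idea: shuffle one feature at a time and measure how much the prediction error grows. It rests on two assumptions that the example above does not confirm. First, it swaps the XGB model named in the plot title for an ordinary least-squares fit so the code stays dependency-free. Second, it reads the threshold as a cumulative share of total importance. All function and feature names are hypothetical.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in MSE when each column is shuffled (clipped at zero)."""
    rng = np.random.default_rng(seed)
    base = np.mean((y - predict(X)) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])   # break the link to y
            scores[j] += np.mean((y - predict(Xp)) ** 2) - base
    return np.clip(scores / n_repeats, 0.0, None)

def select_by_cumulative_share(importance, names, threshold=0.95):
    """Keep the top features until their importance share reaches threshold."""
    share = importance / importance.sum()
    order = np.argsort(share)[::-1]
    cumulative = np.cumsum(share[order])
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return [names[i] for i in order[:k]]

# Synthetic data: one strong feature, one weak feature, one pure-noise feature.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 3))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=400)

# Stand-in model: linear least squares instead of a gradient-boosted model.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
importance = permutation_importance(lambda Z: Z @ beta, X, y)

selected = select_by_cumulative_share(importance, ["strong", "weak", "noise"])
print(selected)
```

Under these assumptions the strong feature alone accounts for roughly four fifths of the total importance, so the weak feature is also needed to reach 0.95, while the noise feature is left out.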
Feature selection using the randomized conditional independence test (RCIT) strategy
exp.feature_select(method="rcit", threshold=0.001, n_forward_phase=2, kernel_size=100, figsize=(5, 4))
Figure: RCIT
Feature selection using the randomized conditional independence test (RCIT) strategy, with a non-empty initial Markov boundary (the preset features hr and temp)
exp.feature_select(method="rcit", threshold=0.001, preset=["hr", "temp"], figsize=(5, 4))
Figure: RCIT
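The role of a preset in forward selection can be illustrated with a much simpler stand-in. The sketch below is not RCIT: it replaces the randomized kernel-based test with a linear partial-correlation test (Fisher z), which is only valid under roughly linear-Gaussian assumptions. It treats the threshold as a p-value cutoff, which is also an assumption. The functions `ci_pvalue` and `forward_select` and the synthetic feature names are hypothetical.

```python
import math
import numpy as np

def ci_pvalue(x, y, Z):
    """p-value for 'x independent of y given Z' via partial correlation."""
    n = len(y)
    D = np.column_stack([np.ones(n), Z])            # intercept + conditioning set
    rx = x - D @ np.linalg.lstsq(D, x, rcond=None)[0]
    ry = y - D @ np.linalg.lstsq(D, y, rcond=None)[0]
    r = float(rx @ ry / math.sqrt((rx @ rx) * (ry @ ry)))
    r = min(max(r, -0.999999), 0.999999)            # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - Z.shape[1] - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))       # two-sided normal p-value

def forward_select(X, y, names, threshold=0.001, preset=()):
    """Greedily add the most dependent feature until none passes the test."""
    selected = [names.index(p) for p in preset]     # start from the preset
    remaining = [j for j in range(X.shape[1]) if j not in selected]
    while remaining:
        pvals = {j: ci_pvalue(X[:, j], y, X[:, selected]) for j in remaining}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= threshold:
            break
        selected.append(best)
        remaining.remove(best)
    return [names[j] for j in selected]

# Synthetic data: y depends on cause_a and cause_b; proxy is a noisy copy
# of cause_a and carries no extra information once cause_a is known.
rng = np.random.default_rng(0)
n = 500
cause_a, cause_b, other = rng.normal(size=(3, n))
proxy = cause_a + 0.5 * rng.normal(size=n)
y = cause_a + cause_b + 0.1 * rng.normal(size=n)
names = ["cause_a", "cause_b", "proxy", "other"]
X = np.column_stack([cause_a, cause_b, proxy, other])

from_empty = forward_select(X, y, names)
from_preset = forward_select(X, y, names, preset=["cause_a"])
print(from_empty, from_preset)
```

Starting from a preset means the first conditional test already conditions on the preset features, so a redundant proxy of a preset feature has little chance of entering the selection.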
Total running time of the script: (1 minute 6.865 seconds)
Estimated memory usage: 973 MB