Data Preparation
This example displays the data preparation results for the BikeSharing dataset under four split methods: random, outer-sample-based, KMeans-based, and custom.
Experiment initialization and data preparation
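import numpy as np
from piml import Experiment

# Create an experiment and load the built-in BikeSharing dataset (silent=True suppresses the loader's output)
exp = Experiment()
exp.data_loader(data="BikeSharing", silent=True)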
Random split
exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='random', test_ratio=0.2, random_state=0)
   Config            Value
0  Excluded columns  []
1  Target variable   cnt
2  Sample weight     None
3  Task type         regression
4  Split method      random
5  Test ratio        0.2
6  Random state      0
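To make the configuration above concrete, here is a minimal sketch of what a seeded random split with a test ratio of 0.2 amounts to. It is not PiML's implementation: random_split is a hypothetical helper, the row count of 17379 is an assumption about the BikeSharing data, and the standard-library generator used here will not pick the same rows as PiML.

```python
import random


def random_split(n_samples, test_ratio=0.2, random_state=0):
    """Hypothetical helper: shuffle row indices and hold out a fraction for testing."""
    indices = list(range(n_samples))
    # A seeded generator makes the shuffle, and therefore the split, reproducible.
    random.Random(random_state).shuffle(indices)
    n_test = int(round(n_samples * test_ratio))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# Assumption: 17379 rows in the BikeSharing data.
train_idx, test_idx = random_split(17379, test_ratio=0.2, random_state=0)
print(len(train_idx), len(test_idx))
```

Calling the helper again with the same random_state returns the same two index lists, which is the role the Random state entry plays in the table above.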
Outer-sample-based split
exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='outer-sample', test_ratio=0.2, random_state=0)
   Config            Value
0  Excluded columns  []
1  Target variable   cnt
2  Sample weight     None
3  Task type         regression
4  Split method      outer-sample
5  Test ratio        0.2
6  Random state      0
KMeans-based split
exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='kmeans', test_ratio=[0.0, 1.0, 0.0], random_state=0)
   Config            Value
0  Excluded columns  []
1  Target variable   cnt
2  Sample weight     None
3  Task type         regression
4  Split method      kmeans
5  Test ratio        0.420277
6  Random state      0
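The reported test ratio (0.420277) matches none of the entries in the list passed as test_ratio. One reading, assumed here rather than confirmed by this page, is that each list entry is the fraction of one cluster sent to the test set, so the overall ratio depends on the cluster sizes. The sketch below illustrates that arithmetic. The helper overall_test_ratio is hypothetical, and the cluster sizes are made up, chosen only so that they sum to an assumed 17379 rows and the fully held-out cluster reproduces the reported figure.

```python
def overall_test_ratio(cluster_sizes, cluster_test_ratios):
    """Hypothetical helper: combine per-cluster test fractions into one overall ratio."""
    n_test = sum(size * ratio for size, ratio in zip(cluster_sizes, cluster_test_ratios))
    return n_test / sum(cluster_sizes)


# Made-up cluster sizes for illustration only (not taken from PiML's output).
cluster_sizes = [5000, 7304, 5075]

# Hold out the second cluster entirely and keep the other two for training.
ratio = overall_test_ratio(cluster_sizes, [0.0, 1.0, 0.0])
print(round(ratio, 6))
```

Under this reading, holding out a whole cluster tests the model on a region of the feature space it never saw during training.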
Custom split
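The call for this split is not shown in this section. As a rough sketch only, a manual split can be described by two explicit sets of row indices. The row count (17379), the cut point (16000), and the train_idx / test_idx parameter names in the commented call are all assumptions, chosen because they are consistent with the test ratio reported in the table that follows.

```python
# Assumption: 17379 rows in the BikeSharing data.
n_samples = 17379
# Assumed cut point: the first 16000 rows train, the rest test.
n_train = 16000

custom_train_idx = list(range(0, n_train))
custom_test_idx = list(range(n_train, n_samples))

# Fraction of rows that end up in the test set.
test_ratio = len(custom_test_idx) / n_samples
print(round(test_ratio, 6))

# Assumed PiML call (parameter names not confirmed by this page; PiML may
# expect NumPy arrays, e.g. np.arange(0, 16000), rather than lists):
# exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
#                  train_idx=custom_train_idx, test_idx=custom_test_idx)
```

The import of numpy at the top of the example, which is otherwise unused, suggests the original index arrays were built with NumPy.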
   Config            Value
0  Excluded columns  []
1  Target variable   cnt
2  Sample weight     None
3  Task type         regression
4  Split method      manual
5  Test ratio        0.079349
6  Random state      0
Total running time of the script: (0 minutes 35.172 seconds)
Estimated memory usage: 21 MB