Data Preparation

Display the data preparation results, using the BikeSharing dataset as an example.

Experiment initialization and data loading

import numpy as np
from piml import Experiment

exp = Experiment()
exp.data_loader(data="BikeSharing", silent=True)

Random split

exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='random', test_ratio=0.2, random_state=0)
             Config       Value
0  Excluded columns          []
1   Target variable         cnt
2     Sample weight        None
3         Task type  regression
4      Split method      random
5        Test ratio         0.2
6      Random state           0
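As a rough illustration of what a random split with a fixed seed does, the sketch below shuffles row indices and holds out a fraction of them. It is not PiML's internal implementation: `random_split_indices` is a hypothetical helper, and the row count of 17379 is taken from the custom split indices used later in this example.

```python
import numpy as np

def random_split_indices(n_samples, test_ratio=0.2, random_state=0):
    """Hypothetical helper: shuffle row indices and hold out a test fraction."""
    rng = np.random.default_rng(random_state)
    perm = rng.permutation(n_samples)           # shuffled row indices
    n_test = int(round(n_samples * test_ratio)) # size of the held-out set
    test_idx = np.sort(perm[:n_test])
    train_idx = np.sort(perm[n_test:])
    return train_idx, test_idx

# 17379 rows is an assumption based on the custom split indices in this example.
train_idx, test_idx = random_split_indices(17379, test_ratio=0.2, random_state=0)
print(len(train_idx), len(test_idx))
```

Fixing `random_state` makes the split reproducible, which is why the seed appears in the configuration summary.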

Outer-sample-based split

exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='outer-sample', test_ratio=0.2, random_state=0)
             Config         Value
0  Excluded columns            []
1   Target variable           cnt
2     Sample weight          None
3         Task type    regression
4      Split method  outer-sample
5        Test ratio           0.2
6      Random state             0

KMeans-based split

exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='kmeans', test_ratio=[0.0, 1.0, 0.0], random_state=0)
             Config       Value
0  Excluded columns          []
1   Target variable         cnt
2     Sample weight        None
3         Task type  regression
4      Split method      kmeans
5        Test ratio    0.420277
6      Random state           0
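The reported test ratio (0.420277) differs from any single entry in the `test_ratio` list, which is consistent with each entry being applied to one cluster. The sketch below illustrates that reading under stated assumptions: entry `i` of the list is the fraction of cluster `i` sent to the test set, and cluster labels are already available. `split_by_cluster` is a hypothetical helper, not part of PiML, and the labels are toy data.

```python
import numpy as np

def split_by_cluster(labels, test_ratio, random_state=0):
    """Hypothetical helper: hold out test_ratio[i] of the rows in cluster i."""
    rng = np.random.default_rng(random_state)
    labels = np.asarray(labels)
    test_mask = np.zeros(labels.shape[0], dtype=bool)
    for cluster_id, ratio in enumerate(test_ratio):
        members = np.flatnonzero(labels == cluster_id)  # rows in this cluster
        n_test = int(round(len(members) * ratio))
        chosen = rng.choice(members, size=n_test, replace=False)
        test_mask[chosen] = True
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Toy cluster labels (assumed; in practice they would come from a KMeans fit).
labels = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 1])
train_idx, test_idx = split_by_cluster(labels, [0.0, 1.0, 0.0])
print(test_idx)                     # rows of cluster 1: [2 3 4 9]
print(len(test_idx) / len(labels))  # overall test ratio: 0.4
```

Under this reading, `[0.0, 1.0, 0.0]` sends one whole cluster to the test set, so the overall test ratio equals that cluster's share of the data.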

Custom split

custom_train_idx = np.arange(0, 16000)
custom_test_idx = np.arange(16000, 17379)
exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 train_idx=custom_train_idx, test_idx=custom_test_idx)
             Config       Value
0  Excluded columns          []
1   Target variable         cnt
2     Sample weight        None
3         Task type  regression
4      Split method      manual
5        Test ratio    0.079349
6      Random state           0
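Before passing custom indices, it can help to confirm that the two index sets do not overlap and to see where the reported test ratio comes from. The check below uses only NumPy and the index arrays defined above; it does not call PiML.

```python
import numpy as np

custom_train_idx = np.arange(0, 16000)
custom_test_idx = np.arange(16000, 17379)

# The train and test index sets should not share any rows.
overlap = np.intersect1d(custom_train_idx, custom_test_idx)
assert overlap.size == 0

# Test ratio = test rows / all rows = 1379 / 17379.
n_total = custom_train_idx.size + custom_test_idx.size
test_ratio = custom_test_idx.size / n_total
print(round(test_ratio, 6))  # 0.079349
```

The result matches the "Test ratio" row in the summary above.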
