Data Preparation

Display the result of data preparation, using the BikeSharing dataset as an example.

Experiment initialization and data loading

import numpy as np
from piml import Experiment

exp = Experiment()
exp.data_loader(data="BikeSharing", silent=True)

Random split

exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='random', test_ratio=0.2, random_state=0)

             Config       Value
Excluded columns          []
 Target variable         cnt
   Sample weight        None
       Task type  regression
    Split method      random
      Test ratio         0.2
    Random state           0
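To illustrate what a seeded random split with test_ratio=0.2 amounts to, here is a minimal NumPy sketch. It is not PiML's internal implementation: the helper name random_split and the rounding rule are assumptions made for illustration, and the row count 17379 is taken from the custom split example on this page.

```python
import numpy as np

def random_split(n_samples, test_ratio=0.2, random_state=0):
    """Hypothetical helper: split row indices with a seeded shuffle."""
    rng = np.random.default_rng(random_state)
    shuffled = rng.permutation(n_samples)
    # Assumption: the test size is the rounded product of n_samples and test_ratio.
    n_test = int(round(n_samples * test_ratio))
    test_idx = np.sort(shuffled[:n_test])
    train_idx = np.sort(shuffled[n_test:])
    return train_idx, test_idx

train_idx, test_idx = random_split(17379, test_ratio=0.2, random_state=0)
print(len(train_idx), len(test_idx))  # 13903 3476
```

Fixing random_state makes the split reproducible: calling the helper again with the same seed returns the same indices.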

Outer-sample-based split

exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='outer-sample', test_ratio=0.2, random_state=0)

             Config         Value
Excluded columns            []
 Target variable           cnt
   Sample weight          None
       Task type    regression
    Split method  outer-sample
      Test ratio           0.2
    Random state             0

KMeans-based split

exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 split_method='kmeans', test_ratio=[0.0, 1.0, 0.0], random_state=0)

             Config       Value
Excluded columns          []
 Target variable         cnt
   Sample weight        None
       Task type  regression
    Split method      kmeans
      Test ratio    0.420277
    Random state           0
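The reported test ratio of 0.420277 would be consistent with reading the list [0.0, 1.0, 0.0] as one test ratio per cluster, so that every sample in the second cluster goes to the test set. This reading is an assumption, not something this page confirms. Under that assumption, the idea can be sketched in NumPy as follows; split_by_cluster and the toy labels array are hypothetical, and in practice the labels would come from a clustering step.

```python
import numpy as np

def split_by_cluster(labels, cluster_test_ratios, random_state=0):
    """Hypothetical helper: send a given fraction of each cluster to the test set."""
    rng = np.random.default_rng(random_state)
    test_parts = []
    for cluster, ratio in enumerate(cluster_test_ratios):
        members = np.flatnonzero(labels == cluster)
        n_test = int(round(len(members) * ratio))
        # Pick n_test members of this cluster at random.
        test_parts.append(rng.permutation(members)[:n_test])
    test_idx = np.sort(np.concatenate(test_parts))
    train_idx = np.setdiff1d(np.arange(len(labels)), test_idx)
    return train_idx, test_idx

# Toy cluster labels for 10 rows (illustrative only, not from BikeSharing).
labels = np.array([0, 1, 1, 2, 0, 1, 2, 1, 0, 2])
train_idx, test_idx = split_by_cluster(labels, [0.0, 1.0, 0.0])
print(test_idx)                      # [1 2 5 7]
print(len(test_idx) / len(labels))   # 0.4
```

In the toy data, cluster 1 holds 4 of the 10 rows, so the overall test ratio is 0.4 even though no global ratio was specified.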

Custom split

custom_train_idx = np.arange(0, 16000)
custom_test_idx = np.arange(16000, 17379)
exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                 train_idx=custom_train_idx, test_idx=custom_test_idx)

             Config       Value
Excluded columns          []
 Target variable         cnt
   Sample weight        None
       Task type  regression
    Split method      manual
      Test ratio    0.079349
    Random state           0
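The test ratio of 0.079349 reported above follows directly from the sizes of the two index arrays: 1379 test rows out of 17379 in total. A short check, using only NumPy and the same index arrays, also confirms that the two sets do not overlap:

```python
import numpy as np

custom_train_idx = np.arange(0, 16000)
custom_test_idx = np.arange(16000, 17379)

n_train = len(custom_train_idx)
n_test = len(custom_test_idx)
# Indices shared by both sets; an empty result means the split is disjoint.
overlap = np.intersect1d(custom_train_idx, custom_test_idx)
test_ratio = n_test / (n_train + n_test)

print(n_train, n_test, len(overlap), round(test_ratio, 6))  # 16000 1379 0 0.079349
```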

Total running time of the script: (0 minutes 35.172 seconds)

Estimated memory usage: 21 MB
