
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples\0_data\plot_3_data_prepare.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_0_data_plot_3_data_prepare.py>`
        to download the full example code or to run this example in your browser via Binder

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_0_data_plot_3_data_prepare.py:


Data Preparation
=====================================

Display the data preparation results, using the BikeSharing dataset as an example.

.. GENERATED FROM PYTHON SOURCE LINES 10-11

Experiment initialization and data loading

.. GENERATED FROM PYTHON SOURCE LINES 11-17

.. code-block:: default

    import numpy as np
    from piml import Experiment

    exp = Experiment()
    exp.data_loader(data="BikeSharing", silent=True)


.. GENERATED FROM PYTHON SOURCE LINES 18-19

Random split

.. GENERATED FROM PYTHON SOURCE LINES 19-22

.. code-block:: default

    exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                     split_method='random', test_ratio=0.2, random_state=0)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 Config       Value
    0  Excluded columns          []
    1   Target variable         cnt
    2     Sample weight        None
    3         Task type  regression
    4      Split method      random
    5        Test ratio         0.2
    6      Random state           0


.. GENERATED FROM PYTHON SOURCE LINES 23-24

Outer-sample-based split

.. GENERATED FROM PYTHON SOURCE LINES 24-27

.. code-block:: default

    exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                     split_method='outer-sample', test_ratio=0.2, random_state=0)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 Config         Value
    0  Excluded columns            []
    1   Target variable           cnt
    2     Sample weight          None
    3         Task type    regression
    4      Split method  outer-sample
    5        Test ratio           0.2
    6      Random state             0


.. GENERATED FROM PYTHON SOURCE LINES 28-29

KMeans-based split

.. GENERATED FROM PYTHON SOURCE LINES 29-32

.. code-block:: default

    exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                     split_method='kmeans', test_ratio=[0.0, 1.0, 0.0], random_state=0)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 Config       Value
    0  Excluded columns          []
    1   Target variable         cnt
    2     Sample weight        None
    3         Task type  regression
    4      Split method      kmeans
    5        Test ratio    0.420277
    6      Random state           0


.. GENERATED FROM PYTHON SOURCE LINES 33-34

Custom split

.. GENERATED FROM PYTHON SOURCE LINES 34-37

.. code-block:: default

    # Rows 0..15999 go to training; rows 16000..17378 go to testing.
    custom_train_idx = np.arange(0, 16000)
    custom_test_idx = np.arange(16000, 17379)
    exp.data_prepare(target='cnt', task_type='regression', sample_weight=None,
                     train_idx=custom_train_idx, test_idx=custom_test_idx)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 Config       Value
    0  Excluded columns          []
    1   Target variable         cnt
    2     Sample weight        None
    3         Task type  regression
    4      Split method      manual
    5        Test ratio    0.079349
    6      Random state           0


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  35.172 seconds)

**Estimated memory usage:**  21 MB


.. _sphx_glr_download_auto_examples_0_data_plot_3_data_prepare.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/selfexplainml/piml-toolbox/main?urlpath=lab/tree/./docs/_build/html/notebooks/auto_examples/0_data/plot_3_data_prepare.ipynb
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_3_data_prepare.py <plot_3_data_prepare.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_3_data_prepare.ipynb <plot_3_data_prepare.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
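
As a quick check on the custom split, the reported test ratio can be reproduced from the two index arrays alone. The sketch below uses only NumPy and assumes the ratio is the number of test rows divided by the total number of rows; that definition is an assumption for illustration and is not taken from the PiML source.

```python
import numpy as np

# Same index arrays as in the custom split example.
custom_train_idx = np.arange(0, 16000)
custom_test_idx = np.arange(16000, 17379)

n_train = custom_train_idx.size
n_test = custom_test_idx.size

# Assumed definition: test rows divided by all rows covered by the split.
test_ratio = n_test / (n_train + n_test)

# A manual split should not place the same row in both subsets.
overlap = np.intersect1d(custom_train_idx, custom_test_idx)

print(f"train rows: {n_train}, test rows: {n_test}")
print(f"overlapping rows: {overlap.size}")
print(f"test ratio: {test_ratio:.6f}")  # 0.079349
```

Under that assumption the result agrees with the ``Test ratio`` row printed for the manual split (0.079349).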