Data Splitter module

cherrypick.splits.splitter(df, target: str, test_size: float) → tuple[tuple, tuple]

Split dataset into training and testing sets.

Parameters:

df (pandas.DataFrame) – Input dataset containing features and target variable.
target (str) – Column name of the target variable.
test_size (float) – Proportion of the dataset to include in the test split. Must be between 0.0 and 1.0.

Examples

>>> train, test = splitter(df=df, target=target_column, test_size=0.25)
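
The call above can be made concrete with a small stand-in function. This is only a sketch: `splitter_sketch` is a hypothetical name, not part of cherrypick. Two things here are assumptions rather than documented behavior: that each inner tuple is a (features, target) pair, and that the split is a random row sample. The actual implementation may differ on both counts.

```python
import pandas as pd


def splitter_sketch(df: pd.DataFrame, target: str, test_size: float):
    """Hypothetical stand-in for cherrypick.splits.splitter (assumed behavior)."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be between 0.0 and 1.0")

    # Randomly sample the test rows; the remaining rows form the training set.
    test_df = df.sample(frac=test_size, random_state=0)
    train_df = df.drop(test_df.index)

    # Assumption: each split is returned as a (features, target) tuple.
    train = (train_df.drop(columns=[target]), train_df[target])
    test = (test_df.drop(columns=[target]), test_df[target])
    return train, test


# Toy dataset with two feature columns and a "label" target column.
df = pd.DataFrame({"a": range(8), "b": range(8, 16), "label": [0, 1] * 4})

train, test = splitter_sketch(df=df, target="label", test_size=0.25)
X_train, y_train = train
X_test, y_test = test
print(len(X_train), len(X_test))  # 6 2
```

With eight rows and `test_size=0.25`, two rows go to the test split and six stay in the training split. No row appears in both.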

Returns:

A tuple containing two tuples, unpacked in the example above as train and test.

Return type:

tuple[tuple, tuple]

Notes