stratifiedSplit(proportions:on:seed:)

Randomly splits an MLDataTable into a number of partitions while stratifying on a user-defined label column.

Declaration

func stratifiedSplit(proportions: [Double], on column: String, seed: Int = timestampSeed()) throws -> MLDataTable

Parameters

  • proportions:

    An array of values in [0, 1] specifying the proportion of rows assigned to each partition. The values are automatically normalized so that they sum to 1.

  • column:

    The name of the column in the MLDataTable to stratify on.

  • seed:

    Seed for the random number generator used for splitting. The default seed is the current epoch time in milliseconds.

Return Value

A new MLDataTable containing an additional partition column that holds, for each row, the index of the partition the row was assigned to.
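
As a rough illustration of how a call might look, the following sketch builds a small table and splits it 80/20 while stratifying on a label column. The column names "label" and "value", the sample data, and the seed value 42 are assumptions made for this example only; the name of the added partition column is not stated here, so the sketch simply prints the result for inspection.

```swift
import CreateML

do {
    // Illustrative data (assumed for this example): eight "cat" rows and four "dog" rows.
    let data: [String: MLDataValueConvertible] = [
        "label": Array(repeating: "cat", count: 8) + Array(repeating: "dog", count: 4),
        "value": (0..<12).map { Double($0) }
    ]
    let table = try MLDataTable(dictionary: data)

    // Request two partitions (80% and 20%), stratified on the "label" column.
    // Passing a fixed seed should make the split repeatable between runs.
    let split = try table.stratifiedSplit(proportions: [0.8, 0.2], on: "label", seed: 42)

    // Print the returned table to inspect the additional partition column.
    print(split)
} catch {
    print("Stratified split failed: \(error)")
}
```

Because the method returns a single table rather than separate tables, a caller would typically select rows by partition index afterwards to obtain, for example, training and validation subsets.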

Discussion

The specified proportions are applied uniformly to each distinct label in the column being stratified on.
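
To make the idea of applying the same proportions to every label concrete, here is a minimal sketch in plain Swift that does not use Create ML. It is not the framework's implementation: the helper `stratifiedPartitionIndices`, the `SeededGenerator` type (a SplitMix64 generator), and the rounding rule are all assumptions chosen for illustration.

```swift
// Small deterministic generator so the sketch is repeatable for a given seed.
// (SplitMix64; an assumption, not the generator Create ML uses.)
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

/// Hypothetical helper: returns one partition index per input label.
func stratifiedPartitionIndices(labels: [String], proportions: [Double], seed: UInt64) -> [Int] {
    precondition(!proportions.isEmpty && proportions.allSatisfy { $0 >= 0 })
    let total = proportions.reduce(0, +)
    precondition(total > 0, "At least one proportion must be positive")

    // Normalize the proportions to sum to 1 and build cumulative cut points.
    var cumulative: [Double] = []
    var running = 0.0
    for p in proportions {
        running += p / total
        cumulative.append(running)
    }

    // Group row indices by label.
    var rowsByLabel: [String: [Int]] = [:]
    for (row, label) in labels.enumerated() {
        rowsByLabel[label, default: []].append(row)
    }

    var rng = SeededGenerator(seed: seed)
    var result = [Int](repeating: 0, count: labels.count)

    // Visit labels in sorted order so the generator is consumed deterministically.
    for label in rowsByLabel.keys.sorted() {
        let rows = rowsByLabel[label]!.shuffled(using: &rng)
        for (position, row) in rows.enumerated() {
            // Midpoint of this row's slot within the label's shuffled rows, in (0, 1).
            let fraction = (Double(position) + 0.5) / Double(rows.count)
            // The first cut point above the fraction selects the partition.
            result[row] = cumulative.firstIndex(where: { fraction < $0 }) ?? (proportions.count - 1)
        }
    }
    return result
}

// Usage: [0.6, 0.2] normalizes to roughly [0.75, 0.25].
let labels = Array(repeating: "cat", count: 8) + Array(repeating: "dog", count: 4)
let partitions = stratifiedPartitionIndices(labels: labels, proportions: [0.6, 0.2], seed: 42)

// Count rows per (label, partition) pair.
var counts: [String: [Int]] = [:]
for (label, partition) in zip(labels, partitions) {
    counts[label, default: [0, 0]][partition] += 1
}
print(counts["cat"]!, counts["dog"]!)  // prints "[6, 2] [3, 1]"
```

In this sketch each label keeps the same 3:1 ratio between the two partitions, which is the property stratification is meant to preserve. Which individual rows land in each partition depends on the seed; the per-label counts do not.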

See Also

Splitting a data table