stratifiedSplit(proportions:on:seed:)
Randomly splits an MLDataTable into a number of partitions while stratifying on a user-defined label column.
Declaration
func stratifiedSplit(proportions: [Double], on column: String, seed: Int = timestampSeed()) throws -> MLDataTable
Parameters
- proportions:
An array of values in [0, 1] specifying the proportion of rows in each partition. The values are automatically normalized to sum to 1.
- column:
The name of the column in the MLDataTable to stratify on.
- seed:
Seed for the random number generator used for splitting. The default seed is the current epoch time in milliseconds.
Return Value
A new MLDataTable with an additional partition column that contains the index of the partition for each row.
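A minimal usage sketch, assuming a macOS target with Create ML available. The column names ("text", "label") and the sample values are hypothetical and chosen only for illustration. The name of the added partition column is not stated here, so the example prints the whole table rather than guessing it.

```swift
import CreateML

// Hypothetical sample data: four "spam" rows and four "ham" rows.
let table = try MLDataTable(dictionary: [
    "text": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "label": ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham"]
])

// Request two partitions with a fixed seed so the split is reproducible.
let partitioned = try table.stratifiedSplit(proportions: [0.75, 0.25],
                                            on: "label",
                                            seed: 42)

// The result keeps the original columns and adds a partition-index column.
print(partitioned)
```

Passing an explicit seed, rather than relying on the timestamp default, makes repeated runs produce the same partition assignment.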
Discussion
The specified proportions are applied separately to the rows of each distinct label in the stratification column, so each label is divided among the partitions in the same ratio.
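The per-label behavior can be illustrated with a small pure-Swift sketch. This is not Create ML's implementation: the function `stratifiedPartitionIndices`, the `SeededGenerator` type, and the choice to round cumulative boundaries to whole rows are all assumptions made for the example. It normalizes the proportions, groups row indices by label, shuffles each group with a seeded generator, and assigns partition indices within each group.

```swift
// Small deterministic generator (SplitMix64) so that a seed reproduces the split.
struct SeededGenerator: RandomNumberGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

/// Returns one partition index per row, applying the proportions within each label group.
func stratifiedPartitionIndices(labels: [String], proportions: [Double], seed: UInt64) -> [Int] {
    precondition(!proportions.isEmpty && proportions.allSatisfy { $0 >= 0 })
    let total = proportions.reduce(0, +)
    precondition(total > 0, "at least one proportion must be positive")

    // Cumulative normalized boundaries, e.g. [8, 2] becomes [0.8, 1.0].
    var running = 0.0
    let boundaries = proportions.map { p -> Double in
        running += p / total
        return running
    }

    // Group row indices by label.
    var groups: [String: [Int]] = [:]
    for (row, label) in labels.enumerated() {
        groups[label, default: []].append(row)
    }

    var rng = SeededGenerator(seed: seed)
    var result = [Int](repeating: 0, count: labels.count)
    // Visit labels in sorted order so the output depends only on the seed.
    for label in groups.keys.sorted() {
        let rows = groups[label]!.shuffled(using: &rng)
        var start = 0
        for (partition, boundary) in boundaries.enumerated() {
            // Assumption: round each cumulative boundary to a whole number of rows;
            // the last partition takes whatever remains.
            let target = partition == boundaries.count - 1
                ? rows.count
                : Int((boundary * Double(rows.count)).rounded())
            let end = min(rows.count, max(start, target))
            for row in rows[start..<end] { result[row] = partition }
            start = end
        }
    }
    return result
}

// Ten "spam" rows followed by five "ham" rows; proportions [8, 2] normalize to [0.8, 0.2].
let labels = Array(repeating: "spam", count: 10) + Array(repeating: "ham", count: 5)
let indices = stratifiedPartitionIndices(labels: labels, proportions: [8, 2], seed: 42)
let spamInFirst = indices[0..<10].filter { $0 == 0 }.count
let hamInFirst = indices[10..<15].filter { $0 == 0 }.count
print(spamInFirst, hamInFirst)
```

In this sketch each label contributes 80% of its rows to partition 0 and 20% to partition 1, which is the sense in which the proportions are applied uniformly to every label.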