stratifiedSplitBySequence(proportions:by:on:generator:)
Randomly splits an MLDataTable into partitions on a user-defined label column, while keeping rows from the same sequence in their original order.
Declaration
func stratifiedSplitBySequence<RNG>(proportions: [Double], by sequenceIdentifierColumn: String, on column: String, generator: inout RNG) throws -> MLDataTable where RNG : RandomNumberGenerator
Parameters
- proportions:
An array of values in [0, 1] specifying the proportion of data in each partition. The values are automatically normalized to sum to 1.
- sequenceIdentifierColumn:
The name of the column in the MLDataTable whose values identify which sequence each row belongs to.
- column:
The name of the label column in the MLDataTable to stratify on.
- generator:
A user-defined RandomNumberGenerator used to randomize the stratification.
Return Value
A new MLDataTable with an additional partition column containing the index of the partition for each row.
Discussion
The specified proportions are applied uniformly to each label being partitioned on.
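The behavior described above can be illustrated with a small, framework-free sketch. This is not Create ML's implementation; it only shows one plausible way to apply normalized proportions per label while keeping every row of a sequence in the same partition and in its original position. The names Row, SplitMix64, and sketchStratifiedSplit are hypothetical, and the sketch assumes that all rows of a sequence share one label and that partition boundaries are found by rounding.

```swift
// Illustrative sketch only; names and rounding strategy are assumptions,
// not the Create ML implementation.
struct Row {
    let sequenceID: String
    let label: String
}

// A small seedable generator (SplitMix64) so runs are reproducible.
struct SplitMix64: RandomNumberGenerator {
    var state: UInt64
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

/// Returns one partition index per row, in the original row order.
func sketchStratifiedSplit<RNG: RandomNumberGenerator>(
    rows: [Row],
    proportions: [Double],
    generator: inout RNG
) -> [Int] {
    // Normalize the proportions so that they sum to 1.
    let total = proportions.reduce(0, +)
    precondition(total > 0, "proportions must have a positive sum")
    let normalized = proportions.map { $0 / total }

    // Collect the distinct sequence IDs of each label in first-seen order.
    // Assumption: every row of a sequence carries the same label.
    var labelOrder: [String] = []
    var sequencesByLabel: [String: [String]] = [:]
    var seen = Set<String>()
    for row in rows {
        guard seen.insert(row.sequenceID).inserted else { continue }
        if sequencesByLabel[row.label] == nil { labelOrder.append(row.label) }
        sequencesByLabel[row.label, default: []].append(row.sequenceID)
    }

    // Shuffle the sequences of each label, then cut the shuffled list at the
    // cumulative proportions so every label is split in the same ratios.
    var partitionOfSequence: [String: Int] = [:]
    for label in labelOrder {
        let shuffled = (sequencesByLabel[label] ?? []).shuffled(using: &generator)
        var start = 0
        var cumulative = 0.0
        for (index, share) in normalized.enumerated() {
            cumulative += share
            let cut = Int((cumulative * Double(shuffled.count)).rounded())
            let end = index == normalized.count - 1
                ? shuffled.count
                : max(start, min(shuffled.count, cut))
            for id in shuffled[start..<end] { partitionOfSequence[id] = index }
            start = end
        }
    }

    // Rows are never reordered; each row inherits its sequence's partition.
    return rows.map { partitionOfSequence[$0.sequenceID] ?? 0 }
}

// Usage: four "walk" sequences and four "run" sequences, two rows each.
var rows: [Row] = []
for i in 1...8 {
    let label = i <= 4 ? "walk" : "run"
    rows.append(Row(sequenceID: "s\(i)", label: label))
    rows.append(Row(sequenceID: "s\(i)", label: label))
}
var generator = SplitMix64(state: 42)
// [1, 1] is normalized to [0.5, 0.5].
let partitions = sketchStratifiedSplit(rows: rows, proportions: [1, 1], generator: &generator)
for (row, partition) in zip(rows, partitions) {
    print(row.sequenceID, row.label, partition)
}
```

Because the split is made per label, each partition in this sketch receives two "walk" sequences and two "run" sequences, whichever sequences the generator happens to pick. Passing a seeded generator makes the assignment repeatable between runs.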