stratifiedSplitBySequence(proportions:by:on:generator:)

Randomly splits an MLDataTable into partitions, stratified on a user-defined label column, while keeping rows from the same sequence in their original order.

Declaration

func stratifiedSplitBySequence<RNG>(proportions: [Double], by sequenceIdentifierColumn: String, on column: String, generator: inout RNG) throws -> MLDataTable where RNG : RandomNumberGenerator

Parameters

  • proportions:

    An array of values in [0, 1] specifying the proportion of each partition. The values are automatically normalized to sum to 1.

  • sequenceIdentifierColumn:

    The name of the column in the MLDataTable whose values identify the rows of each sequence.

  • column:

    The name of the column in the MLDataTable to stratify on.

  • generator:

    User-defined RandomNumberGenerator to use in stratification.

Return Value

A new MLDataTable with an additional partition column that holds the index of the partition assigned to each row.
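A minimal usage sketch follows. The column names ("sessionID", "label", "value") and the data are hypothetical, chosen only for illustration, and the example assumes a macOS environment where Create ML is available. The name of the added partition column is not stated here, so the sketch prints the table rather than reading that column by name.

```swift
import CreateML

// Hypothetical data: four sequences (two per label), two rows each.
let data: [String: MLDataValueConvertible] = [
    "sessionID": [1, 1, 2, 2, 3, 3, 4, 4],
    "label": ["walk", "walk", "walk", "walk", "run", "run", "run", "run"],
    "value": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
]
let table = try MLDataTable(dictionary: data)

// Any RandomNumberGenerator can be passed; the system generator is used here.
var generator = SystemRandomNumberGenerator()

// Request two partitions in an 80/20 ratio, stratified on "label",
// with rows grouped into sequences by "sessionID".
let partitioned = try table.stratifiedSplitBySequence(
    proportions: [0.8, 0.2],
    by: "sessionID",
    on: "label",
    generator: &generator
)

// Inspect the result, including the added partition column.
print(partitioned)
```

Passing a seeded generator of your own instead of SystemRandomNumberGenerator should make the split reproducible across runs.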

Discussion

The specified proportions are applied uniformly to every label in the column being stratified on.
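To make the idea concrete, here is a rough, self-contained sketch of one way such a split could work. It is not Create ML's implementation: the Row type and the assignPartitions function are hypothetical, and the sketch assumes that each sequence is assigned to a single partition as a whole and that rounding leftovers go to the last partition.

```swift
// Hypothetical stand-in for a table row; not a Create ML type.
struct Row {
    let sequenceID: Int
    let label: String
}

/// Illustrative only: maps each sequence identifier to a partition index.
func assignPartitions<RNG: RandomNumberGenerator>(
    rows: [Row],
    proportions: [Double],
    generator: inout RNG
) -> [Int: Int] {
    // Normalize the proportions so that they sum to 1.
    let total = proportions.reduce(0, +)
    precondition(total > 0, "proportions must have a positive sum")
    let normalized = proportions.map { $0 / total }

    // Collect the distinct sequence identifiers under each label.
    var sequencesByLabel: [String: [Int]] = [:]
    var seen = Set<Int>()
    for row in rows {
        if seen.insert(row.sequenceID).inserted {
            sequencesByLabel[row.label, default: []].append(row.sequenceID)
        }
    }

    var partitionOfSequence: [Int: Int] = [:]
    // Apply the same proportions to each label independently.
    for label in sequencesByLabel.keys.sorted() {
        let shuffled = sequencesByLabel[label]!.shuffled(using: &generator)
        var cumulative = 0.0
        var start = 0
        for (index, share) in normalized.enumerated() {
            cumulative += share
            // The last partition takes whatever remains after rounding.
            let end = index == normalized.count - 1
                ? shuffled.count
                : min(shuffled.count, Int((cumulative * Double(shuffled.count)).rounded()))
            for id in shuffled[start..<end] {
                partitionOfSequence[id] = index
            }
            start = end
        }
    }
    return partitionOfSequence
}

// Usage: two labels with five sequences each, two rows per sequence.
var rows: [Row] = []
for id in 0..<5 { rows += [Row(sequenceID: id, label: "walk"), Row(sequenceID: id, label: "walk")] }
for id in 5..<10 { rows += [Row(sequenceID: id, label: "run"), Row(sequenceID: id, label: "run")] }

var generator = SystemRandomNumberGenerator()
// [0.4, 0.1] is normalized to [0.8, 0.2] inside the function.
let assignment = assignPartitions(rows: rows, proportions: [0.4, 0.1], generator: &generator)

let walkInFirst = (0..<5).filter { assignment[$0] == 0 }.count
let runInFirst = (5..<10).filter { assignment[$0] == 0 }.count
print(walkInFirst, runInFirst)
```

In this sketch, which sequences land in each partition varies from run to run, but each label's five sequences are always divided four to one, which is what applying the proportions to each label independently means here.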

See Also

Splitting a data table