MLDataTable
A table of data for training or evaluating a machine learning model.
Declaration
struct MLDataTable
Overview
MLDataTable is Create ML’s version of a spreadsheet in which each row represents an entity (such as a book, in the example below) with observable features. Each column (MLDataColumn or MLUntypedColumn) in the table represents an observable feature of that entity, such as a book’s title or author.
[Image]
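As a minimal sketch of the row-and-column layout described above, the following builds a small book table from a dictionary with init(dictionary:). The page counts are illustrative placeholders, and the snippet assumes it runs as a macOS script or playground where top-level `try` is allowed.

```swift
import CreateML

// Each key is a column name; each array holds that column's values,
// one element per row (one row per book).
let data: [String: MLDataValueConvertible] = [
    "Title": ["Alice in Wonderland", "Hamlet", "Treasure Island", "Peter Pan"],
    "Author": ["Lewis Carroll", "William Shakespeare", "Robert L. Stevenson", "J. M. Barrie"],
    "Pages": [124, 98, 280, 94]  // illustrative values
]

let bookTable = try MLDataTable(dictionary: data)

// size reports the row and column counts as a tuple.
print(bookTable.size.rows, bookTable.size.columns)
print(bookTable.columnNames)
```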
In most cases you interact with columns using the typed MLDataColumn, especially when you need to directly access the contents of a column. You can also interact with columns using MLUntypedColumn, if the underlying type of the column isn’t important.
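The difference between typed and untyped access might look like the sketch below. It assumes the two-argument column subscript takes the column name followed by the element type with no argument labels; check the "Accessing columns" topic if the call does not compile in your SDK.

```swift
import CreateML

let data: [String: MLDataValueConvertible] = [
    "Title": ["Hamlet", "Peter Pan"],
    "Pages": [98, 94]  // illustrative values
]
let table = try MLDataTable(dictionary: data)

// Typed access: the result is nil if the column is missing
// or its elements are not of the requested type.
let pages: MLDataColumn<Int>? = table["Pages", Int.self]
if let pages = pages {
    print("Typed column with \(pages.count) elements")
}

// Untyped access: enough when the element type does not matter.
let untyped: MLUntypedColumn = table["Pages"]
print("Untyped column with \(untyped.count) elements")
```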
After you create a data table, you can modify it with methods like append(contentsOf:), addColumn(_:named:), and removeColumn(named:). You can also filter or map the contents of the data table to derive new data tables or new columns by using various subscripts and methods like dropDuplicates() or map(_:).
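A rough sketch of these operations chained together is shown below. The sample rows, the "Length" column name, and the 95-page threshold are all hypothetical, and the closure assumes that a row subscript returns an optional MLDataValue with an intValue accessor.

```swift
import CreateML

let first: [String: MLDataValueConvertible] = [
    "Title": ["Hamlet", "Peter Pan"],
    "Pages": [98, 94]
]
let second: [String: MLDataValueConvertible] = [
    "Title": ["Hamlet", "Treasure Island"],
    "Pages": [98, 280]
]

var table = try MLDataTable(dictionary: first)

// Append the rows of a second table that has the same columns.
table.append(contentsOf: try MLDataTable(dictionary: second))

// Derive a new table without the repeated "Hamlet" row.
table = table.dropDuplicates()

// Map every row to a derived value, then attach the result as a column.
let length = table.map { row -> String in
    let pages = row["Pages"]?.intValue ?? 0
    return pages > 95 ? "long" : "short"  // hypothetical threshold
}
table.addColumn(length, named: "Length")

// Drop a column that is no longer needed.
table.removeColumn(named: "Title")

print(table)
```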
Finally, when your data table is ready, use it to train and evaluate a model from these groups:
Regressors like MLRegressor and its supporting types
Classifiers like MLClassifier and its supporting types
Natural language processing types like MLTextClassifier and MLWordTagger
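For illustration, training and evaluating a regressor from a data table could be wrapped as in the sketch below. The helper name `heldOutRMSE`, the 80/20 split, and the seed are assumptions made for this example, not part of the API; it uses randomSplit(by:seed:) to hold out rows for evaluation.

```swift
import CreateML

// Hypothetical helper: trains a regressor to predict `target` from the
// remaining columns and returns the root-mean-squared error on held-out rows.
func heldOutRMSE(for table: MLDataTable, target: String) throws -> Double {
    // Keep roughly 80% of the rows for training and the rest for evaluation.
    let (trainingData, testingData) = table.randomSplit(by: 0.8, seed: 5)

    // MLRegressor picks a suitable regression algorithm for the data.
    let regressor = try MLRegressor(trainingData: trainingData,
                                    targetColumn: target)

    let metrics = regressor.evaluation(on: testingData)
    return metrics.rootMeanSquaredError
}
```

A call such as `try heldOutRMSE(for: bookTable, target: "Pages")` only gives a meaningful result on a table with far more rows than the small examples on this page.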
Topics
Creating a data table
Creating a model from tabular data
init(contentsOf:options:)
init(dictionary:)
init(namedColumns:)
init()
MLDataTable.ParsingOptions
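As a sketch of the file-based initializer, the example below writes a small CSV file to a temporary location and loads it with init(contentsOf:options:), relying on the default parsing options. The file name and contents are made up for the example.

```swift
import CreateML
import Foundation

// Write a tiny CSV file so the example is self-contained.
let csv = """
Title,Author,Pages
Hamlet,William Shakespeare,98
Peter Pan,J. M. Barrie,94
"""
let url = FileManager.default.temporaryDirectory
    .appendingPathComponent("books-example.csv")  // hypothetical file name
try csv.write(to: url, atomically: true, encoding: .utf8)

// The header row supplies the column names.
let table = try MLDataTable(contentsOf: url)
print(table.size)
```

Pass an MLDataTable.ParsingOptions value as the second argument when the file's layout differs from what the default options expect.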
Getting the size of a data table
Transforming rows to generate a data column
Adding columns
Accessing columns
Renaming columns
Removing columns
Appending to a data table
Generating new data tables
Splitting a data table
randomSplitBySequence(proportion:by:on:seed:)
stratifiedSplit(proportions:on:generator:)
stratifiedSplit(proportions:on:seed:)
stratifiedSplitBySequence(proportions:by:on:generator:)
stratifiedSplitBySequence(proportions:by:on:seed:)