Book title: Hands-On Machine Learning with ML.NET
Author: Jarred Capellman
Last updated: 2021-06-24 16:43:35

The Trainer class

In the Trainer class, we will build a new pipeline to train our model. The FeaturizeText transform builds NGrams from the string data we previously extracted from the files. NGrams are a popular method for creating vectors from a string, which are then fed to the model. You can think of NGrams as breaking a longer string into short ranges of characters, with the length of each range set by the NGram parameter. A character bi-gram, for instance, slides a two-character window over the sentence ML.NET is great, yielding ML, L., .N, NE, ET, and so on. Lastly, we build the SdcaLogisticRegression trainer object:

// Copy the label, turn the extracted strings into NGram features, and expose them as "Features"
var dataProcessPipeline = MlContext.Transforms.CopyColumns("Label", nameof(FileInput.Label))
    .Append(MlContext.Transforms.Text.FeaturizeText("NGrams", nameof(FileInput.Strings)))
    .Append(MlContext.Transforms.Concatenate("Features", "NGrams"));

// Binary classification trainer that reads the "Label" and "Features" columns
var trainer = MlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features");

For a deeper dive into the Transforms Catalog API, see Microsoft's documentation: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transformscatalog?view=ml-dotnet.
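
To make the NGram idea more concrete, the following is a minimal sketch of character n-gram extraction using a sliding window that advances one character at a time, which is the usual definition of a character n-gram. The NGramSketch class and its CharNGrams method are hypothetical names introduced here for illustration; they are not part of ML.NET or of this chapter's project, and FeaturizeText does considerably more than this (for example, it produces numeric feature vectors rather than strings).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of ML.NET or the book's code.
public static class NGramSketch
{
    // Returns every run of n consecutive characters in text,
    // moving the window forward one character at a time.
    public static List<string> CharNGrams(string text, int n)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        var grams = new List<string>();
        for (int i = 0; i + n <= text.Length; i++)
        {
            grams.Add(text.Substring(i, n));
        }
        return grams;
    }
}
```

Calling NGramSketch.CharNGrams("ML.NET", 2) returns the five bi-grams ML, L., .N, NE, and ET. An input shorter than n produces an empty list, since no full window fits.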
