- Hands-On Machine Learning with ML.NET
- Jarred Capellman
- 2021-06-24 16:43:35
The Trainer class
In the Trainer class, we build a new pipeline to train our model. The FeaturizeText transform builds NGrams from the string data we previously extracted from the files. NGrams are a popular way to turn a string into vectors that can, in turn, be fed to the model. You can think of NGrams as breaking a longer string into ranges of characters, with the length of each range set by the NGram parameter. A bi-gram, for instance, would take the sentence "ML.NET is great" and convert it into ML-.N-ET-is-gr-ea-t. This is a simplified picture: NGram extractors generally slide a window over the text, so consecutive NGrams overlap. Lastly, we build the SdcaLogisticRegression trainer object:
// Copy the label column, featurize the extracted strings into NGram vectors, and expose them as "Features"
var dataProcessPipeline = MlContext.Transforms.CopyColumns("Label", nameof(FileInput.Label))
    .Append(MlContext.Transforms.Text.FeaturizeText("NGrams", nameof(FileInput.Strings)))
    .Append(MlContext.Transforms.Concatenate("Features", "NGrams"));

// Binary classification trainer that reads the "Label" and "Features" columns
var trainer = MlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features");
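To make the NGram idea concrete, here is a minimal standalone sketch in plain C#. It is not how FeaturizeText is implemented inside ML.NET; the class and method names (NGramSketch, Chunks, SlidingNGrams) are hypothetical and exist only for illustration. Chunks reproduces the simplified bi-gram example from the text, while SlidingNGrams shows the overlapping, sliding-window form that NGram extractors more commonly use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of ML.NET.
public static class NGramSketch
{
    // Non-overlapping chunks of n characters, with whitespace removed first.
    // This mirrors the simplified bi-gram example in the text.
    public static List<string> Chunks(string text, int n)
    {
        string compact = new string(text.Where(c => !char.IsWhiteSpace(c)).ToArray());
        var result = new List<string>();
        for (int i = 0; i < compact.Length; i += n)
        {
            // The last chunk may be shorter than n.
            result.Add(compact.Substring(i, Math.Min(n, compact.Length - i)));
        }
        return result;
    }

    // Sliding-window character n-grams: each n-gram starts one character
    // after the previous one, so neighbours overlap by n - 1 characters.
    public static List<string> SlidingNGrams(string text, int n)
    {
        var result = new List<string>();
        for (int i = 0; i + n <= text.Length; i++)
        {
            result.Add(text.Substring(i, n));
        }
        return result;
    }

    public static void Main()
    {
        // Prints: ML-.N-ET-is-gr-ea-t
        Console.WriteLine(string.Join("-", Chunks("ML.NET is great", 2)));

        // Prints: ML|L.|.N|NE|ET
        Console.WriteLine(string.Join("|", SlidingNGrams("ML.NET", 2)));
    }
}
```

In either form, a featurizer would then count or hash the resulting NGrams to produce the numeric vector that the trainer consumes as its Features column.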
For those looking to dive deeper into the Transforms Catalog API, check out the documentation from Microsoft here: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transformscatalog?view=ml-dotnet.