The Trainer class

In the Trainer class, we build a new pipeline to train our model. The FeaturizeText transform builds NGrams from the string data we previously extracted from the files. NGrams are a popular way to turn a string into vectors that can, in turn, be fed to the model. You can think of NGrams as breaking a longer string into ranges of characters whose length is set by the NGram parameter. A bi-gram, for instance, would take the sentence "ML.NET is great" and convert it into "ML-.N-ET-is-gr-ea-t". Lastly, we build the SdcaLogisticRegression trainer object:

var dataProcessPipeline = MlContext.Transforms.CopyColumns("Label", nameof(FileInput.Label))
    .Append(MlContext.Transforms.Text.FeaturizeText("NGrams", nameof(FileInput.Strings)))
    .Append(MlContext.Transforms.Concatenate("Features", "NGrams"));

var trainer = MlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features");

For a deeper dive into the Transforms Catalog API, see Microsoft's documentation: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transformscatalog?view=ml-dotnet.
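
To make the bi-gram illustration above concrete, here is a minimal sketch that reproduces the "ML.NET is great" example by dropping spaces and cutting the remaining characters into fixed-size pieces. The NGramSketch class and its SplitIntoChunks method are hypothetical names introduced only for this illustration; they are not part of ML.NET, and the code does not claim to show how FeaturizeText works internally. It only mirrors the simplified picture given in the text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of ML.NET.
public static class NGramSketch
{
    // Removes spaces, then cuts the text into consecutive pieces of n characters.
    // The final piece may be shorter when the length is not a multiple of n.
    public static List<string> SplitIntoChunks(string text, int n)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));

        string compact = text.Replace(" ", string.Empty);
        var chunks = new List<string>();

        for (int i = 0; i < compact.Length; i += n)
        {
            int length = Math.Min(n, compact.Length - i);
            chunks.Add(compact.Substring(i, length));
        }

        return chunks;
    }
}
```

Calling string.Join("-", NGramSketch.SplitIntoChunks("ML.NET is great", 2)) yields ML-.N-ET-is-gr-ea-t, matching the example in the text. Note that n-gram extractors commonly slide a window one character at a time, producing overlapping pieces; the non-overlapping split here is used only because it matches the illustration above.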