The Trainer class

In the Trainer class, we will build a new pipeline to train our model. The FeaturizeText transform builds NGrams from the string data we previously extracted from the files. NGrams are a popular way to turn a string into a vector that can, in turn, be fed to the model. You can think of NGrams as breaking a longer string into ranges of characters, with the length of each range set by the NGram parameter. As a simplified illustration, a bi-gram would take the sentence "ML.NET is great" and convert it into ML-.N-ET-is-gr-ea-t. (In practice, NGram extractors usually slide a window over the text, so adjacent NGrams overlap.) Lastly, we build the SdcaLogisticRegression trainer object:

var dataProcessPipeline = MlContext.Transforms.CopyColumns("Label", nameof(FileInput.Label))
    .Append(MlContext.Transforms.Text.FeaturizeText("NGrams", nameof(FileInput.Strings)))
    .Append(MlContext.Transforms.Concatenate("Features", "NGrams"));

var trainer = MlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features");

To dive deeper into the Transforms Catalog API, see the documentation from Microsoft: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transformscatalog?view=ml-dotnet
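
To make the NGram idea concrete, the following is a minimal sketch in plain C# that does not use ML.NET at all. The class NGramSketch and its methods Chunk and Slide are hypothetical names introduced only for this illustration; they are not part of the book's sample project, and they do not reproduce what FeaturizeText does internally. Chunk mirrors the simplified bi-gram example from the text, while Slide shows the overlapping, sliding-window form of character NGrams.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of ML.NET or the sample project.
public static class NGramSketch
{
    // Removes spaces, then splits the text into consecutive,
    // non-overlapping chunks of n characters (the last chunk may be shorter).
    public static List<string> Chunk(string text, int n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));

        string compact = text.Replace(" ", string.Empty);
        var chunks = new List<string>();
        for (int i = 0; i < compact.Length; i += n)
        {
            chunks.Add(compact.Substring(i, Math.Min(n, compact.Length - i)));
        }
        return chunks;
    }

    // Slides a window of n characters over the text one position at a time,
    // so that adjacent NGrams overlap by n - 1 characters.
    public static List<string> Slide(string text, int n)
    {
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));

        var grams = new List<string>();
        for (int i = 0; i + n <= text.Length; i++)
        {
            grams.Add(text.Substring(i, n));
        }
        return grams;
    }

    public static void Main()
    {
        // prints "ML-.N-ET-is-gr-ea-t"
        Console.WriteLine(string.Join("-", Chunk("ML.NET is great", 2)));

        // prints "ML-L.-.N-NE-ET"
        Console.WriteLine(string.Join("-", Slide("ML.NET", 2)));
    }
}
```

Either way, the resulting pieces are what later get counted and mapped to positions in a numeric vector; the sketch stops short of that step, which the FeaturizeText transform handles inside the pipeline.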