官术网_书友最值得收藏!

The Trainer class

Inside the Trainer class, a large portion was rewritten to handle the expanded features used and to provide regression algorithm evaluation as opposed to the binary classification we looked at in Chapter 2Setting Up the ML.NET Environment.

The first change is the use of a comma to separate the data as opposed to the default tab like we used in Chapter 2Setting Up the ML.NET Environment:

var trainingDataView = MlContext.Data.LoadFromTextFile<EmploymentHistory>(trainingFileName, ',');

The next change is in the pipeline creation itself. In our first application, we had a label and fed that straight into the pipeline. With this application, we have nine features to predict the duration of a person's employment in the DurationInMonths property and append each one of them to the pipeline using the C# 6.0 feature, nameof. You might have noticed the use of magic strings to map class properties to features in various code samples on GitHub and MSDN; personally, I find this error-prone compared to the strongly typed approach.

For every property, we call the NormalizeMeanVariance transform method, which as the name implies normalizes the input data both on the mean and the variance. ML.NET computes this by subtracting the mean of the input data and dividing that value by the variance of the inputted data. The purpose behind this is to nullify outliers in the input data so the model isn't skewed to handle an edge case compared to the normal range. For example, suppose the sample dataset of employment history had 20 rows and all but one of those rows had a person with 50 years experience. The one row that didn't fit would be normalized to better fit within the ranges of values entered into the model.

In addition, note the use of the extension method referred to earlier to help to simplify the following code, when we concatenate all of the feature columns:

var dataProcessPipeline = MlContext.Transforms.CopyColumns("Label", nameof(EmploymentHistory.DurationInMonths))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.IsMarried)))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.BSDegree)))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.MSDegree)))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.YearsExperience))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.AgeAtHire)))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.HasKids)))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.WithinMonthOfVesting)))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.DeskDecorations)))
.Append(MlContext.Transforms.NormalizeMeanVariance(nameof(EmploymentHistory.LongCommute)))
.Append(MlContext.Transforms.Concatenate("Features",
typeof(EmploymentHistory).ToPropertyList<EmploymentHistory>(nameof(EmploymentHistory.DurationInMonths)))));

We can then create the Sdca trainer using the default parameters ("Label" and "Features"):

var trainer = MlContext.Regression.Trainers.Sdca(labelColumnName: "Label", featureColumnName: "Features");

Lastly, we call the Regression.Evaluate method to provide regression specific metrics, followed by a Console.WriteLine call to provide these metrics to your console output. We will go into detail about what each of these means in the last section of this chapter:

var modelMetrics = MlContext.Regression.Evaluate(testSetTransform);

Console.WriteLine($"Loss Function: {modelMetrics.LossFunction:0.##}{Environment.NewLine}" +
$"Mean Absolute Error: {modelMetrics.MeanAbsoluteError:#.##}{Environment.NewLine}" +
$"Mean Squared Error: {modelMetrics.MeanSquaredError:#.##}{Environment.NewLine}" +
$"RSquared: {modelMetrics.RSquared:0.##}{Environment.NewLine}" +
$"Root Mean Squared Error: {modelMetrics.RootMeanSquaredError:#.##}");
主站蜘蛛池模板: 汕头市| 兴安盟| 柞水县| 柳林县| 长沙县| 京山县| 泰来县| 诸暨市| 长治市| 海原县| 黑水县| 惠安县| 措勤县| 色达县| 丰顺县| 漯河市| 凤山市| 醴陵市| 乌兰察布市| 茌平县| 高清| 浦县| 清河县| 科技| 宜川县| 岑巩县| 哈尔滨市| 界首市| 贵州省| 虎林市| 广东省| 泸溪县| 衡水市| 诏安县| 高淳县| 虹口区| 翁牛特旗| 泰和县| 宜丰县| 揭阳市| 新田县|