官术网_书友最值得收藏!

What is a good model?

A model is a bunch of values that describe the world, alongside the program that produces these values. That much, we have concluded from the previous section. Now we have to pass some value judgments on models - whether a model is good or bad.

A good model needs to describe the world accurately. This is said in the most generic way possible. Described thus, this statement about a good model encompasses many notions. We shall have to make this abstract idea a bit more concrete to proceed.

A machine learning algorithm is trained on a bunch of data. To the machine, this bunch of data is the world. But to us, the data that we feed the machine in for training is not the world. To us humans, there is much more to the world than what the machine may know about. So when I say "a good model needs to describe the world accurately", there are two senses to the word "world" that applies - the world as the machine knows, and the world as we know it.

The machine has only seen portions of the world as we know it. There are parts of the world the machine has not seen. So it is then a good machine learning model when it is able to provide the correct outputs for inputs it has not seen yet.

As a concrete example, let's once again suppose that we have a machine learning algorithm that determines if an image is that of a hot dog or not. We feed the model images of hot dogs and hamburgers. To the machine, the world are simply images of hot dogs and hamburgers. What happens when we pass in as input, an image of vegetables? A good model would be able to generalize and say it's not a hot dog. A poor model would simply crash.

And thus with this analogy, we have defined a good model to be one that generalizes well to unknown situations.

Often, as part of the process of building machine learning systems, we would want to put this notion to test. So we would have to split our dataset into testing and training datasets. The machine would be trained on the training dataset, and to test how good the model is once the training has completed, we will then feed in the testing dataset to the machine. It's assumed of course that the machine has never seen the testing dataset. A good machine learning model hence would be able to generate the correct output for the testing dataset, despite never having seen it.

主站蜘蛛池模板: 富平县| 偏关县| 达拉特旗| 安图县| 万盛区| 霍城县| 乌海市| 福安市| 永福县| 隆回县| 田阳县| 威远县| 利津县| 清水河县| 平乡县| 聂拉木县| 陵川县| 和田县| 余庆县| 汤阴县| 湘西| 兴文县| 菏泽市| 昂仁县| 修水县| 和林格尔县| 双江| 龙海市| 和田市| 原阳县| 二手房| 竹北市| 阿图什市| 汝南县| 炎陵县| 靖西县| 屯门区| 齐齐哈尔市| 洛隆县| 延庆县| 临清市|