官术网_书友最值得收藏!

What is a good model?

A model is a bunch of values that describe the world, alongside the program that produces these values. That much, we have concluded from the previous section. Now we have to pass some value judgments on models - whether a model is good or bad.

A good model needs to describe the world accurately. This is said in the most generic way possible. Described thus, this statement about a good model encompasses many notions. We shall have to make this abstract idea a bit more concrete to proceed.

A machine learning algorithm is trained on a bunch of data. To the machine, this bunch of data is the world. But to us, the data that we feed the machine in for training is not the world. To us humans, there is much more to the world than what the machine may know about. So when I say "a good model needs to describe the world accurately", there are two senses to the word "world" that applies - the world as the machine knows, and the world as we know it.

The machine has only seen portions of the world as we know it. There are parts of the world the machine has not seen. So it is then a good machine learning model when it is able to provide the correct outputs for inputs it has not seen yet.

As a concrete example, let's once again suppose that we have a machine learning algorithm that determines if an image is that of a hot dog or not. We feed the model images of hot dogs and hamburgers. To the machine, the world are simply images of hot dogs and hamburgers. What happens when we pass in as input, an image of vegetables? A good model would be able to generalize and say it's not a hot dog. A poor model would simply crash.

And thus with this analogy, we have defined a good model to be one that generalizes well to unknown situations.

Often, as part of the process of building machine learning systems, we would want to put this notion to test. So we would have to split our dataset into testing and training datasets. The machine would be trained on the training dataset, and to test how good the model is once the training has completed, we will then feed in the testing dataset to the machine. It's assumed of course that the machine has never seen the testing dataset. A good machine learning model hence would be able to generate the correct output for the testing dataset, despite never having seen it.

主站蜘蛛池模板: 会昌县| 夏津县| 龙里县| 清徐县| 文昌市| 蒙城县| 临安市| 宁明县| 兴业县| 邳州市| 丰原市| 罗江县| 大连市| 黑河市| 鄱阳县| 北川| 泗洪县| 高陵县| 新巴尔虎左旗| 繁峙县| 师宗县| 文山县| 鄂伦春自治旗| 通海县| 昌都县| 奎屯市| 清原| 诏安县| 玉林市| 综艺| 九江县| 桐柏县| 饶平县| 富宁县| 日喀则市| 山丹县| 奉新县| 吴江市| 五大连池市| 青冈县| 榆中县|