官术网_书友最值得收藏!

The machine learning algorithm using a distributed environment

Machine learning algorithms combine simple tasks into complex patterns, that are even more complicated in distributed environment. Let's take a simple decision tree algorithm (reference), for example. This particular algorithm creates a binary tree that tries to fit training data and minimize prediction errors. However, in order to do this, it has to decide about the branch of tree it has to send every data point to (don't worry, we'll cover the mechanics of how this algorithm works along with some very useful parameters that you can learn in later in the book). Let's demonstrate it with a simple example:

Figure 3 - Example of red and blue data points covering 2D space.

Consider the situation depicted in preceding figure. A two-dimensional board with many points colored in two colors: red and blue. The goal of the decision tree is to learn and generalize the shape of data and help decide about the color of a new point. In our example, we can easily see that the points almost follow a chessboard pattern. However, the algorithm has to figure out the structure by itself. It starts by finding the best position of a vertical or horizontal line, which would separate the red points from the blue points.

The found decision is stored in the tree root and the steps are recursively applied on both the partitions. The algorithm ends when there is a single point in the partition:

Figure 4 - The final decision tree and projection of its prediction to the original space of points.
主站蜘蛛池模板: 澎湖县| 老河口市| 隆尧县| 临清市| 宣汉县| 洪江市| 三亚市| 海丰县| 南陵县| 芦溪县| 万荣县| 洛宁县| 景泰县| 叶城县| 光泽县| 兴隆县| 荔浦县| 五河县| 博罗县| 晋州市| 河池市| 滦南县| 新巴尔虎右旗| 开原市| 隆昌县| 临城县| 萍乡市| 吐鲁番市| 灌南县| 独山县| 安新县| 庄河市| 宝清县| 逊克县| 宜章县| 宣威市| 南江县| 白朗县| 梁河县| 洛扎县| 噶尔县|