
Relationships between variables

We will now look at a scatterplot matrix to see the relationships between some of these variables. A scatterplot matrix is a very useful plot, because it can tell us whether a linear classifier is likely to work well on our data, or whether we need to investigate more complicated methods.

We will call scatter_matrix on our data and set figsize to (18, 18), to make the plot easier to read.
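The call described above can be sketched as follows. This is a minimal example rather than the exact code used to produce the screenshot: the DataFrame name df is assumed, and its contents here are random placeholder scores from 1 to 10 standing in for the real dataset. Only three of the feature columns are included, and the use of 2 as the second class label alongside 4 is also an assumption.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Placeholder data only: random scores, NOT the real dataset.
rng = np.random.default_rng(0)
n = 100
size = rng.integers(1, 11, size=n)

df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, size=n),
    "uniform_cell_size": size,
    # Built to track uniform_cell_size closely, so one panel looks linear.
    "uniform_cell_shape": np.clip(size + rng.integers(-1, 2, size=n), 1, 10),
    # Assumed class labels; the text only states that 4 means malignant.
    "class": rng.choice([2, 4], size=n),
})

# One scatterplot per pair of columns, with histograms on the diagonal.
axes = scatter_matrix(df, figsize=(18, 18))
```

In a notebook the figure is displayed automatically; in a plain script, call plt.show() afterwards. scatter_matrix returns the grid of axes, which is handy if individual panels need adjusting.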

The output, as shown in the following screenshot, indicates the relationship between each variable and every other variable:

All of the variables are listed on both the x and the y axes. Along the diagonal, where each variable meets itself, we can see the histograms that we saw previously.

In the block indicated by the mouse cursor in the preceding screenshot, we can see a fairly strong linear relationship between uniform_cell_shape and uniform_cell_size. This is expected. Scanning the rest of the matrix, we can see that some other pairs of variables also have a reasonably linear relationship. If we look at the class variable, however, no single panel offers an easy way to separate the classifications.
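A visual impression of linearity can be checked numerically with a correlation matrix. The sketch below is an addition, not part of the original walkthrough: it uses the Pearson correlation from pandas on random placeholder data in which uniform_cell_shape is deliberately constructed to follow uniform_cell_size. The numbers it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

# Placeholder data only: random scores, NOT the real dataset.
rng = np.random.default_rng(0)
n = 200
size = rng.integers(1, 11, size=n)

df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, size=n),
    "uniform_cell_size": size,
    # Constructed to follow uniform_cell_size with a little noise.
    "uniform_cell_shape": np.clip(size + rng.integers(-1, 2, size=n), 1, 10),
})

# Pearson correlation: values near +1 or -1 indicate a strong linear relationship.
corr = df.corr(method="pearson")
print(corr.round(2))
```

Applied to the real DataFrame, the same df.corr() call would show which pairs of columns are most strongly linearly related.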

In the class row of the preceding screenshot, a value of 4 is a malignant classification. We can also see that cells scored anywhere from 1 to 10 on clump_thickness were still classified as malignant.

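Thus, apart from a few pairs such as uniform_cell_shape and uniform_cell_size, we come to the conclusion that there aren't any strong relationships between the variables of our dataset, and no single variable cleanly separates the classes.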
