Relationships between variables

We will now look at a scatterplot matrix to see the relationships between some of these variables. A scatterplot matrix is a very useful plot, because it can tell us whether a linear classifier is likely to work well on our data, or whether we have to investigate more complicated methods.

We will call the scatter_matrix function (available in pandas.plotting) and set figsize=(18, 18) to make the plots easier to read.
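As a minimal sketch of that call, the following builds a small stand-in DataFrame and passes it to scatter_matrix. The random data and the four column names are assumptions made for illustration only; they are not the real dataset, and the label 2 for the non-malignant class is likewise assumed.

```python
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical stand-in data: random scores from 1 to 10, NOT the real dataset.
rng = np.random.default_rng(0)
size = rng.integers(1, 11, size=200)
df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, size=200),
    "uniform_cell_size": size,
    # Shape tracks size with a little noise, clipped to the 1-10 range.
    "uniform_cell_shape": np.clip(size + rng.integers(-1, 2, size=200), 1, 10),
    # 4 = malignant (as in the text); 2 is an assumed label for the other class.
    "class": rng.choice([2, 4], size=200),
})

# One scatterplot per pair of columns, with histograms on the diagonal.
axes = scatter_matrix(df, figsize=(18, 18))
```

In an interactive session, calling matplotlib.pyplot.show() afterwards displays the figure. The function returns a square array of axes, one per pair of columns.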

The output, as shown in the following screenshot, indicates the relationship between each variable and every other variable:

All of the variables are listed on both the x and the y axes. Along the diagonal, where each variable meets itself, we can see the histograms that we saw previously.

In the block indicated by the mouse cursor in the preceding screenshot, we can see that there is a pretty strong linear relationship between uniform_cell_shape and uniform_cell_size. This is expected. Scanning the rest of the preceding screenshot, we can see that some other cells also show a good linear relationship. If we look at our classifications, however, none of these plots offers an easy way to separate the classes.
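A visual impression of linearity can be checked numerically with a correlation matrix. The sketch below uses DataFrame.corr() on hypothetical random data in which one column is deliberately built to track another, so the values it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data, NOT the real dataset.
rng = np.random.default_rng(0)
size = rng.integers(1, 11, size=200)
df = pd.DataFrame({
    "uniform_cell_size": size,
    # Built to follow size closely, so the two columns correlate strongly.
    "uniform_cell_shape": np.clip(size + rng.integers(-1, 2, size=200), 1, 10),
    # Independent of the other two columns.
    "clump_thickness": rng.integers(1, 11, size=200),
})

# Pearson correlation between every pair of columns (values lie in [-1, 1]).
corr = df.corr()
print(corr.round(2))
```

A coefficient close to 1 or -1 corresponds to the tight diagonal band seen in a scatterplot, while a value near 0 corresponds to a shapeless cloud.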

In the class row of the preceding screenshot, a value of 4 denotes a malignant classification. We can also see that cells scored anywhere from 1 to 10 on clump_thickness were still classified as malignant.
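The same kind of observation can be read off a table by grouping on the class column. This is a sketch on a tiny hand-written frame; the eight rows are invented for illustration, and the label 2 for the non-malignant class is an assumption.

```python
import pandas as pd

# Hypothetical toy rows, NOT taken from the real dataset.
df = pd.DataFrame({
    "clump_thickness": [1, 5, 10, 3, 8, 2, 4, 1],
    # 4 = malignant (as in the text); 2 is an assumed label for the other class.
    "class": [4, 4, 4, 4, 2, 2, 2, 2],
})

# Smallest and largest clump_thickness score observed within each class.
ranges = df.groupby("class")["clump_thickness"].agg(["min", "max"])
print(ranges)
```

When a single class spans the whole range of a variable, as the malignant rows do here, that variable on its own cannot act as a clean threshold for the class.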

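Thus, we come to the conclusion that, apart from a few pairs such as uniform_cell_shape and uniform_cell_size, there aren't any strong relationships between the variables of our dataset, and none that separates the classes on its own.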