
Relationships between variables

We will now look at a scatterplot matrix to see the relationships between some of these variables. A scatterplot matrix is a very useful plot, because it can tell us whether a linear classifier is likely to work well on our data, or whether we have to investigate more complicated methods.

We will call the scatter_matrix function on our data and set figsize=(18, 18) to make the plot easier to read.
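As a rough illustration of the call described above, the following sketch uses scatter_matrix from pandas.plotting. The DataFrame here is synthetic stand-in data, not the book's dataset: the column names come from the text, but the values, the sample size, and the 2/4 class labels are assumptions made only so that the example runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic stand-in data (an assumption; replace with the real DataFrame).
rng = np.random.default_rng(0)
n = 200
cell_size = rng.integers(1, 11, size=n)                      # scores from 1 to 10
cell_shape = np.clip(cell_size + rng.integers(-1, 2, size=n), 1, 10)

df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, size=n),
    "uniform_cell_size": cell_size,
    "uniform_cell_shape": cell_shape,                        # built to track cell size
    "class": rng.choice([2, 4], size=n),                     # placeholder labels
})

# One scatterplot per pair of columns, with histograms on the diagonal.
axes = scatter_matrix(df, figsize=(18, 18))

# In an interactive session, matplotlib.pyplot.show() would display the figure.
```

The function returns a grid of matplotlib axes, one per pair of columns, so individual panels can be adjusted afterwards if needed.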

The output, as shown in the following screenshot, indicates the relationship between each variable and every other variable:

All of the variables are listed on both the x and the y axes. Along the diagonal, where a variable intersects with itself, we can see the histograms that we saw previously.

In the block indicated by the mouse cursor in the preceding screenshot, we can see that there is a pretty strong linear relationship between uniform_cell_shape and uniform_cell_size. This is expected. Looking through the rest of the preceding screenshot, we can see that some other pairs of variables also have a good linear relationship. If we look at our classifications, however, there's no easy way to separate the classes using these relationships.
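A visual impression of a linear relationship can be checked numerically with a correlation matrix. The sketch below uses DataFrame.corr(), which computes Pearson correlation by default. The data is synthetic and was built so that the two cell columns move together, so the printed numbers say nothing about the real dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (an assumption; replace with the real DataFrame).
rng = np.random.default_rng(0)
n = 200
cell_size = rng.integers(1, 11, size=n)
cell_shape = np.clip(cell_size + rng.integers(-1, 2, size=n), 1, 10)

df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, size=n),
    "uniform_cell_size": cell_size,
    "uniform_cell_shape": cell_shape,
})

# Pearson correlation between every pair of columns; values near 1 or -1
# indicate a strong linear relationship, values near 0 a weak one.
corr = df.corr()
print(corr.round(2))

size_vs_shape = corr.loc["uniform_cell_size", "uniform_cell_shape"]
```

On the real data, the same call would put a number on each block of the scatterplot matrix.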

In the class row of the preceding screenshot, we can see that 4 is the malignant classification. We can also see that there are cells scored anywhere from 1 to 10 on clump_thickness that were still classified as malignant.
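The same observation can be read off a small summary table instead of the plot. This sketch groups a tiny hand-made DataFrame by class and reports the range of clump_thickness in each group. The ten rows are invented for illustration, and the label 2 for the other class is an assumption; the text only states that 4 is malignant.

```python
import pandas as pd

# Hand-made illustrative rows (not real data).
df = pd.DataFrame({
    "clump_thickness": [1, 3, 5, 8, 10, 2, 4, 1, 10, 6],
    "class":           [4, 2, 4, 4, 4, 2, 2, 2, 4, 2],
})

# Minimum, maximum, and row count of clump_thickness within each class.
spread = df.groupby("class")["clump_thickness"].agg(["min", "max", "count"])
print(spread)
```

If one class covers the whole 1 to 10 range, as class 4 does in these invented rows, that variable alone cannot separate the classes.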

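Thus, apart from a few pairs such as uniform_cell_shape and uniform_cell_size, we come to the conclusion that there aren't any strong relationships between the variables of our dataset, and none that separates the classes on its own.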
