Let's get started with data visualization by plotting a histogram for each variable. The steps in the preceding section are important, because we need to understand the dataset if we want to use machine learning accurately and effectively. Otherwise, we're shooting in the dark, and we might spend time on a method that doesn't need to be investigated. We will use matplotlib's pyplot module (conventionally imported as plt) to draw a figure containing the histograms of our dataset, and we will increase the figure size to make them easier to see.
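A minimal sketch of this step might look like the following. It assumes the data is held in a pandas DataFrame called df; the placeholder values, the column names, and the figure size of 10 by 10 inches are illustrative assumptions, not the actual dataset or the original settings.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display; remove in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Placeholder data standing in for the real dataset (an assumption for illustration):
# integer scores from 1 to 10 for two features, plus a two-valued class column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, size=200),
    "chromatin": rng.integers(1, 11, size=200),
    "class": rng.choice([2, 4], size=200),
})

# DataFrame.hist draws one histogram per numeric column on a shared figure;
# figsize (width, height in inches) enlarges the figure so the panels are readable.
axes = df.hist(figsize=(10, 10))
plt.tight_layout()  # reduce overlap between panel titles and tick labels
```

In an interactive session, calling plt.show() after this displays the figure. With the real dataset, df would come from the loading step in the preceding section instead of the random placeholder values.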
We can see the output in the following screenshot:
As you can see, most of the preceding histograms have the majority of their data at around 1, with some data at slightly higher values. Each histogram, apart from class, has at least one case where the value is 10. The histogram for clump thickness is fairly evenly distributed, while the histogram for chromatin is concentrated toward the left, at the lower values.