
  • Big Data Analysis with Python
  • Ivan Marin Ankit Shukla Sarang VK
  • 350 words
  • 2021-06-11 13:46:38

Visualization with Pandas

Pandas can be thought of as a Swiss Army knife for data, and one thing a data scientist always needs when analyzing data is a way to visualize it. We will go into detail later on the kinds of plots that we can apply in an analysis. For now, the idea is to show how to make quick-and-dirty plots directly from pandas.

The plot function can be called directly on a DataFrame selection, allowing fast visualizations. A scatter plot can be created by using Matplotlib and passing data from the DataFrame to the plotting function. Now that we know the tools, let's focus on the pandas interface for data manipulation. This interface is so powerful that it is replicated by other projects that we will see in this course, such as Spark. We will explain the plot components and methods in more detail in the next chapter.

You will see how to create graphs that are useful for statistical analysis in the next chapter. Focus here on the mechanics of creating plots from pandas for quick visualizations.
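
As a rough illustration of the two approaches described above, the following sketch plots a line chart straight from a DataFrame selection and then builds a scatter plot by handing DataFrame columns to Matplotlib. The DataFrame, its column names (day, a, b), its values, and the output file names are all made up for this example; the Agg backend is selected only so that the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Made-up illustrative values, not real measurements.
df = pd.DataFrame(
    {
        "day": pd.date_range("2021-01-01", periods=5, freq="D"),
        "a": [1.0, 2.5, 2.0, 4.0, 3.5],
        "b": [0.5, 1.0, 1.5, 2.5, 2.0],
    }
)

# Quick line plot called directly on the DataFrame.
line_ax = df.plot(x="day", y="a", title="a over time")

# Scatter plot created with Matplotlib, passing DataFrame columns as the data.
fig, scatter_ax = plt.subplots()
scatter_ax.scatter(df["a"], df["b"])
scatter_ax.set_xlabel("a")
scatter_ax.set_ylabel("b")

# Save both figures; in a notebook the plots would normally display inline instead.
line_ax.figure.savefig("quick_line.png")
fig.savefig("quick_scatter.png")
```

In a Jupyter notebook, the backend line and the savefig calls can usually be dropped, since figures are rendered inline.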

Activity 3: Plotting Data with Pandas

To finish up our activity, let's redo all the previous steps and plot graphs with the results, as we would do in a preliminary analysis:

  1. Use the RadNet DataFrame that we have been working with.
  2. Fix all the data type problems, as we saw before.
  3. Create a plot filtered by Location, selecting the city of San Bernardino and one radionuclide, with the date on the x-axis and the I-131 radionuclide on the y-axis:

    Figure 1.17: Plot of Location with I-131

  4. Create a scatter plot with the concentration of two related radionuclides, I-131 and I-132:

Figure 1.18: Plot of I-131 and I-132
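
The mechanics of the steps above might be sketched as follows. This is not the book's solution: it uses a tiny synthetic stand-in for the RadNet DataFrame, and the column names (Location, Date Collected, I-131, I-132), the date format, the placeholder strings for missing readings, and every value are assumptions made for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import pandas as pd

# Synthetic stand-in for the RadNet data. Column names and values are assumptions.
radnet = pd.DataFrame(
    {
        "Location": ["San Bernardino", "San Bernardino", "San Bernardino", "Anchorage"],
        "Date Collected": ["03/20/2011", "03/21/2011", "03/22/2011", "03/20/2011"],
        "I-131": ["0.010", "0.025", "Non-detect", "0.005"],
        "I-132": ["0.008", "0.020", "0.015", "ND"],
    }
)

# Step 2: fix the data types. Dates become datetimes; concentrations become
# floats, with any non-numeric placeholder coerced to NaN.
radnet["Date Collected"] = pd.to_datetime(radnet["Date Collected"], format="%m/%d/%Y")
for col in ["I-131", "I-132"]:
    radnet[col] = pd.to_numeric(radnet[col], errors="coerce")

# Step 3: filter by Location, then plot I-131 against the collection date.
san_bernardino = radnet[radnet["Location"] == "San Bernardino"]
line_ax = san_bernardino.plot(x="Date Collected", y="I-131")

# Step 4: scatter plot of the two radionuclides, using only rows where both exist.
both = radnet.dropna(subset=["I-131", "I-132"])
scatter_ax = both.plot.scatter(x="I-131", y="I-132")
```

Coercing with errors="coerce" is one convenient way to handle placeholder strings; replacing known placeholders explicitly before converting would make the cleaning step easier to audit.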

Note

The solution for this activity can be found on page 203.

We are getting a bit ahead of ourselves here with the plotting, so we don't need to worry about the details of the plot or how to set titles, labels, and so on. The important takeaway is that we can plot directly from the DataFrame for quick analysis and visualization.
