
Types of Graphs and When to Use Them

Every analysis, whether on small or large datasets, involves a descriptive statistics step, where the data is summarized and described by statistics such as mean, median, percentages, and correlation. This step is commonly the first step in the analysis workflow, allowing a preliminary understanding of the data and its general patterns and behaviors, providing grounds for the analyst to formulate hypotheses, and directing the next steps in the analysis. Graphs are powerful tools to aid in this step, enabling the analyst to visualize the data, create new views and concepts, and communicate them to a larger audience.
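As a minimal sketch of what such a descriptive step can look like, the following snippet computes the four kinds of statistics mentioned above with NumPy. The `hours` and `scores` arrays and the passing threshold of 60 are invented for illustration and are not part of any dataset used in this text.

```python
import numpy as np

# Hypothetical sample (illustration only): hours studied and exam scores
# for six students.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
scores = np.array([52.0, 55.0, 61.0, 70.0, 74.0, 90.0])

mean_score = scores.mean()                   # arithmetic mean
median_score = np.median(scores)             # middle value of the sorted scores
pct_passing = 100.0 * np.mean(scores >= 60)  # percentage at or above the assumed threshold
corr = np.corrcoef(hours, scores)[0, 1]      # Pearson correlation between the two variables

print("mean:", mean_score)
print("median:", median_score)
print("percentage passing:", round(pct_passing, 1))
print("correlation:", round(corr, 3))
```

Numbers like these give a first impression of the data, but they say little about its shape, which is where a graph helps.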

There is a vast literature on visualizing statistical information. The classic book Envisioning Information, by Edward Tufte, demonstrates beautiful and useful examples of how to present information in graphical form. In another book, The Visual Display of Quantitative Information, Tufte enumerates a few qualities that a graph used for analyzing and communicating information, including statistics, should have:

  • Show the data
  • Avoid distorting what the data has to say
  • Make large datasets coherent
  • Serve a reasonably clear purpose—description, exploration, tabulation, or decoration

Graphs must reveal information. We should keep these principles in mind whenever we create graphs for an analysis.

A graph should also be able to stand on its own, outside the analysis. Suppose you are writing an analysis report that grows long, and you now need to summarize it. A graph can make the points of the analysis clear, but it has to support the summary without the full report behind it. For the graph to carry that information by itself, we have to add context to it, such as a title and labels.

Exercise 8: Plotting an Analytical Function

In this exercise, we will create a basic plot using the Matplotlib library to visualize the relationship between two variables, y = f(x), where f(x) = x^2:

  1. First, create a new Jupyter notebook and import all the required libraries:

    %matplotlib inline
    import pandas as pd
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt

  2. Now, let's generate a dataset using the following code:

    # 100 evenly spaced points on the interval [-50, 50]
    x = np.linspace(-50, 50, 100)
    # Square each point: y = x^2
    y = np.power(x, 2)

  3. Use the following command to create a basic graph with Matplotlib:

    plt.plot(x, y)

    The output is as follows:

    Figure 2.1: Basic plot of X and Y axis

  4. Now, modify the data generation function from x^2 to x^3, keeping the same interval of [-50, 50], and recreate the line plot:

    # Cube each point: y_hat = x^3
    y_hat = np.power(x, 3)
    plt.plot(x, y_hat)

    The output is as follows:

Figure 2.2: Basic plot of X and Y axis
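One way to see why the shape changes, offered here as a small side check rather than as part of the exercise, is to compare the symmetry of the two curves numerically. The sketch below assumes the same `x` interval as the steps above. The names `is_even` and `is_odd` are introduced only for this illustration.

```python
import numpy as np

x = np.linspace(-50, 50, 100)
y = np.power(x, 2)
y_hat = np.power(x, 3)

# The grid is symmetric around zero, so reversing an array pairs f(x) with f(-x).
is_even = np.allclose(y, y[::-1])          # x^2: f(-x) == f(x)
is_odd = np.allclose(y_hat, -y_hat[::-1])  # x^3: f(-x) == -f(x)

print("x^2 is mirror-symmetric about the y axis:", is_even)
print("x^3 flips sign across the y axis:", is_odd)
print("range of x^2:", y.min(), "to", y.max())
print("range of x^3:", y_hat.min(), "to", y_hat.max())
```

Both checks should come out `True`. The parabola never goes below zero, while the cubic takes negative values for negative x, which is the difference between the two plots.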

As you can see, the shape of the function changed, as expected. The basic type of graph that we used was sufficient to see the change between the y and y_hat values. But some questions remain. We plotted only a mathematical function, whereas the data that we collect generally has physical dimensions, such as length, time, and mass. How can we add this information to the plot? How do we add a title? Let's explore this in the next section.
