
Types of Graphs and When to Use Them

Every analysis, whether on small or large datasets, involves a descriptive statistics step, where the data is summarized and described by statistics such as mean, median, percentages, and correlation. This step is commonly the first step in the analysis workflow, allowing a preliminary understanding of the data and its general patterns and behaviors, providing grounds for the analyst to formulate hypotheses, and directing the next steps in the analysis. Graphs are powerful tools to aid in this step, enabling the analyst to visualize the data, create new views and concepts, and communicate them to a larger audience.
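As a small illustration of this descriptive step, the following sketch computes a mean, a median, a percentage, and a correlation with NumPy. The dataset (hours studied and exam scores for six students) and the passing threshold of 60 are made up for the example and do not come from any real source:

```python
import numpy as np

# Hypothetical sample data, invented for illustration only
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
scores = np.array([52.0, 58.0, 61.0, 70.0, 74.0, 95.0])

# Central tendency of the scores
mean_score = np.mean(scores)
median_score = np.median(scores)

# Percentage of students at or above an assumed passing mark of 60
pct_passing = 100.0 * np.mean(scores >= 60)

# Pearson correlation between hours studied and score
corr = np.corrcoef(hours, scores)[0, 1]

print(f"mean={mean_score:.2f}, median={median_score:.2f}")
print(f"passing={pct_passing:.1f}%, correlation={corr:.3f}")
```

Here the mean is noticeably higher than the median because of the single high score of 95. This is the kind of pattern that a summary table can hide and that a plot makes obvious.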

There is a vast body of literature on visualizing statistical information. The classic book Envisioning Information, by Edward Tufte, demonstrates beautiful and useful examples of how to present information in graphical form. In another book, The Visual Display of Quantitative Information, Tufte enumerates a few qualities that a graph used for analysis and for transmitting information, including statistics, should have:

  • Show the data
  • Avoid distorting what the data has to say
  • Make large datasets coherent
  • Serve a reasonably clear purpose—description, exploration, tabulation, or decoration

Graphs must reveal information. We should keep these principles in mind whenever we create graphs for an analysis.

A graph should also be able to stand on its own, outside the analysis. Suppose you are writing an analysis report that becomes extensive, and you now need to create a summary of it. A graph can make the analysis's points clear by representing the data, but it must support the summary without the full analysis behind it. For the graph to carry enough information to stand on its own in the summary, we have to add more information to it, such as a title and labels.

Exercise 8: Plotting an Analytical Function

In this exercise, we will create a basic plot using the Matplotlib library to visualize the relationship between two variables, y = f(x), where f(x) = x^2:

  1. First, create a new Jupyter notebook and import all the required libraries:

    %matplotlib inline
    import pandas as pd
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt

  2. Now, let's generate a dataset using the following code:

    x = np.linspace(-50, 50, 100)

    y = np.power(x, 2)

  3. Use the following command to create a basic graph with Matplotlib:

    plt.plot(x, y)

    The output is as follows:

    Figure 2.1: Basic plot of X and Y axis

  4. Now, modify the data generation function from x^2 to x^3, keeping the same interval of [-50,50] and recreate the line plot:

    y_hat = np.power(x, 3)

    plt.plot(x, y_hat)

    The output is as follows:

Figure 2.2: Basic plot of X and Y axis

As you can see, the shape of the function changed, as expected. The basic type of graph that we used was sufficient to see the change between the y and y_hat values. But some questions remain: we plotted only a mathematical function, whereas the data that we collect generally has dimensions, such as length, time, and mass. How can we add this information to the plot? How do we add a title? Let's explore this in the next section.
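To compare the two curves from the exercise directly, one option is to draw them side by side in a single figure. The following is a minimal sketch rather than part of the original exercise. It assumes Matplotlib's `plt.subplots` interface, and the names `fig`, `ax_left`, and `ax_right` are chosen here for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

# Same data as in the exercise
x = np.linspace(-50, 50, 100)
y = np.power(x, 2)       # x squared
y_hat = np.power(x, 3)   # x cubed

# One figure with two axes in a single row, so each curve keeps its own y scale
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))
ax_left.plot(x, y)
ax_right.plot(x, y_hat)
```

Separate axes are used because the values of x^3 are far larger than those of x^2 on this interval. Drawn on a shared y axis, the parabola would appear almost flat.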
