Types of Graphs and When to Use Them

Every analysis, whether on small or large datasets, involves a descriptive statistics step, where the data is summarized and described by statistics such as mean, median, percentages, and correlation. This step is commonly the first step in the analysis workflow, allowing a preliminary understanding of the data and its general patterns and behaviors, providing grounds for the analyst to formulate hypotheses, and directing the next steps in the analysis. Graphs are powerful tools to aid in this step, enabling the analyst to visualize the data, create new views and concepts, and communicate them to a larger audience.
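As a small illustration of the descriptive statistics step described above, the following sketch computes a mean, a median, a percentage, and a correlation with NumPy. The height and weight values are hypothetical and invented only for this example; the variable names are likewise assumptions, not part of the original text.

```python
import numpy as np

# Hypothetical sample (assumed for illustration): heights in cm and weights in kg.
heights = np.array([150.0, 160.0, 165.0, 170.0, 180.0, 195.0])
weights = np.array([50.0, 58.0, 63.0, 68.0, 80.0, 95.0])

# Central tendency of a single variable.
mean_height = heights.mean()
median_height = np.median(heights)

# Percentage of observations above a threshold (the mean of a boolean mask is a proportion).
pct_tall = 100.0 * np.mean(heights > 165.0)

# Pearson correlation between the two variables.
corr = np.corrcoef(heights, weights)[0, 1]

print(f"mean height: {mean_height:.1f} cm, median height: {median_height:.1f} cm")
print(f"taller than 165 cm: {pct_tall:.0f}%")
print(f"height/weight correlation: {corr:.3f}")
```

Numbers like these summarize the data, while a graph of the same two variables would show how the individual points are distributed around that summary.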

There is a vast literature on visualizing information, including statistical information. Edward Tufte's classic book, Envisioning Information, presents beautiful and useful examples of how to present information in graphical form. In another book, The Visual Display of Quantitative Information, Tufte enumerates a few qualities that any graph used for analysis and for communicating information, including statistics, should have:

  • Show the data
  • Avoid distorting what the data has to say
  • Make large datasets coherent
  • Serve a reasonably clear purpose—description, exploration, tabulation, or decoration

Above all, graphs must reveal information. We should keep these principles in mind whenever we create graphs for an analysis.

A graph should also be able to stand on its own, outside the analysis. Suppose you are writing an analysis report that grows extensive, and you now need to summarize it. A graph can represent the data and make the main points of the analysis clear, but it must support the summary without the full analysis behind it. To let the graph carry that information by itself, we have to add context to it, such as a title and labels.

Exercise 8: Plotting an Analytical Function

In this exercise, we will create a basic plot using the Matplotlib library, visualizing the relationship between two variables, y = f(x), where f(x) = x^2:

  1. First, create a new Jupyter notebook and import all the required libraries:

    %matplotlib inline
    import pandas as pd
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt

  2. Now, let's generate a dataset of 100 evenly spaced x values in the interval [-50, 50] and their squares, using the following code:

    x = np.linspace(-50, 50, 100)
    y = np.power(x, 2)

  3. Use the following command to create a basic graph with Matplotlib:

    plt.plot(x, y)

    The output is as follows:

    Figure 2.1: Basic plot of X and Y axis

  4. Now, change the data generation function from x^2 to x^3, keeping the same interval of [-50, 50], and recreate the line plot:

    y_hat = np.power(x, 3)
    plt.plot(x, y_hat)

    The output is as follows:

    Figure 2.2: Basic plot of X and Y axis
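The difference in shape between the two plots can also be checked numerically. The following is a minimal sketch that is not part of the original exercise: it reuses the same x, y, and y_hat arrays and tests whether each curve is mirrored across the y-axis (as x^2 is) or changes sign with x (as x^3 does). The names is_even and is_odd are introduced here only for illustration.

```python
import numpy as np

# Same data as in the exercise: 100 points on [-50, 50].
x = np.linspace(-50, 50, 100)
y = np.power(x, 2)
y_hat = np.power(x, 3)

# The grid is symmetric around zero, so reversing an array pairs f(x) with f(-x).
is_even = np.allclose(y, y[::-1])          # x^2: f(-x) == f(x)
is_odd = np.allclose(y_hat, -y_hat[::-1])  # x^3: f(-x) == -f(x)

print("x^2 is mirrored across the y-axis:", is_even)
print("x^3 changes sign with x:", is_odd)
print("smallest x^2 value:", y.min(), "| smallest x^3 value:", y_hat.min())
```

A check like this confirms what the plots show at a glance: the parabola never drops below zero, while the cubic curve takes negative values for negative x.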

As you can see, the shape of the function changed, as expected. The basic type of graph that we used was sufficient to see the change between the y and y_hat values. But some questions remain: we plotted only a mathematical function, but generally the data that we are collecting has dimensions, such as length, time, and mass. How can we add this information to the plot? How do we add a title? Let's explore this in the next section.
