Body Mass Index

The Body Mass Index (BMI) is defined as a person's weight in kilograms divided by the square of their height in meters:

Figure 2.28: Expression for BMI
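
Written out from the definition above, with weight in kilograms and height in meters, the expression can be sketched as follows (the notation is chosen here for illustration and may differ from the original figure):

```latex
\mathrm{BMI} = \frac{\text{weight}\ [\mathrm{kg}]}{\left(\text{height}\ [\mathrm{m}]\right)^{2}}
```

For example, a person weighing 70 kg with a height of 1.75 m has a BMI of 70 / 1.75² ≈ 22.9 kg/m².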

BMI is a widely used way to classify people as underweight, healthy weight, overweight, or obese, based on tissue mass (muscle, fat, and bone) and height. The following plot indicates the relationship between weight and height for the various categories:

Figure 2.29: Body Mass Index categories (source: https://en.wikipedia.org/wiki/Body_mass_index)

According to the preceding plot, we can build the four categories (underweight, healthy weight, overweight, and obese) based on the BMI values:

def get_bmi_category(bmi):
    """Return the BMI category for a given BMI value."""
    if bmi < 18.5:
        category = "underweight"
    elif bmi < 25:
        category = "healthy weight"
    elif bmi < 30:
        category = "overweight"
    else:
        category = "obese"
    return category

# compute BMI category
preprocessed_data["BMI category"] = (
    preprocessed_data["Body mass index"].apply(get_bmi_category)
)
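
To make the thresholds concrete, here is a minimal, self-contained sketch that computes BMI from weight and height and then classifies it. The `compute_bmi` helper and the sample weight/height pairs are hypothetical and are not part of the dataset; the category thresholds are the ones used above.

```python
def compute_bmi(weight_kg, height_m):
    """Return weight in kilograms divided by the square of height in meters."""
    return weight_kg / height_m ** 2


def get_bmi_category(bmi):
    """Classify a BMI value using the thresholds 18.5, 25, and 30."""
    if bmi < 18.5:
        return "underweight"
    elif bmi < 25:
        return "healthy weight"
    elif bmi < 30:
        return "overweight"
    return "obese"


# Hypothetical (weight in kg, height in m) pairs, for illustration only
people = [(50, 1.75), (70, 1.75), (85, 1.75), (100, 1.75)]
for weight_kg, height_m in people:
    bmi = compute_bmi(weight_kg, height_m)
    print(f"{weight_kg} kg, {height_m} m -> BMI {bmi:.1f} "
          f"({get_bmi_category(bmi)})")
```

Note that each lower bound is inclusive: a BMI of exactly 18.5 counts as healthy weight, and a BMI of exactly 30 counts as obese.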

We can plot the number of entries for each category:

# plot number of entries for each category
plt.figure(figsize=(10, 6))
sns.countplot(data=preprocessed_data, x='BMI category',
              order=["underweight", "healthy weight",
                     "overweight", "obese"],
              palette="Set2")
plt.savefig('figs/bmi_categories.png', format='png', dpi=300)

The following is the output of the preceding code:

Figure 2.30: BMI categories

We can see that there are no entries in the underweight category, and that the data is almost uniformly distributed among the remaining three categories. This is an alarming indicator, as more than 60% of the employees are either overweight or obese.
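
A share such as the one quoted above can be read off directly rather than estimated from the bar heights. The following is a minimal sketch using a small, made-up DataFrame in place of `preprocessed_data`; the toy values are hypothetical and do not reproduce the real counts.

```python
import pandas as pd

# Hypothetical toy data standing in for preprocessed_data["BMI category"]
toy = pd.DataFrame({"BMI category": [
    "healthy weight", "overweight", "obese", "obese", "healthy weight",
    "overweight", "obese", "overweight", "healthy weight", "obese"]})

order = ["underweight", "healthy weight", "overweight", "obese"]

# Fraction of entries per category; categories with no entries get 0
shares = (toy["BMI category"]
          .value_counts(normalize=True)
          .reindex(order, fill_value=0.0))
print((shares * 100).round(1))

# Combined share of the overweight and obese categories
combined = shares["overweight"] + shares["obese"]
print(f"overweight or obese: {combined:.0%}")
```

Applied to the real data, the same two lines on `preprocessed_data["BMI category"]` would give the exact percentages behind the plot.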

Now, let's check how the BMI categories relate to the reason for absence. More precisely, we would like to see how many employees fall into each combination of BMI category and reason for absence. This can be done with the following code:

# plot BMI categories vs Reason for absence
plt.figure(figsize=(10, 16))
ax = sns.countplot(data=preprocessed_data,
                   y="Reason for absence", hue="BMI category",
                   hue_order=["underweight", "healthy weight",
                              "overweight", "obese"],
                   palette="Set2")
ax.set_xlabel("Number of employees")
plt.savefig('figs/reasons_bmi.png', format='png', dpi=300)

The output will be as follows:

Figure 2.31: Absence reasons, based on BMI category

Unfortunately, no clear pattern arises from the preceding plot. In other words, for each reason for absence, the employees are spread (almost) equally across the BMI categories.
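
One way to check this impression numerically is a row-normalized contingency table, which shows, for each reason, the fraction of entries in each BMI category. The sketch below uses made-up data; the reason codes and category assignments are hypothetical and serve only to show the call.

```python
import pandas as pd

# Hypothetical toy data; reason codes and categories are for illustration only
toy = pd.DataFrame({
    "Reason for absence": [23, 23, 23, 28, 28, 28, 13, 13],
    "BMI category": ["healthy weight", "overweight", "obese",
                     "healthy weight", "obese", "obese",
                     "overweight", "healthy weight"],
})

# normalize="index" makes every row (reason) sum to 1
shares = pd.crosstab(toy["Reason for absence"], toy["BMI category"],
                     normalize="index")
print(shares.round(2))
```

If BMI category had no bearing on the reason for absence, the rows of this table would look roughly alike, up to noise for reasons with few entries.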

We can also investigate the distribution of absence hours for the different BMI categories:

# plot distribution of absence time, based on BMI category
plt.figure(figsize=(8, 6))
sns.violinplot(x="BMI category",
               y="Absenteeism time in hours",
               data=preprocessed_data,
               order=["healthy weight", "overweight", "obese"])
plt.savefig('figs/bmi_hour_distribution.png', format='png')

The output will be as follows:

Figure 2.32: Absence time in hours, based on the BMI category

As we can observe from Figure 2.31 and Figure 2.32, there is no evidence that BMI and obesity levels influence the employees' absenteeism.
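
The visual comparison can be complemented with per-category summary statistics. The following is a minimal sketch on made-up data; the hours are hypothetical and do not reflect the real distribution.

```python
import pandas as pd

# Hypothetical toy data standing in for preprocessed_data
toy = pd.DataFrame({
    "BMI category": ["healthy weight", "healthy weight",
                     "overweight", "overweight",
                     "obese", "obese"],
    "Absenteeism time in hours": [2, 8, 3, 8, 4, 8],
})

# Count, mean, and median of absence hours for each BMI category
summary = (toy.groupby("BMI category")["Absenteeism time in hours"]
           .agg(["count", "mean", "median"]))
print(summary)
```

Similar means and medians across the categories would be consistent with the conclusion above, although a formal statistical test would be needed to quantify it.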
