
Body Mass Index

The Body Mass Index (BMI) is defined as a person's weight in kilograms divided by the square of their height in meters:

$$\mathrm{BMI} = \frac{\text{weight (kg)}}{\left(\text{height (m)}\right)^2}$$

Figure 2.28: Expression for BMI
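
As a quick worked example of the definition above, the following sketch computes the BMI for a single person. The function name compute_bmi and the sample values (70 kg, 1.75 m) are illustrative assumptions and are not taken from the dataset. Note that a height recorded in centimeters would have to be divided by 100 first.

```python
def compute_bmi(weight_kg, height_m):
    """Return weight in kilograms divided by the squared height in meters."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# illustrative values only: 70 kg and 1.75 m
bmi = compute_bmi(70, 1.75)
print(round(bmi, 2))
```

With these sample values, the result is roughly 22.9, which falls into the healthy weight range used later in this section.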

BMI is a widely used measure for classifying people as underweight, healthy weight, overweight, or obese, based on tissue mass (muscle, fat, and bone) and height. The following plot indicates the relationship between weight and height for the various categories:

Figure 2.29: Body Mass Index categories (source: https://en.wikipedia.org/wiki/Body_mass_index)

According to the preceding plot, we can build the four categories (underweight, healthy weight, overweight, and obese) based on the BMI values:

def get_bmi_category(bmi):
    """Return the BMI category for a given BMI value."""
    if bmi < 18.5:
        category = "underweight"
    elif 18.5 <= bmi < 25:
        category = "healthy weight"
    elif 25 <= bmi < 30:
        category = "overweight"
    else:
        category = "obese"
    return category

# compute BMI category
preprocessed_data["BMI category"] = (
    preprocessed_data["Body mass index"].apply(get_bmi_category)
)
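
The same binning can also be expressed without a row-wise function. The following is a minimal sketch that uses pd.cut with the thresholds from get_bmi_category. The BMI values are made up for illustration, and the variable names bmi_values and categories are assumptions.

```python
import numpy as np
import pandas as pd

# illustrative BMI values, not taken from the dataset
bmi_values = pd.Series([17.0, 18.5, 24.9, 25.0, 29.9, 30.0])

# right=False makes each bin include its left edge and exclude its right edge,
# which matches the comparisons used in get_bmi_category
categories = pd.cut(
    bmi_values,
    bins=[0, 18.5, 25, 30, np.inf],
    right=False,
    labels=["underweight", "healthy weight", "overweight", "obese"],
)
print(categories.tolist())
```

One difference to keep in mind: values outside the bins (for example, negative or missing values) become NaN here, whereas get_bmi_category always returns one of the four labels.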

We can plot the number of entries for each category:

# plot number of entries for each category
plt.figure(figsize=(10, 6))
sns.countplot(data=preprocessed_data, x='BMI category',
              order=["underweight", "healthy weight",
                     "overweight", "obese"],
              palette="Set2")
plt.savefig('figs/bmi_categories.png', format='png', dpi=300)

The following is the output of the preceding code:

Figure 2.30: BMI categories

We can see that the underweight category has no entries, while the data is almost uniformly distributed among the remaining three categories. This is an alarming indicator, as more than 60% of the employees are either overweight or obese.
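
A percentage such as the one above can be checked numerically instead of being read off the plot. The following sketch shows one possible way, using value_counts with normalize=True. The toy DataFrame is made up for illustration and does not reproduce the dataset; on the real data, preprocessed_data would take its place.

```python
import pandas as pd

# toy data for illustration only, not the values from the dataset
toy = pd.DataFrame({
    "BMI category": ["healthy weight", "overweight", "obese",
                     "obese", "overweight"],
})

# share of entries per category (fractions that sum to 1)
shares = toy["BMI category"].value_counts(normalize=True)

# combined share of the two categories of interest;
# fill_value=0 covers a category that does not occur at all
overweight_or_obese = shares.reindex(["overweight", "obese"],
                                     fill_value=0).sum()
print(shares)
print(f"overweight or obese: {overweight_or_obese:.0%}")
```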

Now, let's check how the different BMI categories are related to the reason for absence. More precisely, we would like to count the employees for each combination of BMI category and reason for absence. This can be done with the following code:

# plot BMI categories vs Reason for absence
plt.figure(figsize=(10, 16))
ax = sns.countplot(data=preprocessed_data,
                   y="Reason for absence", hue="BMI category",
                   hue_order=["underweight", "healthy weight",
                              "overweight", "obese"],
                   palette="Set2")
ax.set_xlabel("Number of employees")
plt.savefig('figs/reasons_bmi.png', format='png', dpi=300)

The output will be as follows:

Figure 2.31: Absence reasons, based on BMI category

Unfortunately, no clear pattern arises from the preceding plot. In other words, for each reason for absence, the different BMI categories are represented by an (almost) equal number of employees.
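
Bar heights can be hard to compare across reasons with very different totals, so a row-normalized table may help when judging whether the categories are balanced. The following is a minimal sketch using pd.crosstab with normalize="index". The toy data is invented for illustration; on the real data, the two columns of preprocessed_data would be passed instead.

```python
import pandas as pd

# toy data for illustration only, not the values from the dataset
toy = pd.DataFrame({
    "Reason for absence": [1, 1, 1, 1, 2, 2, 2, 2],
    "BMI category": ["healthy weight", "overweight", "obese", "obese",
                     "healthy weight", "healthy weight", "overweight",
                     "obese"],
})

# normalize="index" turns each row into fractions that sum to 1,
# that is, the share of each BMI category within one reason for absence
shares = pd.crosstab(toy["Reason for absence"], toy["BMI category"],
                     normalize="index")
print(shares.round(2))
```

Rows with similar fractions across all reasons would support the observation that no clear pattern is present.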

We can also investigate the distribution of absence hours for the different BMI categories:

# plot distribution of absence time, based on BMI category
plt.figure(figsize=(8, 6))
sns.violinplot(x="BMI category",
               y="Absenteeism time in hours",
               data=preprocessed_data,
               order=["healthy weight", "overweight", "obese"])
plt.savefig('figs/bmi_hour_distribution.png', format='png')

The output will be as follows:

Figure 2.32: Absence time in hours, based on the BMI category
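
The violin plot can be complemented with a small numeric summary per category. The following sketch assumes the same column names as above and uses invented toy values, so the numbers it prints say nothing about the actual dataset.

```python
import pandas as pd

# toy data for illustration only, not the values from the dataset
toy = pd.DataFrame({
    "BMI category": ["healthy weight", "healthy weight",
                     "overweight", "overweight",
                     "obese", "obese"],
    "Absenteeism time in hours": [2, 8, 3, 8, 4, 8],
})

# number of entries, mean, and median of the absence hours per category
summary = (toy.groupby("BMI category")["Absenteeism time in hours"]
              .agg(["count", "mean", "median"]))
print(summary)
```

Similar means and medians across the categories would be consistent with distributions that look alike in the plot.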

As Figure 2.31 and Figure 2.32 show, there is no evidence that BMI and obesity levels influence the employees' absenteeism.
