- The Data Analysis Workshop
- Gururajan Govindan Shubhangi Hora Konstantin Palagachev
- 2021-06-18 18:18:26
Body Mass Index
The Body Mass Index (BMI) is defined as a person's weight in kilograms divided by the square of their height in meters:

BMI = weight (kg) / (height (m))^2
Figure 2.28: Expression for BMI
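As a minimal sketch of the expression in Figure 2.28, the helper below computes BMI from a weight in kilograms and a height in meters. The function name `compute_bmi` and the sample values are illustrative assumptions and are not part of the original analysis, which reads the BMI from an existing column.

```python
def compute_bmi(weight_kg, height_m):
    """Return weight (kg) divided by the square of height (m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# hypothetical example: a person weighing 70 kg who is 1.75 m tall
example_bmi = compute_bmi(70, 1.75)
print(round(example_bmi, 2))
```

If height is recorded in centimeters, it would need to be divided by 100 before being passed to this helper.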
BMI is a universal way to classify people as underweight, healthy weight, overweight, and obese, based on tissue mass (muscle, fat, and bone) and height. The following plot indicates the relationship between weight and height for the various categories:

Figure 2.29: Body Mass Index categories (source: https://en.wikipedia.org/wiki/Body_mass_index)
According to the preceding plot, we can build the four categories (underweight, healthy weight, overweight, and obese) based on the BMI values:
"""
define function for computing the BMI category, based on BMI value
"""
def get_bmi_category(bmi):
    if bmi < 18.5:
        category = "underweight"
    elif bmi >= 18.5 and bmi < 25:
        category = "healthy weight"
    elif bmi >= 25 and bmi < 30:
        category = "overweight"
    else:
        category = "obese"
    return category
# compute BMI category
preprocessed_data["BMI category"] = (
    preprocessed_data["Body mass index"].apply(get_bmi_category)
)
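As an alternative sketch (not the approach used in this section), the same thresholds can be applied in one vectorized step with `pd.cut`. The series `bmi_values` below is hypothetical toy data chosen to sit on the category boundaries; in the actual analysis, the input would be the "Body mass index" column.

```python
import numpy as np
import pandas as pd

# hypothetical BMI values, chosen to hit each category boundary
bmi_values = pd.Series([17.0, 18.5, 24.9, 25.0, 29.9, 30.0])

# right=False makes each bin closed on the left: [18.5, 25), [25, 30), ...
# which matches the comparisons used in get_bmi_category
bmi_categories = pd.cut(
    bmi_values,
    bins=[-np.inf, 18.5, 25, 30, np.inf],
    right=False,
    labels=["underweight", "healthy weight", "overweight", "obese"],
)
print(bmi_categories.tolist())
```

With `right=False`, a BMI of exactly 25 falls into "overweight" and a BMI of exactly 30 into "obese", which agrees with the `if`/`elif` chain above.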
We can plot the number of entries for each category:
# plot number of entries for each category
plt.figure(figsize=(10, 6))
sns.countplot(data=preprocessed_data, x='BMI category',
              order=["underweight", "healthy weight",
                     "overweight", "obese"],
              palette="Set2")
plt.savefig('figs/bmi_categories.png', format='png', dpi=300)
The following is the output of the preceding code:

Figure 2.30: BMI categories
We can see that there are no entries in the underweight category, and that the data is almost uniformly distributed among the remaining three categories. This is an alarming indicator, as more than 60% of the entries correspond to employees who are either overweight or obese.
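The share of entries per category can also be read off numerically instead of from the plot. The following is a minimal sketch on hypothetical toy data; `toy_categories` stands in for the "BMI category" column, and its values are made up for illustration, so the printed percentage is not a result from the real dataset.

```python
import pandas as pd

# hypothetical stand-in for preprocessed_data["BMI category"]
toy_categories = pd.Series(
    ["healthy weight", "overweight", "obese", "obese", "overweight"]
)

# fraction of entries in each category
shares = toy_categories.value_counts(normalize=True)

# combined share of the overweight and obese categories
overweight_or_obese = (
    shares.reindex(["overweight", "obese"], fill_value=0).sum()
)
print(f"{overweight_or_obese:.0%}")
```

Using `reindex` with `fill_value=0` keeps the sum well defined even if one of the categories has no entries.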
Now, let's check how the different BMI categories are related to the reason for absence. More precisely, we would like to see how many employees there are based on their body mass index and their reason for absence. This can be done with the following code:
# plot BMI categories vs Reason for absence
plt.figure(figsize=(10, 16))
ax = sns.countplot(data=preprocessed_data,
                   y="Reason for absence", hue="BMI category",
                   hue_order=["underweight", "healthy weight",
                              "overweight", "obese"],
                   palette="Set2")
ax.set_xlabel("Number of employees")
plt.savefig('figs/reasons_bmi.png', format='png', dpi=300)
The output will be as follows:

Figure 2.31: Absence reasons, based on BMI category
Unfortunately, no clear pattern arises from the preceding plot. In other words, for each reason for absence, an (almost) equal number of employees with different body mass indexes are present.
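One way to make this comparison less dependent on reading bar heights is a normalized contingency table. The sketch below uses `pd.crosstab` on hypothetical toy data; the reason labels "A" and "B" and all values are invented for illustration, and the real analysis would pass the corresponding columns of `preprocessed_data`.

```python
import pandas as pd

# hypothetical toy data; the real analysis would use preprocessed_data
toy_data = pd.DataFrame({
    "Reason for absence": ["A", "A", "B", "B", "B", "A"],
    "BMI category": ["obese", "overweight", "obese",
                     "healthy weight", "overweight", "healthy weight"],
})

# rows: reasons; columns: BMI categories; each row sums to 1
reason_by_bmi = pd.crosstab(
    toy_data["Reason for absence"],
    toy_data["BMI category"],
    normalize="index",
)
print(reason_by_bmi.round(2))
```

Rows with similar proportions across the BMI columns would be consistent with the absence of a clear pattern.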
We can also investigate the distribution of absence hours for the different BMI categories:
# plot distribution of absence time, based on BMI category
plt.figure(figsize=(8,6))
sns.violinplot(x="BMI category",
               y="Absenteeism time in hours",
               data=preprocessed_data,
               order=["healthy weight", "overweight", "obese"])
plt.savefig('figs/bmi_hour_distribution.png', format='png')
The output will be as follows:

Figure 2.32: Absence time in hours, based on the BMI category
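A numeric summary can complement the violin plot. The following sketch groups absence hours by BMI category and reports the count, mean, and median per group. The frame `toy_hours` is hypothetical toy data with invented values; only the column names are taken from the code above.

```python
import pandas as pd

# hypothetical toy data with the same column names as preprocessed_data
toy_hours = pd.DataFrame({
    "BMI category": ["healthy weight", "healthy weight",
                     "overweight", "overweight", "obese", "obese"],
    "Absenteeism time in hours": [2, 8, 3, 8, 4, 8],
})

# count, mean, and median of absence hours within each BMI category
hours_summary = (
    toy_hours
    .groupby("BMI category")["Absenteeism time in hours"]
    .agg(["count", "mean", "median"])
)
print(hours_summary)
```

Similar means and medians across categories would point in the same direction as overlapping violins, although neither is a formal statistical test.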
As we can observe from Figure 2.31 and Figure 2.32, there is no evidence that BMI and obesity levels influence the employees' absenteeism.