HR Analytics Overview: Absenteeism Project

This data analysis project was designed to showcase the concept of simulating a daily required task from the HR department to query and visualize the data on human resources with the wireframe, including the lifestyle of the employees—are they smokers or drinkers? to get the bonus program from the company.

Requirements from HR departments

  • Provide a list of Healthy Individuals & Low Absenteeism for the healthy bonus program (Total Budget: $1000 USD)
  • Calculate a Wage Increase or annual compensation for Non-Smokers (Insurance Budget: $983,221)
  • Create a Dashboard for HR to understand Absenteeism at work based on the approved wireframe.

Employee data can be found in the CSV files located in the my repository here: HR-dataset-project

Project Details

Start by creating a join table to combine all relevant datasets and ensure accurate employee information.

Scrollable Code Block

Input [1]:

SELECT * 
FROM Absenteeism_at_work a
LEFT JOIN compensation b
ON a.ID = b.ID
LEFT JOIN Reasons c
ON a.Reason_for_absence = c.Number;

Output [1]:

IDReason for absenceMonth of absenceDay of the weekSeasonsTransportation expenseDistance from Residence to WorkService timeAgeWork load Average/dayHit targetDisciplinary failureEducationSonSocial drinkerSocial smokerPetWeightHeightBody mass indexAbsenteeism time in hoursIDcomp_hrNumberReason
126731289361333239554970121019017230413526unjustified absence
2073111813185023955497111100981783102490Unkown
323741179511838239554970101008917031234723medical consultation
477512795143923955497012110681682444517Diseases of the eye and adnexa
...........................................................................
7371173123511143726460493031001881722947372611Diseases of the digestive system
73800311181413402712199501110898170340738480Unknown
739004223135143927121995012102100170350739420Unknown
74000631794514532712199510100177175250740550Unknown
740 rows × 25 columns


From this, we find the list of employees who are healthy individuals and have low absenteeism for the bonus program. With the assumption that healthy individuals are non-smokers and non-drinkers, their BMI is less than 25, and their absence time must be under the average of the list.

Input [2]:

SELECT * FROM Absenteeism_at_work a
LEFT JOIN compensation b
ON a.ID = b.ID
LEFT JOIN Reasons c
ON a.Reason_for_absence = c.Number
WHERE Social_drinker = 0 
AND Social_smoker = 0
AND Body_mass_index < 25
AND Absenteeism_time_in_hours < (SELECT AVG(Absenteeism_time_in_hours) FROM Absenteeism_at_work)
ORDER BY Absenteeism_time_in_hours;

Output [2]:

#IDReason for absenceMonth of absenceDay of the weekSeasonsTransportation expenseDistance from Residence to WorkService timeAgeWork load Average/dayHit targetDisciplinary failureEducationSonSocial drinkerSocial smokerPetWeightHeightBody mass indexAbsenteeism time in hoursIDcomp_hrNumberReason
152092422526928241476921110026916924052420Unknown
22170543388159503788849211000076178240217420Unknown
353101024225269282848539111100269169240531250Unknown
45591312641792693028054998030000561711915595113Diseases of the musculoskeletal system and connective tissue
..............................................................................
1085402311642252692826851993011002691692445404723Medical Consultation
10931511044179269302650178803000056171195315471Certain Infectious and Parasitic Diseases
11031671044179269302650178803000056171195316377Diseases of the Eye and Adnexa
1113592314222526928330061100011002691692453594923Medical Consultation
111 rows × 26 columns


Here we have the list of company employees who have a healthy lifestyle that matches the selection criteria for the bonus program. The output comes with 111 employees.

Requirement 2

Let’s move on to the next requirement from HR: calculating a wage increase or annual compensation for the insurance budget of $983,221 for all non-smokers

Note: the requirement is to calculate a wage increase or annual compensation for the insurance budget of $983,221 for ALL NON-SMOKER, it shall not be the same list as the previous requirement.

Firstly, query the number of employees who are nonsmokers:

Input [3]:

SELECT COUNT(*) AS count_nonsmoker 
FROM Absenteeism_at_work
WHERE Social_smoker = 0;

Output [3]:

count_nonsmoker
686

Calculate the total number of hours nonsmokers worked in a year if they worked 8 hours per day, 5 days a week, for 52 weeks: 686 x 8 x 5 x 52 = 1426880 (hours)

Next, calculate how much the hourly wage of a nonsmoker increases as the company's insurance budget increases by $983,221: 983221/1426880 = 0.68907 ($/hour)

Let's do it backward to see the cost in total for an employee per year after the company increases the insurance budget: 0.68907 x 8 x 5 x 52 = 1,433.26 ($/year)

The rise of 0.689 dollar per hour in an employee's hourly income may seem substantial, but a mere $1433 payment for a nonsmoker employee's insurance will have a significant impact on wellness and employee incentives. That's just too tiny for the company to encourage employees to get a better lifestyle for the next bonus program.

Requirement 3

Before creating dashboard sections, as you can see from the two previous requirements, we query the data by using SELECT * all the time, which makes sense for people who just check out for the first time and want complete information. For some columns that we don't need in PowerBI, optimize the query again using:

Input [4]:

SELECT 
	a.ID, 
	c.Reason,
	a.Month_of_absence,
	a.Body_mass_index,
CASE
	WHEN a.Body_mass_index < 18.5 THEN 'Underweight'
	WHEN a.Body_mass_index BETWEEN 18.5 AND 25 THEN 'Healthy'
	WHEN a.Body_mass_index BETWEEN 25 AND 30 THEN 'Overweight'
	WHEN a.Body_mass_index > 40 THEN 'Obese'
	ELSE 'Unknown'
END AS BMI_Caretory,
CASE 
	WHEN Month_of_absence IN (12,1,2) THEN 'Winter'
	WHEN Month_of_absence IN (3,4,5) THEN 'Spring'
	WHEN Month_of_absence IN (6,7,8) THEN 'Summer'
	WHEN Month_of_absence IN (9,10,11) THEN 'Fall'
	ELSE 'Unknown'
END AS seasons_name,
	Seasons,
	Month_of_absence,
	Day_of_the_week,
	Transportation_expense,
	Education,
	Son,
	Social_drinker,
	Social_smoker,
	Pet,
	Disciplinary_failure,
	Age,
	Work_load_Average_day,
	Absenteeism_time_in_hours
FROM Absenteeism_at_work a
LEFT JOIN compensation b
ON a.ID = b.ID
LEFT JOIN Reasons c
ON a.Reason_for_absence = c.Number;

And we will use it to create a dashboard using PowerBI with a frame approved by HR departments here.

The wireframe

This is an alt text.

And check my final report here in PowerBI.

Insights and Recomendation from the data

Insight 1

By filtering the Count of Reason, it is simple to identify the 4 most common reasons, all of which are related to the employee's health: 385/740, or 52.02%, that employees mentioned when they asked for absence permission.

Furthermore, if we include all the health-related factors, the percentage could be more than 70%, and it has a significant impact on the employee's performance throughout the week, not only at the beginning of the week.

This is an alt text.

Recomendation:

  • For this reason, HR departments pay attention to the employee's health in their workplace and have a plan to provide insurance for employees by analyzing more employee's data to have a specific policy.

Insight 2

As the report shows in the graph of Sum of Absenteeism time in hour by Day of the week, amount of employees's absence time is highest on Monday of the week and decreases steadily until the end of the week. That's a lot of sense in reality, and there are a lot of factors that could have contributed to this, such as employees unfortunately getting into an unexpected situation for their health reasons (Insight 1) where they couldn't go to work at the beginning of the week, or in another scenario, employees may spend much time on their entertainment activities at the previous weekend.

This is an alt text.

Recomendation:

  • Adjusting or reducing the working time to adapt the employee's reality schedule by adjusting from 8 a.m-5 p.m to 10 a.m - 7 p.m or reducing working hours from 8 a.m - 5 p.m to 10 a.m - 5 p.m, will guarantee employees get high performance at the beginning of the week.
  • Establishing policies that encourage employees to reduce their absence time at the beginning of the week, like providing more employees's welfare on Monday, Tuesday, and Wednesday.