Developing an HRIS Data Coding Scheme
Data quality, or “fitness for use,” is of paramount importance in a human resources information system (HRIS). Quality can be compromised if the data include duplicates, unnecessary variations, misspellings or omissions. If the data in the system are inaccurate, incomplete or out of date, the reports generated by the system will be of no value. Furthermore, if reports based on poor-quality data are made public, they may undermine trust in the entire HRIS. (For more information about data quality, please see “Data Quality Considerations in Human Resource Information System (HRIS) Strengthening” (PDF), included in the HRIS Strengthening Implementation Toolkit.)
To prevent problems during data entry, the data should be categorized in a consistent, standard way. Data coding is the process of classifying data in preparation for later analysis. Before any data collection takes place, stakeholders and HRIS managers should spend time developing a coding structure to organize the data in the system. This data-coding structure should reflect the way the health system is organized in real life, such as how districts are organized within a country and how jobs are organized within a facility.
If existing workforce data are available for immediate entry, taking the time to develop a data-coding scheme may be perceived as an obstacle to forward momentum. However, an initial investment in organizing a system of data entry requires fewer resources and less time than does retrospectively correcting data-entry errors. In addition, data entered according to a standardized coding scheme are more likely to be immediately useful for creating reports and drawing comparisons at the facility and district level.
Why Is a Standard Data-Coding Scheme Necessary?
To understand why creating an HRIS data-coding scheme is such a high priority, picture what the HRH database would look like if all of the information for two categories, such as Job Title and Department, were typed directly into a spreadsheet. Imagine that three people who hold the same job in a single healthcare facility (Nursing Officer in the Obstetrics and Gynecology Department) are asked to record their job titles and department names on a data collection form. It is possible that they would write down this information in three different ways, as in the example below.
| Unique ID # | Employee Name | Job Title | Department Name |
| --- | --- | --- | --- |
| 101 | Nurse A | Nursing Off.– OBGYN | OBGYN |
| 102 | Nurse B | Nursing Officer | Obs. & Gyn. |
| 103 | Nurse C | Registered Nurse Officer | Obstetrics and Gynecology |
Now suppose that all the health workers at the facility, even those who hold the same job, enter their information using different versions of the same job titles and department names. Perhaps some of the data entries are misspelled as well, while others are entered using unclear abbreviations. Technically, the data in the spreadsheet would be accurate, since each of the health workers wrote down a job title or department name that represented his or her job at the facility. However, because the data entry clerks did not have a standard way to code the data, the information in the dataset is difficult to analyze. For example, a healthcare decision maker could not run a report to quickly find the number of Nursing Officers at the facility.
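The reporting problem described above can be illustrated with a minimal Python sketch. It assumes the three records from the example table are stored as dictionaries; the key names (`job_title`, `department`) and the function `count_by_job_title` are hypothetical and are not part of any particular HRIS product.

```python
from collections import Counter

# The three records from the example table, exactly as they were written down.
raw_records = [
    {"id": 101, "name": "Nurse A", "job_title": "Nursing Off.– OBGYN", "department": "OBGYN"},
    {"id": 102, "name": "Nurse B", "job_title": "Nursing Officer", "department": "Obs. & Gyn."},
    {"id": 103, "name": "Nurse C", "job_title": "Registered Nurse Officer", "department": "Obstetrics and Gynecology"},
]


def count_by_job_title(records):
    """Count records per job title using exact string matching."""
    return Counter(record["job_title"] for record in records)


title_counts = count_by_job_title(raw_records)

# Only one record matches the standard title, although all three people hold the job.
print(title_counts["Nursing Officer"])
# Number of distinct spellings the report would treat as separate jobs.
print(len(title_counts))
```

A report built on exact matching treats every spelling as a separate job, so the count for any single title understates the real number of staff.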
Creating a standard way to organize health workforce data allows users of the system to easily aggregate data about a specific variable. For example, after the implementation of an HRIS data-coding scheme, the information about the Nursing Officers at the facility would be entered into the spreadsheet as follows:
| Unique ID # | Employee Name | Job Title | Department Name |
| --- | --- | --- | --- |
| 101 | Nurse A | Nursing Officer | Obstetrics and Gynecology |
| 102 | Nurse B | Nursing Officer | Obstetrics and Gynecology |
| 103 | Nurse C | Nursing Officer | Obstetrics and Gynecology |
Use of a standardized data-coding scheme enables the system to automatically find how many Nursing Officers work in the Obstetrics and Gynecology Department at this facility and display that information in a report.
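As a rough sketch of how such a report could be produced once the data are coded consistently, the Python example below counts staff per department and job title. The record layout and the function name `staffing_report` are assumptions made for illustration only.

```python
from collections import Counter

# The same three employees, entered with the standard coded values.
coded_records = [
    {"id": 101, "job_title": "Nursing Officer", "department": "Obstetrics and Gynecology"},
    {"id": 102, "job_title": "Nursing Officer", "department": "Obstetrics and Gynecology"},
    {"id": 103, "job_title": "Nursing Officer", "department": "Obstetrics and Gynecology"},
]


def staffing_report(records):
    """Count staff for each (department, job title) pair."""
    return Counter((r["department"], r["job_title"]) for r in records)


report = staffing_report(coded_records)

# Print one report line per department and job title.
for (department, job_title), count in sorted(report.items()):
    print(f"{department} | {job_title} | {count}")
```

Because every record uses the same coded values, a simple grouping is enough; no manual cleanup of spellings is needed before the count.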
In addition, a standard classification system enables HRIS users to more easily compare information between facilities and districts. Classification systems that conform to international standards even allow for comparisons between different countries.
Creating a Data Dictionary
A data dictionary, also referred to as a codebook, is a written record of all of the codes used in the database and how they correspond to the data. Establish the data dictionary when the data-coding scheme is created. The data dictionary should be as clear and explicit as possible so that someone who has no knowledge of the codes can easily look up each value and find out what it means.
In the data dictionary, log all of the different data points that correspond to each value, including additional names, alternative spellings and abbreviations. Update the data dictionary every time a new value is added or an alternate name for a value is discovered.
For instance, the data dictionary entry for the Nursing Officer example may read as follows:
| Category/Field | Value | Alternatives |
| --- | --- | --- |
| Job Title | Nursing Officer | Nursing Off. – OBGYN, Registered Nurse Officer |
The data dictionary may also include information about the different levels of a job. For example, is a Nurse Assistant I a more senior job than a Nurse Assistant II, or are these jobs at the same level but with different responsibilities? Any clarifications that may be important for data entry or useful during data analysis should be included in the data dictionary.
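One possible way to put a data dictionary to work is as a lookup that maps every logged alternative to its standard value. The Python sketch below is illustrative only: the nested-dictionary layout and the helper names (`standardize`, `add_alternative`) are assumptions, and the entry shown is the Nursing Officer example from this section.

```python
# Field -> standard value -> logged alternatives (illustrative structure).
DATA_DICTIONARY = {
    "Job Title": {
        "Nursing Officer": ["Nursing Off. – OBGYN", "Registered Nurse Officer"],
    },
}


def _key(text):
    """Ignore case and extra whitespace so trivial variations still match."""
    return " ".join(text.split()).casefold()


def build_lookup(dictionary, field):
    """Map every standard value and alternative to its standard value."""
    lookup = {}
    for standard, alternatives in dictionary[field].items():
        for name in [standard, *alternatives]:
            lookup[_key(name)] = standard
    return lookup


def standardize(dictionary, field, raw_value):
    """Return the standard value, or None if the raw value is not yet logged."""
    return build_lookup(dictionary, field).get(_key(raw_value))


def add_alternative(dictionary, field, standard, alternative):
    """Log a newly discovered alternate name under its standard value."""
    alternatives = dictionary[field].setdefault(standard, [])
    if alternative not in alternatives:
        alternatives.append(alternative)


# A logged alternative resolves to the standard value.
print(standardize(DATA_DICTIONARY, "Job Title", "registered nurse officer"))

# This spelling differs from the logged one (no space before the dash).
unlisted = "Nursing Off.– OBGYN"
print(standardize(DATA_DICTIONARY, "Job Title", unlisted))

# After the dictionary is updated, the same spelling resolves correctly.
add_alternative(DATA_DICTIONARY, "Job Title", "Nursing Officer", unlisted)
print(standardize(DATA_DICTIONARY, "Job Title", unlisted))
```

Returning `None` for an unrecognized value, rather than guessing, leaves the decision to a person, who can then update the dictionary as the text recommends.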
How Do We Create a Data-Coding Scheme Without Prior HRH Information?
Most healthcare systems collect some type of HR information for workforce tracking and payroll management. While it seems unlikely that any country would need to create an HR coding scheme completely from scratch, in some cases (such as destruction of paper personnel files or loss of an HR database that was not backed up) it may be necessary to create an HRH database based on a very limited amount of initial information.
To create an HRIS data-coding scheme, begin by brainstorming the types of information that a healthcare stakeholder would need to have in order to make good decisions about the health workforce. For example, for payroll purposes, the stakeholder would need to know the names and addresses of employees. To make good staffing choices, the stakeholder would need to know about types of training, certification and licensure. To manage the workforce, the stakeholder would need to know what jobs are needed, as well as information about departments and facilities.
Once each of these categories, or fields (e.g., Employee Surname, Employee Address, Facility Name, Job Title, Department, etc.), has been identified, think about which values would belong in each category. These values will be listed in the drop-down menus of the data entry forms.
You will be able to list all of the values for a few categories. For example, the category Marital Status will only have a few values in its drop-down menu: Single, Married, Domestic Partner, Divorced, Widowed, and possibly Nun/Clergy. The category Facility Name will also have a finite number of values, consisting of the names of all of the facilities in the district or country.
For a few of the categories, such as Employee Surname or Employee Address, so many possible values exist that it does not make sense to use a drop-down menu. These fields should be left blank on the data entry form. Data entry clerks will have to type a new value into each cell, rather than select a value from a drop-down menu.
The third and largest group of categories, such as Job Title and Department Name, will require a list of values for the drop-down menu in order to maintain data quality and consistency. However, creating a complete list of these values for a drop-down menu requires a strong knowledge of the healthcare system and input from HRIS stakeholders. To generate lists of values for these fields, it may be useful to refer to the resources listed at the end of this brief. While these resources may be valuable in the beginning stages of creating a coding system, determining country-specific values will require some research. A survey of all of the jobs in chosen local health facilities should provide a clearer picture of the types of job titles, department names, etc. that need to be included in the data-coding scheme. Input from key HRIS stakeholders is essential during this stage of coding scheme development.
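The distinction between fields with a coded list of values and free-text fields can be sketched as a simple validation step at data entry. The Python example below is a minimal illustration under stated assumptions: the Marital Status values come from the text above, while the Job Title and Department Name lists contain only the single example value used in this brief and would need to be filled in through the research described here. The function `validate_record` is hypothetical.

```python
# Fields whose values come from a drop-down list (coded values).
CLOSED_LIST_FIELDS = {
    "Marital Status": ["Single", "Married", "Domestic Partner", "Divorced", "Widowed"],
    # Placeholder lists: real values require stakeholder input and a facility survey.
    "Job Title": ["Nursing Officer"],
    "Department Name": ["Obstetrics and Gynecology"],
}

# Fields with too many possible values for a drop-down list; typed by the clerk.
FREE_TEXT_FIELDS = ["Employee Surname", "Employee Address"]


def validate_record(record):
    """Return a list of problems found in one data-entry record."""
    problems = []
    for field, allowed in CLOSED_LIST_FIELDS.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: {value!r} is not in the coded list")
    for field in FREE_TEXT_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field}: a typed value is required")
    return problems


good_record = {
    "Employee Surname": "A",
    "Employee Address": "Example address",
    "Marital Status": "Single",
    "Job Title": "Nursing Officer",
    "Department Name": "Obstetrics and Gynecology",
}
bad_record = dict(good_record, **{"Job Title": "Nursing Off.", "Employee Address": " "})

print(validate_record(good_record))
for problem in validate_record(bad_record):
    print(problem)
```

Rejecting values that are not in the coded list at entry time is one way to apply the scheme; an actual HRIS would normally enforce this through the drop-down menus themselves.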
Resources
The following resources may be useful when creating an HRIS coding scheme: