Whether you’re a novice in the field of people analytics, or a seasoned expert with countless years of experience, the one thing you’ll need to show off your exceptional skillset in data analytics is Data.
I’ve compiled a free Excel Workbook that contains prepopulated, randomized Human Resource (HR) data for the following categories:
- Organizational Structures (Finance, HR, Production, etc.);
- Generic Employee Data – for 300 fictional employees (remuneration data, position data, EE levels, etc.);
- Recruitment data;
- Termination data; &
- Promotion data.
If you want to learn how to generate your own randomized HR data in Excel, in order to test statistical models or theories or just to get accustom to working with HR related data, you can follow along in the tutorial below. I will show you how to make use of the “=randbetween” and “=vlookup” functions in Excel in order to achieve this outcome.
Alternatively, if you would like a copy of the prepopulated template, you can send me an InMail via LinkedIn or simply leave a request in the comment section below.
Disclaimer: All information contained in this template is purely fictional. Should there be any resemblance to anyone you may or may not know – within this template – it is purely coincidental. The template may only be used for educational purposes.
How to Generate Randomized HR Data
In order to create randomized employee data I utilize two formulas within Excel:
- “=randbetween” &
There are various other ways to perform the same task in Excel, but I prefer this method.
=randbetween (=randbetween – “random”+”between”)
The “=randbetween” function in Excel generates random values between a maximum and minimum number. Very similar to that of a die.
A die has a minimum value of 1 and a maximum value of 6. So, at any given time, the die will only be able to provide a random number between 1 and 6 (including 1 and 6)
To create random employee data in Excel, we will assign numbers to specific values within an Excel spreadsheet (I know this might not make sense right now, but bear with me).
Each employee (in our fictional organization) lives within a certain province in the Country. I wish to randomly assign a Province to each employee without the old “copy” & “paste” method.
To do this, I have to create two seperate tables.
In Table 1 we manually assign a number to each Province (from 1 to 9). The order in which the numbers are assigned does not matter as long as each Province has its own number. You cannot assign the same number to different provinces (you will see why, when we discuss vlookups).
In Table 2 we utilize the “=randbetween” formula to randomly generate a number bewteen 1 and 9.
We enter the following formula in the first cell under the “Random_Number” column heading:
The result in our first cell of the “Random_Number” column is the number “4” (yours might generate a different number).
You can either copy and paste the formula in each of the cells of the “Random_Number” column or use the quick-fill option.
After the random number has been generated (in the “Random_Number” column), we make use of a =”vlookup” formula in Table 2, to look for the Province that “belongs” to each number (from Table 1).
With our example, because the result is the number “4”, the corresponding provicne would be “KwaZulu-Natal”.
If you work with Remuneration data, you will most likely be familiar with the Paterson remuneration bands/scales.
In Table 1 we manually assign a number to each Paterson Band (from 1 to 11). Once again, the order in which the numbers are assigned does not matter as long as each Band has its own number. You cannot assign the same number to different Bands.
In Table 2 we utilize the “=randbetween” formula to randomly generate a number between 1 and 11. After the random number has been generated (in the “Random_Number” column), we make use of a =”vlookup” formula in Table 2, to look for the Paterson Band that “belongs” to each number (from Table 1).
Very Important Notes:
- Whenever you change/amend any data within your spreadsheet all “=randbetween” formula’s will generate new values.
- The bottom/lowest value in your “=randbetween” formula cannot be greater than your top/highest value i.e. “=randbetween(9,1)” won’t work.
Stop Random Values from Generating New Values
Random values will continuously generate as longs as the “=randbetween” formula is still active in your spreadsheet.
To prevent this from happening follow theses steps:
- Select all the values in the “Random_Number” column (the column that contains your “=randbetween” formula)
- Click on “Copy” in the top left corner
- Click in “Paste” in top left corner and select the “Values” button (shown as a clipboard with the number 123 on it)
- Once you clicked the “Values” button, the formulas will be replaced by the current value of the cell. The formulas will therefore not generate new values because they have been replaced with a fixed value.
A =”vlookup” function assists in finding corresponding data in different tables (whether in the same sheet or within different sheets or workbooks).
Say for instance you have 2 spreadsheets (see images below)
- Sheet 1 contains the Employee Numbers and Names of 100 employees.
- Sheet 2 contains the Employee Numbers and Surnames of the 100 employees above and the surnames of 50 other employees (that you don’t want to use).
You want to consolidate the 2 spreadsheets by adding the surnames of the 100 employees, to sheet 1. In other words, you want their names and surnames in the same spreadsheet.
You could sort the data according to employee number in each sheet and then manually copy and paste the surnames from the one sheet to the other, but what about the 50 employee surnames that you don’t need. Selecting the 50 surnames one by one and then deleting them wastes a lot of time (and just imagine if you were to work with much larger sets of data).
This is where a “=vlookup” function comes in quite handy.
The “=vlookup” function will consolidate the two sheets by only adding the surnames of the 100 employees in sheet 1 and ignoring the 50 surnames that you don’t need.
So how does this apply to our randomized data?
Let’s look at the Province example (earlier in this post).
Step 1: Create 2 Spreadsheets, one titled “Employees” and the other titled “Randomgen”.
Step 2: In the “Randomgen” sheet add the following details:
Step 3: In the “Employees” sheet add the employee details (as shown below) and remember that under the “Random_Number” column you want to include the “=randbetween(1,9)” formula that we discussed earlier in this post.
Remember, we create the formula “=randbetween(1,9)” in the cell where we want the random number to be displayed (in this case cell C2 to C10).
Your random numbers might not look the same as the image above, because the numbers are random after all.
Now that we’ve created our two spreadsheets, lets consolidate the two by adding the Province data into the “Employee” spreadsheet.
Step 4: (Numbers correspond to images below)
- Select the cell where you want the Province data to be displayed, in this case, cell D2.
- Click on the “fx” button, to insert a function and the function wizard should pop-up.
- Click on the dropdown menu arrow
- Select “VLOOKUP” from the dropdown menu
- Press “OK”
If done correctly, the following Vlookup wizard should pop-up:
So how exactly does it work?
The “vlookup” wizard makes provision for 4 “vlookup” parameters/arguments:
The first block in the Vlookup Wizard is called the “Lookup_Value”. This value is very similar to your own personal ID or Passport Number. If I had your ID or Passport number, then I could find out some information about you. Who you are, how old your are, etc.
In this instance the [Lookup_Value] will be the “Random_Number” column in the “Employees” sheet.
The random numbers in the “Employees” sheet will help us to pull the corresponding Province name from the “Randomgen” sheet.
Our [Lookup_Value] will therefore be “C:C”, meaning the whole of column C in our “Employees” sheet.
In other words, if I type the number “2” in the “Random_Number” column in the “Employees” sheet, then the “=vlookup” formula will display “The Free State” under the “Province_value” column
Now you might be thinking at this stage “WHAT???”, but hang in there. It’ll all make sense.
For a “vlookup” to work, there must at least be one shared value in all the sheets that you want to use.
So, if one sheet contained a “Random_number” column, but the other sheet did not contain a column with the same data as the “Random_number” column, the “=vlookup” function won’t know what to look for. It wont know that the number “1” belongs/corresponds to “The Eastern Cape”.
The [Table_array] refers to the table that contains the data we want.
In this case, the “Randomgen” sheet contains a table of data with the names of Provinces.
We want the names of these provinces to be displayed in the “Employees” sheet.
The province table is therefore our [Table_array]
Our [Table_array] would therefore be:
- This means all the data contained in the Province table, in the “Randomgen” sheet, from Column A to B is the table that we want look in, for information.
If the data we wanted was in Column C, the [Table_array] would be:
- This means all the data contained in the Province table, in the “Randomgen” sheet, from Column A to C is the table that we want look in, for information.
The [Col_index_number] refers to the number of the column that contains the data we want.
- We want the “Province_value” from the “Randomgen” sheet.
- The “Province_value” data is in Column 2 in the “Randomgen” sheet. Don’t get confused about the A, B, C, etc. at the top of each column. In reality each column is still numbered (from left to right).
- Column A’s [col_index_number] is “1”
- Column B’s [col_index_number] is “2”
- Column C’s [col_index_number] is “3”, and so on.
For this example, the “col_index_number” is “2” because the data is contained in column “B” on the “Randomgen” sheet.
The easiest of the 4 parameters to remember.
The value is almost always “0” – zero, meaning false/exact match.
What this tells Excel is that we are looking for an exact match. If our [Lookup_value] is the number “1”, then Excel must provide me with the information/value that corresponds with the number “1”. In the this case, the value will be “The Eastern Cape” as long as your [Range_lookup] value is “o” – zero.
This is what the completed “vlookup” wizard should look like:
Once you’ve complete the vlookup parameters you can click OK and quick-fill the formula to all the other cells.
- Lookup_value – [ C:C ] – of the “Employees” sheet.
- Table_array – [ Randomgen!A:B ] – of the “Randomgen” sheet.
- Col_index_num – [ 2 ] – of the “Randomgen” sheet.
- Range_lookup – [ 0 ]
I hope that this post was both informative and educational.
If you have any suggestions or requests, I would love to hear from you.
Until next time…