Data is everywhere. In your pocket, on your phone, your laptop, your watch, heck, even your car. But what to do with all this information?
The simplest answer, Analyze it. The more complex answer, Analyze it.
Disclaimer: This is not an in-depth post on every aspect of Data Analytics, purely due to the large scope of Data Analytics. My aim with this post is to provide you with a brief introduction to simplified data analytics in Excel.
Data in its simplest form can be described as a collection of information that we intend to use for some form of reference (the actual definition is off course a bit more complex).
If you think about the Banking industry, each bank keeps records of its customers. These records include your name, surname, ID number, bank account number, address and so forth. All of this information becomes data that the bank can analyze or utilize to look for trends in consumer/client behavior or useful pieces of information to enhance their service and/or make other types of improvements.
The data aids in making better decisions.
When these sets of Data begin to grow larger and larger over time, we tend to refer to them as Big Data (they include huge amounts of information which can be gathered from either a single or multiple sources). In the new digital age and the rise of the “Internet of Things“, the analysis of big data has become quite lucrative if you possess the right knowledge and skills.
You can check out this link for more on Big Data (no really, check it out).
Google describes Big Data as: “extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.“
But merely extracting large sets of data is worthless if the data cannot be analyzed in a meaningful way. Your data needs to be analyzed in a way that tells a story. If your data has no context, it loses meaning.
“The majority of raw data, particularly big data, doesn’t offer a lot of value in its unprocessed state. Of course, by applying the right set of tools, we can pull powerful insights from this stockpile of bits.” – Big Data Analytics: Descriptive Vs. Predictive Vs. Prescriptive
So in order to analyze large sets of data, we make use of specific computer based tools/software to assist us with the analysis (as part of a larger process).
We call this “process”: Analytics (or Data Analytics).
Avinash Kaushik (digital marketing evangelist for Google) defines Analytics as: “Data that you use to make smarter decisions about your business”.
Popular Big Data Analytic Tools include (This is by no means an exhaustive list):
- R (programming language)
- Mothur/Qiime (analysis of bio-informatics data – like DNA sciency stuff)
- Python (“simplified” programming language)
- Microsoft Power BI (data visualization and analysis tool)
- SQL & Hadoop (database languages/frameworks)
- and off course my personal favorite – Microsoft Excel (if you don’t have any Microsoft products, Google Sheets is a great alternative…and it’s free)
For the purpose of this 2-part post, I’ll be focusing on Microsoft Excel to analyze Employee Turnover data as an introduction to data analytics. Although the example used in this post relates to Human Resource information, the principles remain universal.
One of the more important aspects of Data Analytics, is the analysis of patterns that emerge from our data. Patterns can help us in making relative predictions (amongst other things) of what might happen in the future, based on patterns from the past or even current trends.
When applying Data Analytics, I prefer to break the process down into 3 main stages:
Data Collection, Data Analysis and Data Visualization
Stage 1: Data Collection
Data Collection can take place though various methods like surveys or questionnaires (which is what we’ll be using), raw data collection from HR Information Systems, time sheets and so forth.
Data collection takes place daily without us even knowing about it, especially if you make use of a smartphone or smart-device (think Google Maps with Traffic warnings, etc.)
During the data collection phase it’s very important to distinguish between the types of data that you will require for your analysis. Collecting the wrong data can delay your Data Analytics initiative.
The Exit Interview Questionnaire (utilized in this 2-part post) collects two types of data: Qualitative and Quantitative Data.
Image 1: Quantitative vs. Qualitative Research: What’s the Difference?
“Qualitative research gathers information that is not in numerical form. For example, diary accounts, open-ended questionnaires, unstructured interviews and unstructured observations. Qualitative data is typically descriptive data and as such is harder to analyze than quantitative data.” – Simplypsychology.org
“Unlike quantitative research, qualitative research is typically unstructured and exploratory in nature. In this case, the researcher is not interested in determining objective statistical conclusions or in testing a hypothesis, but rather in gaining insights about a certain topic. Common qualitative research techniques include focus groups, interviews, and observation.” – Mymarketresearchmethods.com
“Quantitative research gathers data in numerical form which can be put into categories, or in rank order, or measured in units of measurement. This type of data can be used to construct graphs and tables of raw data.” – Simplypsychology.org
“This research aims to objectively measure the topic at hand, using mathematics and statistics. If you are doing quantitative research, you will most likely be analyzing raw data with the help of a spreadsheet software program like Microsoft Excel, or a statistical package like SPSS” – Mymarketresearchmethods.com
If you’re looking for a few tips on data cleaning you can have a look at the following link: Analytics in HR. We won’t need to clean the data that will be utilized in this 2-part post, but it’s quite useful to know, especially if you’ll be using your own sets of data.
Stage 2: Data Analysis (Analytics)
“There are many different levels to analytics, starting with descriptive analytics (reports, dashboards) and diagnostic analytics (drill-downs, ad-hoc queries), which focus on understanding the past. The next three levels turn their attention to the future and are increasingly augmented by AI technology.” – Forbes
“The purpose of Descriptive Analytics is to summarize what happened” – Dr. Michael Wu.
“What is happening now based on incoming data. To mine the analytics, you typically use a real-time dashboard and/or email reports.” – Ingram Micro Advisor
“A look at past performance to determine what happened and why. The result of the analysis is often an analytic dashboard.” – Ingram Micro Advisor
“Predictive Analytics is the next step up in data reduction. It utilizes a variety of statistical, modeling, data mining, and machine learning techniques to study recent and historical data, thereby allowing analysts to make predictions about the future.” “The purpose of predictive analytics is NOT to tell you what will happen in the future. It cannot do that. In fact, no analytics can do that. Predictive analytics can only forecast what might happen in the future, because all predictive analytics are probabilistic in nature.” – Dr. Michael Wu
“An analysis of likely scenarios of what might happen. The deliverables are usually a predictive forecast.” – Ingram Micro Advisor
“Prescriptive Analytics is a type of predictive analytics. It’s basically when we need to prescribe an action, so the business decision-maker can take this information and act.” – Dr. Michael Wu
“This type of analysis reveals what actions should be taken. This is the most valuable kind of analysis and usually results in rules and recommendations for next steps.” – Ingram Micro Advisor
We’ll skip the 5th type for now.
Image 2: Five Types of Data Analytics (Source)
Stage 3: Data Visualization
Data visualization, in its simplest form, is using graphs and other visual means of explaining what our data means.
In the 2nd part of this post we’ll be utilizing a Word format Exit Interview Questionnaire (that measures certain dimensions – more on this in part 2).
The data collected from this questionnaire will be captured in an Excel workbook and presented in the form of graphs (as a central Dashboard).
Word Questionnaire – Dimension B: (link will be provided in part 2)
Excel Dashboard – Dimension B: (link will be provided in part 2)
If you are interested in applying these techniques in practice, stay tuned for the 2nd installment of this 2-part post.
I promise it will contain a lot less theory and a lot more practice 🙂
I’ll be explaining how to utilize:
- Quantitative Data Collection – Via the Exit Interview Questionnaire;
- Descriptive & Diagnostic Data Analytics – By summarizing the data in order to analyze patterns and causes for employee turnover; &
- Data Visualization – By creating graphs that provide context and user-friendly interpretation of your data.
The Questionnaire used for this 2-part post makes provision for the collection of both Quantitative and Qualitative Data (the link to the Questionnaire and Data sheet will be shared in the 2nd post).
We will only be analyzing the Quantitative data for now (in other words, responses such as Always, Sometimes, Never or Excellent, Good, Poor – not the open ended questions).
In order to follow along you will need a basic understanding of the following operations within Excel (which I’ll also explain in the 2nd installment):
- Sorting; &
- the Formula Bar.
We will be making us of the “=countifs” formula quite a bit, but not too worry, we’ll go through it step by step.
The “=countifs” formula will simulate some of the functionality that you would usually receive from making use if PivotTables.
Until Next Time…
Links and Sources:
- http://www.r-project.org/ (R IDE)
- http://www.mothur.org/ (Mothur)
- http://qiime.org/ (Qiime)
- http://www.python.org/ (Python)
- http://powerbi.microsoft.com/en-us/ (Microsoft Power BI)
- http://hadoop.apache.org/ (Hadoop Framework)