My Favorite Statistical Tool - ANOVA
What Is It
ANOVA is my favorite statistical tool because it is a powerful way to analyze large amounts of data while giving you the flexibility to dissect specific parts of it. I have been using it for several years, ever since my statistics days back in college, when I fell in love with it. Sorry to my wife, I have a slight crush on ANOVA. ANOVA is short for Analysis of Variance. On my blog, we don’t talk about the theory behind the math, so we will not go there. Basically, what we are doing with ANOVA is determining whether there is a relationship between the things you can influence and the outcome you are trying to improve. Put a little more precisely, it checks whether the average outcome differs across the settings of each thing you change.
When to Use It
You would use this tool when you are testing a few changes in your process to see whether they affect the result in a beneficial way. You are aiming to optimize: for quality errors, optimizing means minimizing, and for sales dollars, it means maximizing. So basically, you want to see which things make your process better.
How to Use It
Most statistical software packages have this tool available. It is relatively easy to run once you have the software and your data is organized in the correct format. Some of the software packages that can be used are:
Minitab (my personal favorite because of its simplicity and flexibility)
JMP (this tool is pretty good for DOE – Design of Experiments)
Statistica (I like using this one because of the nice graphics)
SAS (gets more complicated to use but very powerful)
R (this one is nice because it is free)
One other little secret: it is even available in Excel within the Analysis ToolPak. However, it is not user-friendly, and the output tends to be cryptic, without the ability to create the really nice charts that the software above offers. Nonetheless, it does produce the results for you.
Primary Elements of ANOVA
I am not going to cover any details that may throw you off; I will only get to the point of what you need to know.
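Response/Main Effects – This is the item that you are trying to optimize. In the example below, this will be represented by Flight Time. (Strictly speaking, statisticians call this the response, and a "main effect" is the impact a single factor has on it, but I will keep the combined label here.)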
Factors – These are the things you want to test to determine whether they impact the response/main effects. You are testing whether they influence what you are trying to optimize at all. In the example below, the factors we will consider are: a) Paper type, b) Rotor length, c) Leg length, d) Leg width, e) Paper clip.
Levels – These are the different values that each factor can take. For this example, I have limited it to 2 levels for each factor. For example, Paper Type will have 2 levels: 1) Light and 2) Heavy. However, in the example in the video, I cover 5 levels for geographical location: 1) East, 2) West, 3) South, 4) North, and 5) Central.
That’s it. Think about those 3 things; there is no need to make this more complex, because it can quickly turn complex.
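If you like to see things in code, here is a minimal Python sketch of how those 3 things fit together for a single factor. It is only an illustration under my own assumptions: the flight times are made-up numbers, the function name one_way_f is hypothetical, and it stops at the F statistic. Statistical software goes one step further and converts that F value into the p-value.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups maps each level of ONE factor to its list of response values.
    Assumes at least two levels and some variation within the levels.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)

    # Variation explained by the factor: how far each level mean sits
    # from the overall mean, weighted by the number of observations.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Leftover variation inside each level (noise).
    ss_within = sum(
        (x - mean(values)) ** 2
        for values in groups.values()
        for x in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Response: flight time in seconds (hypothetical values, not real data).
# Factor: Paper Type. Levels: Light and Heavy.
flight_time = {
    "Light": [2.0, 2.2, 2.4],
    "Heavy": [1.4, 1.6, 1.8],
}

f_stat, df_between, df_within = one_way_f(flight_time)
print(round(f_stat, 2), df_between, df_within)
```

A large F value means the difference between the level averages is big compared with the scatter inside each level, which is what ends up producing a small p-value.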
Walk Through Example
To organize your data, you would have one column labeled response/main effect, and then one column for each factor, labeled accordingly. To ensure the testing is adequate, make sure that every combination of levels is tested. You will need to write the combinations out to be sure, but for this example it would look something like this.
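Writing out every combination by hand gets tedious, so here is a small Python sketch that lists them for you. Treat it as an illustration, not as my actual test plan: the Paper Type and Leg Length levels are the ones used in this post, while the level names for Rotor Length, Leg Width, and Paper Clip are placeholders I am assuming for the sake of the example, and full_factorial is a hypothetical helper name.

```python
from itertools import product

# Two levels per factor. Paper type and leg length levels come from the
# post; the other level names are assumed placeholders.
factors = {
    "paper_type": ["Light", "Heavy"],
    "rotor_length": ["Short", "Long"],   # assumed labels
    "leg_length": [7.5, 12.0],
    "leg_width": ["Narrow", "Wide"],     # assumed labels
    "paper_clip": ["No", "Yes"],         # assumed labels
}

def full_factorial(factors):
    """Return one dict per test run, covering every combination of levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

runs = full_factorial(factors)

print(len(runs))  # 5 factors at 2 levels each gives 2 ** 5 = 32 runs
for run in runs[:3]:
    print(run)
```

Each row would then get a response column filled in with the measured Flight Time for that run.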
For this example, you would ensure that both the Light and Heavy paper types are tested with each of the Leg Lengths, 7.5 and 12.0. In addition, you need to ensure that the combinations are spread across all of the levels. You would accomplish this during your testing or data sampling. The nice part about the software is that it will sort through that information, work through the various combinations, and look at the response/main effects accordingly. After you have performed your testing, put the results into the table and let the software run. You will get a result similar to this. (I apologize for showing you ANOVA results from a different example than the one we are covering, but the table would look the same, only with the elements tied to Flight Time.)
This table may be intimidating, but for our purposes, I would look at the P-Value column (on the far right side) for anything with a value of 0.05 or lower. Those rows are the factors (or 'Source', as it is called here) that do influence the response/main effect. In this ANOVA output, there was one item with a p-value less than 0.05, and that was 'Supplement', so it would be considered a factor that does influence the outcome/response/main effect.
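The rule of thumb above is simple enough to write down as a few lines of Python. This is just a sketch of the screening step: 'Supplement' is the source named in the output above, but the other source names and all of the p-values here are hypothetical placeholders, not the numbers from the table, and significant_sources is a name I made up.

```python
ALPHA = 0.05  # the cutoff used in this post

def significant_sources(p_values, alpha=ALPHA):
    """Return the sources with a p-value at or below alpha, smallest first."""
    hits = [(source, p) for source, p in p_values.items() if p <= alpha]
    return [source for source, _ in sorted(hits, key=lambda item: item[1])]

# Hypothetical p-values for illustration only.
p_values = {
    "Supplement": 0.01,
    "Source B": 0.40,   # placeholder source name
    "Source C": 0.75,   # placeholder source name
}

print(significant_sources(p_values))
```

Anything the function returns is a factor worth paying attention to; anything it leaves out did not show a clear influence in this particular test.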
In addition, the chart of the impact on the response/main effects provides a great visual depiction of what impacted the flight time the most. As can be seen, Paper Type seems to have the most impact, whereas Leg Width doesn’t have much.
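The numbers behind a chart like that are just averages: for each factor, take the average response at each level and see how far apart those averages are. Here is a rough Python sketch of that idea. The four runs and their flight times are invented for illustration (I picked them so Paper Type comes out ahead of Leg Width, as in the chart), the Leg Width level names are assumed, and the function names are hypothetical.

```python
from statistics import mean

def level_means(runs, factor, response="flight_time"):
    """Average response at each level of one factor."""
    by_level = {}
    for run in runs:
        by_level.setdefault(run[factor], []).append(run[response])
    return {level: mean(values) for level, values in by_level.items()}

def effect_range(runs, factor, response="flight_time"):
    """Gap between the highest and lowest level averages for a factor."""
    means = level_means(runs, factor, response)
    return max(means.values()) - min(means.values())

# Made-up runs; flight times in seconds are not real measurements.
runs = [
    {"paper_type": "Light", "leg_width": "Narrow", "flight_time": 2.4},
    {"paper_type": "Light", "leg_width": "Wide", "flight_time": 2.2},
    {"paper_type": "Heavy", "leg_width": "Narrow", "flight_time": 1.6},
    {"paper_type": "Heavy", "leg_width": "Wide", "flight_time": 1.6},
]

factors = ["paper_type", "leg_width"]
# Rank factors from biggest to smallest gap between level averages.
ranking = sorted(factors, key=lambda f: effect_range(runs, f), reverse=True)

for factor in ranking:
    print(factor, level_means(runs, factor), round(effect_range(runs, factor), 2))
```

A steep line in the chart corresponds to a large gap here, and a nearly flat line corresponds to a small one.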
That was my attempt at helping you to at least understand the basics of ANOVA. There are several ways it can get complicated. But if you understand this, you understand about 80% of it. You can now speak about this tool with some level of understanding. That was an introduction to my favorite statistical tool!
My hope is that you run into a situation where you can propose performing an ANOVA and can speak with confidence about what you are looking for.
* Have you ever used ANOVA and had it reveal something profound for you?
* What is your favorite statistical tool?