The statistical model behind control charts for in-control processes assumes a Gaussian process with no autocorrelation (i.e. independent observations), a constant mean, and a constant variance: in other words, a white noise process. The various Western Electric rules look for patterns that are inconsistent with white noise, and so indicate that the process is out of control.
It is quite easy to do a simple experiment to illustrate the flaw in the Western Electric rules. Using some statistical software, generate several columns of Normal “random” numbers. Then apply the Western Electric rules. You will see that most of the columns fail the tests, even though they are by definition “white noise.” For example, I generated 10 columns of n=100 from a Normal distribution, and used Minitab to plot I-MR charts and apply all tests. All 10 columns failed at least one test.
Individuals SPC chart on simulated Gaussian data. It failed four of the Western Electric rules.
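The experiment can be reproduced without Minitab. The following Python sketch generates 10 columns of 100 standard Normal values and checks each column against the four classic Western Electric zone rules. It is an approximation under stated assumptions: it uses the known mean 0 and standard deviation 1 rather than limits estimated from the moving range, and it applies four rules rather than the full set of Minitab tests, so its counts will not match the Minitab results exactly. The function names and the seed are illustrative.

```python
import numpy as np


def window_hits(flags, width, needed):
    """True if some window of `width` consecutive points has at least `needed` flags set."""
    flags = np.asarray(flags, dtype=int)
    if flags.size < width:
        return False
    counts = np.convolve(flags, np.ones(width, dtype=int), mode="valid")
    return bool(np.any(counts >= needed))


def western_electric_violations(x, mean=0.0, sd=1.0):
    """Return the set of zone rules (numbered 1 to 4) that fire anywhere in x.

    Assumption: mean and sd are known (white noise with mean 0, sd 1 by default).
    """
    z = (np.asarray(x, dtype=float) - mean) / sd
    fired = set()
    # Rule 1: a single point beyond 3 sigma, on either side.
    if np.any(np.abs(z) > 3):
        fired.add(1)
    for side in (1.0, -1.0):
        s = side * z  # flip the sign so "above" logic covers both sides
        # Rule 2: 2 of 3 consecutive points beyond 2 sigma on the same side.
        if window_hits(s > 2, 3, 2):
            fired.add(2)
        # Rule 3: 4 of 5 consecutive points beyond 1 sigma on the same side.
        if window_hits(s > 1, 5, 4):
            fired.add(3)
        # Rule 4: 8 consecutive points on the same side of the centreline.
        if window_hits(s > 0, 8, 8):
            fired.add(4)
    return fired


rng = np.random.default_rng(seed=2013)         # arbitrary seed for repeatability
columns = rng.standard_normal(size=(10, 100))  # 10 columns of n=100 white noise
results = [western_electric_violations(col) for col in columns]
failed = sum(1 for r in results if r)

for i, r in enumerate(results, start=1):
    print(f"column {i}: rules fired = {sorted(r) if r else 'none'}")
print(f"{failed} of {len(results)} columns failed at least one rule")
```

Changing the seed or the number of columns gives a feel for how often pure white noise is flagged.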
The Western Electric rules in this experiment conclude that all 10 columns are out of control, or “non random” (whatever that is), even though I generated the data with a so-called random number generator. The rules state (see here for example) that the probability of an out-of-control signal from a process we know to be in control is very small. For example, the probability that eight points in a row will be on the same side of the centreline is (supposed to be) 1/256. What went wrong?
Several things went wrong. The probability the rules are based on is the probability that the next seven (or whatever) future points fall into some particular pattern (for example, they are all above the mean). These are points that have not even happened yet. This is rarely acknowledged when people explain the rules.
This (false) probability is not useful to someone controlling a process. What is useful is the probability that the process is out of control, given the measurements you already have (and your knowledge about the process and how it works).
Statistical Process Control and Six Sigma promoters turn around and pretend that these probabilities are the same. But the probability that you will find such-and-such a pattern on a control chart, using measurements that you already have, is most definitely not equal to the probability that the pattern will occur in the next few measurements. To calculate the probability that particular points that have already been measured indicate an “out-of-control” process, one would have to use a very different procedure from the one used by the Western Electric rules. Our simulation example illustrates this.
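The gap between the two probabilities can be made concrete for the eight-in-a-row rule. The sketch below assumes white noise, so each point falls above or below the centreline with probability 1/2, independently of the others. Under that assumption it computes the exact probability that a run of eight or more same-side points appears somewhere in a series of n points that has already been recorded. The function name `prob_run_somewhere` is hypothetical. For a single window of eight points, the chance that all eight land on one specified side is 1/256, and the chance that they all land on the same side, whichever it is, is 1/128.

```python
def prob_run_somewhere(n, k=8):
    """Probability that n independent, symmetric points contain at least
    one run of k or more consecutive points on the same side of the centreline."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if n < k:
        return 0.0
    # state[r] = probability that no run of length k has occurred so far
    # and the current same-side run has length r (index 0 is unused).
    state = [0.0] * k
    state[1] = 1.0          # after the first point the run length is 1
    hit = 0.0               # probability that a run of k has already occurred
    for _ in range(n - 1):
        new = [0.0] * k
        for r in range(1, k):
            p = state[r]
            new[1] += 0.5 * p           # next point switches side: run restarts
            if r + 1 == k:
                hit += 0.5 * p          # next point extends the run to length k
            else:
                new[r + 1] += 0.5 * p   # next point extends the run
        state = new
    return hit


print("one window of 8 points :", prob_run_somewhere(8))
print("anywhere in 100 points :", prob_run_somewhere(100))
```

For a 100-point column the result is many times larger than the single-window figure, which is why scanning recorded data for the pattern finds it so often.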
The second thing that went wrong is that there are multiple tests on the same data. Each additional test is another chance to raise a “false alarm,” in other words, a signal that the process is out of control when it is not, so the overall false alarm probability is higher than that of any single test.
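A small Monte Carlo sketch shows the effect of stacking tests. It simulates many white noise columns of n=100 and applies just two of the rules: a point beyond three standard deviations, and eight consecutive points on the same side of the centreline. The number of columns, the seed, and the choice of these two rules are assumptions made for illustration, and the known mean 0 and standard deviation 1 are used instead of estimated limits.

```python
import numpy as np

rng = np.random.default_rng(seed=1)          # arbitrary seed for repeatability
n_columns, n_points = 20_000, 100
z = rng.standard_normal(size=(n_columns, n_points))

# Test A: at least one point beyond 3 sigma in the column.
test_a = np.any(np.abs(z) > 3, axis=1)


def has_run_of_eight(flags):
    """For each row, True if 8 consecutive entries of the boolean array are all True."""
    windows = np.lib.stride_tricks.sliding_window_view(flags, 8, axis=1)
    return np.any(np.all(windows, axis=2), axis=1)


# Test B: eight consecutive points on the same side of the centreline.
test_b = has_run_of_eight(z > 0) | has_run_of_eight(z < 0)

# A column is flagged if either test fires.
either = test_a | test_b

print(f"false alarm rate, test A only : {test_a.mean():.3f}")
print(f"false alarm rate, test B only : {test_b.mean():.3f}")
print(f"false alarm rate, A or B      : {either.mean():.3f}")
```

The combined rate can never be lower than either individual rate, and every further rule added to the list pushes it higher still.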
The third thing that went wrong, or that often goes wrong, is the way people interpret the probability. Take the simple rule that says a process is out of control if a measurement goes outside a three standard deviation control limit. The chance of that happening in the next measurement, for a white noise process, is about 0.27%. On average, if we had 10,000 data points, we would expect about 27 to be outside these limits. In another experiment, I generated 100 columns of n=100 Normal (i.e. Gaussian) data, or 10,000 data points and ran that test. There were 27 instances where a point was outside the limit, as expected. These 27 instances came from 24 different columns. It would be tempting to infer that this means that 24 out of the 100 columns were “out of control.” But that is not what it means at all.
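The figures in this experiment are what white noise should produce, as a short calculation shows. The sketch below assumes independent standard Normal points and uses only the Python standard library. The variable names are illustrative.

```python
import math

# Probability that a single white noise point falls outside the 3 sigma limits.
p_point = math.erfc(3 / math.sqrt(2))

# Expected number of such points among 10,000 independent measurements.
expected_points = 10_000 * p_point

# Probability that a column of 100 points contains at least one such point.
p_column = 1 - (1 - p_point) ** 100

# Expected number of flagged columns out of 100.
expected_columns = 100 * p_column

print(f"P(point outside limits)      = {p_point:.5f}")
print(f"expected points in 10,000    = {expected_points:.1f}")
print(f"P(column of 100 has a signal) = {p_column:.3f}")
print(f"expected flagged columns     = {expected_columns:.1f}")
```

Roughly 24 flagged columns out of 100 is therefore the expected outcome for a process that is in control by construction, so the count by itself says nothing about any column being out of control.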
Philip Green and George Gabor are co-authors of misLeading Indicators: How to Reliably Measure Your Business, published by Praeger. www.misleadingindicators.com
© 2013 Greenbridge Management Inc.
Phil,
I used to find this when I looked at the WE rules. However, you need to look at how the limits are calculated. Shewhart designed them to ignore certain types of external variability, so that they don’t describe all the variability in a data set. This is to make you search for those sources, remove them, and improve the process. These external elements of variability don’t exist in random numbers. If you generate random numbers and then calculate the limits using the proper calculation for the S.D. rather than the Shewhart calculations, you should find the WE rules work fine. Don Wheeler is a great read on the history of SPC as Shewhart intended it. He is also much better at explaining these technical issues than I am. The more we discuss issues like this the better, in my view.
Paul, try the experiment with the columns of Normal “random” numbers using a statistical package. Use the proper calculation for standard deviation, and you will get the same result as I describe. The issue is that the WE rules apply to measurements you have not yet taken, not to measurements you have taken. As I stated in the post, “the probability that you will find such-and-such a pattern on a control chart, using measurements that you already have, is most definitely not equal to the probability that the pattern will occur in the next few measurements.”
Phil, I tried the test and repeated it a few times just to see. The first test was 100 random points (mean = 0, S.D. = 1), so the limit zones were simply at 1, 2, and 3 on the run chart.
I saw one out-of-control signal: 7 points on one side of the mean.
The next 2 tests were 200 points with the same distribution, and this time there were no out-of-control points. I wouldn’t consider that too unexpected. I believe that Shewhart never used these rules and didn’t use probability to derive them. He simply placed his limits to minimise false signals; he didn’t expect that there would be no false signals, just a lot fewer false signals than an operator would see using experience alone.
This has certainly made me think though about how to compute limits. I was fairly happy with the Shewhart method until this discussion. I can see now that there would be times when this will give lots of false signals for some processes.
Paul, I think Shewhart intended the rules to be used in real-time by an operator on the line. Now we use computers to go back and “mine” long data series. The rules were not intended for that, do not work for that, and will give, as you say, lots of false signals.
Um, I’m no statistics geek, but I think what’s missing is the context of the processes that are considered in control and those that are not. Using numbers to show a formula’s results misses the real system’s number range, and the formula needs a known range to determine results that apply to it.
E.g. a process that can have a value from 0.0 to 1.0, where the median is 0.5, shows a range and a middle point. This, I would say, is dependent on the process, hence the context, so random numbers cannot be used in a formula that is trying to work out whether processes are in control or not.
Highly intellectual, but it totally misses the system’s context being observed via the formula.