“Why has this performance indicator gone down three months in a row?”
Regular readers of this blog will know the answer to this question already, without any knowledge of the indicator in question, or the work being measured.
“It’s probably just random noise”
“That’s common cause variation”
“3 data points doesn’t say s****”
The words might be different, but the message is the same: chill. There’s not enough in 3 decreasing data points to say anything has changed.
But I was asked that very same exact identical question recently and thought hang on, what are the chances that I’m wrong?
I mean, jumping to conclusions and knee jerk reactions is what I accuse others of aaaaaallll the time. So I thought I’d work it out using all maths and snit.
We have 22 months’ worth of data. It’s a pretty silly indicator that doesn’t really measure anything business critical, but it is still one that people are interested in. So and hence it is on an official scorecard, reported in the usual scorecard way of this month against last month with an associated up or down arrow. You know, binary comparisons ahoy…
The problem with putting an indicator in a table like this is that it chops it all up into individual pieces and compares the latest piece with some other arbitrary number, like a target or the same time last year. It loses EVERYTHING that helps you find out about the thing being measured.
So first, let’s put it all back together again and have a look at the data in context. Let’s open up this baby and see what’s under the hood.
Ahem…. I’ll stick it in a graph.
This is just a bog standard run chart, with the median indicated on it. No control chart AS YET because if you’ve a control chart THEN you’ve got to start explaining stuff about variation, which I just don’t have the energy for and nobody gives a damn either.
But RUN CHARTS, that’s a line graph, it’s from Excel. It’s NOT SCARY. Nothing needs explaining with a run chart other than what’s in it.
And the cool thing about a run chart is you STILL can detect, and more importantly UNdetect, “trends”. (I hate trends. Trends are to Excel what death by bullet point is to PowerPoint: a built-in function that substitutes a default Microsoft-approved solution for ACTUAL thinking.)
“So, tell me Einstein, what’s that run chart tell YOU?” says an imaginary reader.
Remember that question at the very beginning? Here, I’ll shout it at you again…
“Why has this performance indicator gone down three months in a row?”
The question is about the last three going down. Now, I’m going to have to get all mathematical on your ass. Not VERY, cos I can’t be bothered working it all out correctly so nobody will spot all my mistakes.
But take a good look at that run chart again…
(I’m assuming you know about noise and signal, about the world being chock full of noise and only a bit of signal. That there is common cause variation and special cause variation.
Let’s assume you do. (if not, read all around here).)
In a run chart you can tell if there have been any shifts in the underlying process producing this measure. No need for a control chart! This discovery changed my life! Turns out there’s a whole load of knowledge about run charts I’d not been privy to.
A run chart acts as Filter Number One, as Davis Balestracci calls it: find out “Did this process have at least one shift during this time period?”
This question is KEY. It tells you whether you are looking at a stable process. As My New Best Friend Davis says…
This generally is signaled by a clump of eight consecutive points either all above or below the median. If the process did have such a shift, then it makes no sense to do a control chart at this time because the average of all these data doesn’t exist.
Sort of like, “If I put my right foot in a bucket of boiling water and my left foot in a bucket of ice water, on the average, I’m pretty comfortable.”
So there IS no point in looking at a control chart right away, see what the run chart tells you first, see if there is actually ONE process there at all.
This run chart shows there are not 8 consecutive points above or below the median, so it’s all one set of data produced under the same conditions. It’s one system. Same things happening throughout.
So let’s see that thing at the end, the 3 data points in a row that decreased, in the context of a process producing a stable set of data. What’s THAT mean then?
Well, a run of 3 data points all in one direction in a process that is stable? My theory is that it’s just noise. It is common cause variation. The alternative mental model is the one behind the question I was asked: that these 3 decreasing data points mean something, that they are signal. Let’s do a robot wars fight off between these two!
If there WERE just common cause variation in a set of 22 data points, then this means that any of the data points is as likely to go up as it is to go down. There is no signal in any particular movement of any single data point.
This is because randomness is CLUMPY. You don’t expect it, but in random strings of data there will be things that look not random. I.e. in a string of coin flips you will get clumps of heads and clumps of tails. Like this!
Here each T is a tail, and each H is a head. See how there are patterns? As the number of data points increases, the number of seemingly odd patterns increases too. There is more noise as the number of datapoints increases, and as a proportion of it all, the signal reduces as the noise increases.
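You can make your own clumps. This is a minimal sketch, assuming a fair coin and Python’s built-in pseudo-random generator. The seed is arbitrary and the function name is mine. Change the seed and you get a different string, but the clumps keep turning up.

```python
import random
from itertools import groupby

def longest_streak(flips):
    """Length of the longest run of identical symbols in a string of H and T."""
    return max((len(list(group)) for _, group in groupby(flips)), default=0)

rng = random.Random(22)  # arbitrary seed, only there so the run is repeatable
flips = "".join(rng.choice("HT") for _ in range(22))

print(flips)
print("longest streak of the same face:", longest_streak(flips))
```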
This is why people need to understand statistics because they are already using statistics.
Below is a lovely table from that smashing man Davis, and here we can see that for any given number of datapoints there is a minimum and a maximum number of runs that you should expect to see.
In this case there are 22 datapoints, and therefore we should expect to see between 7 and 16 runs. This is expected noise! We should see noise! And we do!
In fact if we were to apply a bit more of a thinking cap, we can work out what are the chances of getting 3 or more datapoints going down in a row.
In a process with simple common cause variation, I’m going to assume (FLASHING RED LIGHT FLASHING RED LIGHT!) that any single data point is as likely to go up as it is to go down, 50% each way. If you want to know, for any dataset where each datapoint has a 50% chance of being one thing or the other, what the chances are of X in a row sharing the same characteristic, it’s just like the flipping coins example. Basically, I want to know, in any set of 22 coin flips, what are the chances of 3 in a row being heads (or tails). This is the same as wanting to know the probability that, in 22 datapoints each with a 50% chance of going up or down, there are 3 in a row going down.
In fact, I’m going to make it HARDER, cos what happens if there were 3 or MORE in a row, not just 3 exactly. What’s the chances of that then?
Turns out, it’s about 82% that there will be 3 or more going down in a row somewhere in my set of 22 datapoints, treating each one as a coin flip. Count 3 or more in a row in either direction and it’s nearer 99%.
ie, odds on. You’d be daft to bet against it!
You can work it out using Excel!
So, there we have it. The answer. Roughly an 82% chance of seeing, by pure random chance, three months in a row of this measure going down.
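If you would rather check the sum than take my word for it, here is a minimal Python sketch of the same coin-flip question, counted exactly rather than simulated. It assumes (FLASHING RED LIGHT again) independent 50/50 flips, and the function names are mine, made up for illustration. The figure depends on which version of the question you ask: downs only or either direction, 22 flips or the 21 movements between 22 datapoints.

```python
def prob_streak(n_flips, streak):
    """Chance of at least one run of `streak` or more heads in n_flips fair, independent flips."""
    # counts[k] = how many sequences so far end in exactly k heads
    # and have never contained a run of `streak` heads.
    counts = [1] + [0] * (streak - 1)
    for _ in range(n_flips):
        tails_next = sum(counts)   # a tail resets the trailing run to 0
        heads_next = counts[:-1]   # a head turns a trailing run of k into k + 1
        counts = [tails_next] + heads_next
    return 1 - sum(counts) / 2 ** n_flips

def prob_streak_either_face(n_flips, streak):
    """Same question, but a run of heads OR a run of tails counts."""
    # Mark each flip after the first as "same as the one before" or not.
    # Those marks are themselves fair coin flips, and `streak` identical faces
    # in a row is the same thing as `streak - 1` "same" marks in a row.
    return prob_streak(n_flips - 1, streak - 1)

print(f"3+ heads in a row, 22 flips:       {prob_streak(22, 3):.1%}")              # 82.0%
print(f"3+ of either face, 22 flips:       {prob_streak_either_face(22, 3):.1%}")  # 98.6%
print(f"3+ downs in a row, 21 movements:   {prob_streak(21, 3):.1%}")              # 80.4%
```

Whichever framing you pick, a run of three is the expected outcome, not a surprise.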
I think I’m in need of an open top bus parade with all that maths.
Now for the actual hard bit….explaining it to someone not systemsy.