Saturday, 27 February 2010

What’s wrong with Average?

“I'm average looking and I'm average inside. I'm an average lover and I live in an average place. You wouldn't know me if you met me face to face. I'm just your average guy.”

These few lines taken from a song by Lou Reed are about being average. We use averages all the time. To name a few, your IQ-score, the batting average of your favourite baseball player, the average time to complete a project. When creating next year’s business plan the expected future revenue is used, the expected budget requirements also. In a cross dock the average productivity of the floor workers is used to estimate the processing times to unload or load the trucks. Plans based on averages however are below projection, behind schedule and beyond budget. That is because of Jensen’s inequality. When using averages for input of the plans or models, we expect to get the average outcome. That holds in only a few cases in practice. The numbers that we average are actually random numbers, for example the duration of unloading a truck. In plans we do not use all possible outcomes of the random variable but use the average for simplicity instead, not realizing that we are possibly making a big mistake. Sam Savage calls it The Flaw of Averages. One of the examples he uses to explain the flaw is the drunk on the highway. The state of the drunk at his average position is alive, but the average state of the drunk is dead.

More and more companies and government agencies are aware of the value that analytics can bring them. With the availability of Excel doing your own bit of analytics is easy. I agree, Excel is helpful, but like any other tool in inexperienced hands, things can go wrong easily, especially when averages come to play. It is also an area where I come across the flaw of averages a lot. To give you an example, picture a cross dock operation. Trucks will arrive at a certain moment in time. The trucks will be unloaded; the material in the truck will be processed (like sorting or repacking) and loaded on trucks again leaving towards the next destination. The question that the cross dock manager has is how many floor workers he needs to process all the goods passing thru in such a way that all trucks can leave on time again. He start’s up Excel and puts in the scheduled arrival and departure times of the trucks. That was the easy part, although he knows that the trucks from the north arrive on average 10 minutes late. Next question to answer is how much time is required to offload or load a truck. Actually there are two questions; first the volume on the truck is required, next to it the number of items a floor worker can take out, sort and put into the next truck per hour. The cross dock manager does some fact finding and comes up with an average volume per truck and an average productivity for the floor workers. Using the averages he calculates the amount of floor workers required, easy.

After a few weeks the cross dock manager is not that satisfied. His customers complain that the trucks are leaving to late half the time, suggesting that that there are not enough floor workers. The manager has no clue on what is wrong. When he goes out to verify the numbers he used to calculate the number of floor workers, he finds the same averages on volume and productivity. Surely he has become a victim of the flaw of averages. To see why, for simplicity assume that all the trucks arrive at the same time and leave at the same time. Also assume a sort window (the time between trucks arriving and departing) of 1 hour. Also assume that the volume in the trucks is fixed at 100 items. What is left is the productivity of the workers. Assume that a worker can on average unload, sort and load 10 items per hour. Then, on average, 50 workers are required to process the freight of 5 trucks.
Now think of this, the floor workers productivity is on average 50, but not all floor workers will have that productivity all the time. Some will work faster, others slower, due to many reasons. When a floor worker has a lower productivity he will need more time to process the items from the truck and therefore the truck will leave behind schedule. A small Monte Carlo experiment shows the impact of a 10% variation of productivity.

As you can see, the required number of floor workers runs from 40 up to 75. The average number required workers is still 50, but the variation in productivity results in shortages nearly half the time, resulting in unsatisfied customers (and the manager’s headache). To help him a little assuring that in 95% of time he has enough floor workers, he should hire 60 workers. I left out the actual arrival times of the trucks and the actual number of items on the trucks, these will complicate the analysis, but with the use of Monte Carlo analysis these dynamics can be modelled with ease. So instead of using just an average, use all the information you have. This will lead to better plans, and fewer headaches.

Nothing wrong with being an average floor worker, but plans based on average assumptions are wrong on average!