Thursday, 21 July 2016

Towards Prescriptive Asset Maintenance

Every utility deploys capital assets to serve its customers.  During the asset life cycle an asset manager repetitively must make complex decisions with the objective to minimise asset life cycle cost while maintaining high availability and reliability of the assets and networks. Avoiding unexpected outages, managing risk and maintaining assets before failure are critical goals to improve customer satisfaction. To better manage asset and network performance utilities are starting to adopt a data driven approach. With analytics they expect to lower asset life cycle cost while maintaining high availability and reliability of their networks. Using actual performance data, asset condition models are created which provide insight on the asset deterioration over time and what the driving factors of deterioration are. With this insights forecasts can be made on the future asset and network performance. These models are useful, but lack the ability to effectively support the asset manager in designing a robust and cost effective maintenance strategy.

Asset condition models allow for the ranking of assets based on their expected time to failure. Within utilities it is common practice to use this ranking in deciding which assets to maintain. By starting at the assets with the shortest time to failure, assets are selected for maintenance until the budget available for maintenance is exhausted.  This prioritisation approach will ensure that the assets most prone to failure are selected for maintenance, however it will not deliver the maintenance strategy with the highest overall reduction of risk. Also the approach can’t effectively handle constraints in addition to the budget constraint. For example constraints on manpower availability, precedence constraints on maintenance projects, or required materials or equipment. Therefore a better way to determine a maintenance strategy is required taking into account all these decision dimensions. More advanced analytical methods, like mathematical optimization (=prescriptive analytics), will provide the asset manager with the required decision support.

In finding the best maintenance strategy the asset manager could instead of making a ranking, list all possible subsets of maintenance projects that are within budget and calculate the total risk reduction of each subset. The best subset of projects to select would be the subset with the highest overall risk reduction (or any other measure). This way of selecting projects also allows for additional constraints, like required manpower, required equipment or spare parts, time depended budget limits, to be taken into account. Subsets that do not fulfil these requirements are simply left out. Also, subsets could be constructed in such a manner that mandatory maintenance projects are included.  With a small number of projects this way of selecting projects would be possible, 10 projects would lead to 1024 (=2^10) possible subsets. But with large numbers this is not possible, a set of 100 potential projects would lead 1.26*10^30 possible subsets which would take too much time, if possible at all, to construct and evaluate them all.  This is exactly where mathematical optimisation proofs its value because it allows you to implicitly construct and evaluate all feasible subsets of projects, fulfilling not only the budget constraint but any other constraint that needs to be included. Selecting the best subset is achieved by using an objective function which expresses how you value each subset. Using mathematical optimisation assures the best possible solution will be found. Mathematical optimisation has proven its value many times in many industries, also in Utilities, and disciplines, like maintenance. MidWest ISO for example uses optimisation techniques to continuously balance energy production with energy consumption, including the distribution of electricity in their networks. Other asset heavy industries like petrochemicals use optimisation modelling to identify cost effective, reliable and safe maintenance strategies.

In improving their asset maintenance strategies, utilities best next step is to adopt mathematical optimisation. It allows them to leverage the insights from their asset condition models and turn these insights into value adding maintenance decisions. Compared to their current rule based selection of maintenance projects in which they can only evaluate a limited number of alternatives, they can significantly improve as mathematical optimisation lets them evaluate trillions (possibly all) alternative maintenance strategies within seconds. Although “rules of thumb”, “politics” and “intuition” will always provide a solution that is “good”, mathematical optimisation assures that The Best solution will be found.  

Tuesday, 19 July 2016

Big Data Headaches
Data driven decision making has proven to be key for organisational performance improvements. This stimulates organisations to gather data, analyse it and use decision support models to improve their decision making speed and quality. With the rapid decline in cost of both storage and computing power, there are nearly no limitations to what you can store or analyse. As a result organisations have started building data lakes and invested in big data analytics platforms to store and analyse as much data as possible. This is especially true in the consumer goods and services sector where big data technology can been transformative as it enables a very granular analysis of human activity (up to the personal level). With these granular insights companies can personalise their offerings, potentially increasing revenue by selling additional products or services. This allows for new business models to emerge and is changing the way of doing business completely. As the potential of all this data is huge, many organisations are investing in big data technology expecting plug and play inference to support their decision making. The big data practice however is something different and is full of rude awakenings and headaches.

That big data technology can create value is proven by the fact that companies like Google, Facebook and Amazon exist and do well. Surveys from Gartner and IDC show that the number of companies adopting big data technology is increasing fast. Many of them want to use this technology to improve their business and start using it in an exploratory manner. When asked about the results they get from their analysis many of them respond that they experience difficulty in getting results due to data issues, others report difficulty getting insights that go beyond preaching to the choir. Some of them even report disappointment as their outcomes turn out to be wrong when put into practice. Many times the lack of experienced analytical talent is mentioned as a reason for this, but there is more to it. Although big data has the potential to be transformative, it also comes with fundamental challenges which when not acknowledged can cause unrealistic expectations and disappointing results. Some of these challenges are even unsolvable at this time.

Even if there is a lot of data, it can’t be used properly

To illustrate some of these fundamental challenges, let’s take an example of an online retailer. The retailer has data on its customers and uses it to identify generic customer preferences. Based on the identified preferences offers are generated and customers targeted. The retailer wants to increase revenue and starts to collect more data on the individual customer level. The retailer wants to use the additional data to create personalised offerings (the right product, at the right time, for the right customer, at the right price) and to make predictions about future preferences (so the retailer can restructure its product portfolio continuously). In order to do so the retailer needs to find out what the preferences of its customers are and the drivers of their buying behaviour. This requires constructing and testing hypotheses based on the customer attributes gathered. In the old situation the number of available attributes (like address, gender, past transactions) was small. Therefore only a small number of hypothesis (for example “women living in a certain part of the city are inclined to buy a specific brand of white wine”) can be tested to cover all possible combinations. However with the increase in the number of attributes, the number of combinations of attributes that are to be investigated increases exponentially. If in the old situation the retailer had 10 attributes per customer, a total of 1024 (=210) possible combinations needed to be evaluated. However when the number of attributes increases to say 500 (which in practice is still quite small), the number of possible combinations of attributes increases to 3.27 10150  (=2500) This exponential growth causes computational issues as it becomes impossible to test all possible hypotheses even with the fastest available computers. The practical way around this is to significantly reduce the number attributes taken into account. This will leave much of the data unused and many possible combinations of attributes untested, therefore reducing the potential to improve. This might also cause much of the big data analysis results to be too obvious.

The larger the data set, the stronger the noise

There is another problem with analysing large amounts of data. With the increase in the size of the data set, all kinds of patterns will be found but most of them are going to be just noise. Recent research has provided proof that as data sets grow larger they have to contain arbitrary correlations. These correlations appear due to the size, not the nature, of the data, which indicates that most of the correlations will be spurious. Without proper practical testing of the findings, this could cause you to act upon a phantom correlation. Testing all the detected patterns in practice is impossible as the number of detected correlations will increase exponentially with the data set size. So even though you have more data available you’re worse of as too much information behaves like very little information. Besides the increase of arbitrary correlations in big data sets, testing the huge number of possible hypotheses is also going to be a problem. To illustrate, using a significance level of 0.05, testing 50 hypothesis on the same data will give at least one significant result with a 92% chance.

P(at least one significant result) = 1 − P(no significant results) = 1 − (1 − 0.05)50 ≈ 92%

This implies that we will find an increasing number of statistical significant results due to chance alone. As a result the number of False Positives will rise, potentially causing you to act upon phantom findings. Note that this is not only a big data issue, but a small data issue as well. In the above example we already need to test 1024 hypotheses with 10 attributes.

Data driven decision making has nothing to do with the size of your data

So, should the above challenges stop you from adopting data driven decision making? No, but be aware that it requires more than just some hardware and a lot of data. Sure, with a lot of data and enough computing power significant patterns will be detected even if you can’t identify all the patterns that are in the data. However, not many of these patterns will be of any interest as spurious patterns will vastly outnumber the meaningful ones.  Therefore, with the increase in size of the available data also the skill level for analysing the data needs to grow. In my opinion data and technology (even a lot of it) is no substitute for brains. The smart way to deal with big data is to extract and analyze key information embedded in “mountains of data” and to ignore most of it. You could say that you first need to trim down the haystack to better locate where the needle is. What remains are collections of small amounts of data that can be analysed much better. This approach will prevent you from getting a big headache from your big data initiatives and will improve both speed and quality of drive data driven decision within your organisation.