Data Quality – Let the Battle Commence!

by Andy Crossley

The battle for control of data has begun. As organisations figure out what to do with all the data moving around inside and outside their boundaries, is it quality or quantity that you should care about?

This can depend on a number of factors, but we believe a key aspect is recognising where you are on the knowledge maturity scale. We assess ‘maturity’ against dimensions such as processes, governance, systems and infrastructure, each of which can influence how data is mined, managed and assessed. Knowing where you stand on each of these dimensions shows your combined position in terms of ‘knowledge maturity’. That position will inevitably change over time, and different parts of an organisation can be at different stages.

Early in the ‘Knowledge Maturity Model’, organisations often don’t explicitly set out to capture data. They often don’t have a plan. The data they have, in whatever quantity it is captured, is often a by-product of ‘just doing business’. In recent times, organisations have become more savvy about the value and power of data, so some do explicitly capture it, but without a real clue as to what they will do with it or what its true value is.

A good example of this approach to data capture is Google. They keep all the data they collect and have made an art form of ‘connecting the dots’ in their datasets. Back in 2008, Google set up ‘Google Flu Trends’. They had discovered a pattern in the search terms that people entered relating to cold and flu symptoms¹. By monitoring the search behaviour of millions of users, they could analyse whether flu-like symptoms were present in a population. By cross-referencing this with purchase information for flu remedies and doctor visits, they were able to map the spread of the flu and form a predictive model. During the 2009 flu pandemic, Google Flu Trends tracked information about flu in the United States. In February 2010, the CDC (Centers for Disease Control and Prevention) identified influenza cases spiking in the mid-Atlantic region of the United States. However, Google’s data on search queries about flu symptoms showed that same spike two weeks before the CDC report was released.

In its first iteration, this was an example of quantity over quality. Google have since revised their model and the algorithms used, to improve the quality of the data and in turn how well the trends can be predicted. This illustrates a shift towards quality once you have enough quantity to prove a theory or hypothesis.

The previous example shows Google, a company many would certainly see as a ‘data-mature’ organisation, putting quantity before quality. It illustrates that organisations at any stage of knowledge maturity can use quantity to their advantage.

Organisations that are less experienced at mining data in this way may still be seen as ‘data-mature’ because they structure their thinking around data and purposefully use it to build predictive models and analytics. Utility companies in the UK are an example: the millions of data points they have across their network, in the distribution process and even within the HQ allow them to build very complicated predictive analytical models. They know what they need, know where to get it from and know what to do with it. They are definitely focused more on quality. However, some are learning about the benefits of collecting all data and applying a quantity-over-quality approach to areas of data they don’t yet fully understand.

There are examples in the industries mentioned above of companies actively striving for both quality and quantity, for different purposes. The increased amounts of readily accessible data provide insights that the organisations weren’t even aware of: the ‘we don’t know what we don’t know’ syndrome. Improved maturity and experience across both of these dimensions can drive real value, even though knowledge maturity may be at different stages within different parts of the organisation.

Getting this balance between quality and quantity right is crucial in building real maturity in knowledge, as is recognising the need to capture data for potential future uses. Companies at the highest level of maturity continuously evaluate their data and its derived information and try to distil that further into knowledge and action. They look for relationships in the data that they didn’t know existed and try to enrich that information to make better decisions. “We know what we know, and we know there is more to know!”

The key thing to remember is always to align data collection and analysis needs with business drivers. However, don’t think that just because you are low down the knowledge maturity scale, you are behind the curve. A company trying to be all things to everyone, when it has neither the data to do it nor the business need or know-how to act on what the data may tell it, is destined to waste time and money. There is a sweet spot of data collection and analysis for every company. You just need to figure out what it is and whether you need to push for quality or quantity first!

When considering, and evolving, your position on the knowledge maturity model, careful thought needs to be given to your quality strategy. A comprehensive corporate quality strategy needs to incorporate the role data plays in processes and performance management. A new approach to the management of quality throughout the organisation can help change how you view and interpret data, creating business benefits in areas that you may not have seen previously.

In our next data quality blog we will explore the different stages of maturity in more detail. The final entry will look at the relationship between your quality strategy and data.

¹Sourced at http://en.wikipedia.org/wiki/Google_Flu_Trends  
