Beyond Bi
The quality of data is critical
The quality of data has a direct bearing on revenues, says
Sudipta K Sen.
Billions of dollars are lost annually because of poor data
quality, and the real cost is higher still. Beyond wasted resources,
there are disgruntled customers, falling sales revenues, erosion of credibility,
and the inability to make sound business decisions. Sometimes, the effects of
bad data are cause enough for complete business failure.
So what is data quality? It is often defined as the process of arranging information
so that individual records are accurate, updated and consistently represented.
Accurate information relies on clean and consistent data that usually includes
names, postal addresses, e-mail addresses, phone numbers and so on. The other
aspect is data integrity. For example, receiving an order date when you need
a settlement date would be a case of data integrity breaking down. In line with
this, the principle of "garbage in, garbage out" (GIGO) becomes an unfortunate
reality when the data quality and data integrity criteria are not met.
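
To make the distinction concrete, here is a minimal sketch of how the two kinds of check described above might look for a single record. The field names (`name`, `email`, `phone`, `order_date`, `settlement_date`), the loose e-mail pattern and the seven-digit phone threshold are all illustrative assumptions, not rules taken from any particular system.

```python
import re
from datetime import date

# Deliberately loose pattern: something@something.something (an assumption).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalise_phone(raw):
    """Keep digits only so the same number is always represented the same way."""
    return re.sub(r"\D", "", raw or "")


def check_record(record):
    """Return a list of data quality and integrity problems found in one record."""
    problems = []
    # Data quality: values must be present and consistently represented.
    if not (record.get("name") or "").strip():
        problems.append("missing name")
    if not EMAIL_RE.match(record.get("email") or ""):
        problems.append("malformed e-mail address")
    if len(normalise_phone(record.get("phone"))) < 7:  # hypothetical threshold
        problems.append("phone number too short")
    # Data integrity: we need a settlement date, and an order date is no substitute.
    settlement = record.get("settlement_date")
    order = record.get("order_date")
    if not isinstance(settlement, date):
        problems.append("settlement date missing")
    elif isinstance(order, date) and settlement < order:
        problems.append("settlement date precedes order date")
    return problems
```

A record that passes returns an empty list; anything else returns the reasons, which can be reported back to whoever supplied the data.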
Undoubtedly, if the data coming in is of poor quality, type and quantity, then
the GIGO equation is amplified and the return on investment (ROI) on underlying
applications/systems (for example, CRM or data warehouse projects) will
be nil. And as is typically the case, not until an initiative is deemed a failure
or the ROI not achieved does data quality come to the forefront.
A million-dollar question then is: why is the quality of data that companies
collect so poor? There are a variety of reasons, everything from the very
ambiguous nature of data itself to the reliance on data entry perfection, but
none are more compelling than the simple fact that companies rely on so many
different data sources for capturing information.
Typically, organisations collect data from various sources: legacy systems, databases,
external providers, the Web, etc. With large amounts of data arriving from a variety
of sources, quality is often compromised. It is a common problem that many organisations
are reluctant to admit and address. The single most challenging aspect for companies
is to recognise and determine the severity of their data quality issues, and
face the problem head-on to obtain a resolution. Spending money, time and resources
to collect massive volumes of data without ensuring its quality is futile and
only leads to disappointment.
However, there are three main reasons why this is easier said than
done.
First, IT managers find it difficult to label data quality
as a problem without at the same time admitting that there is something
wrong with their systems. Second, IT managers are afraid that looking closely at data
quality will force them to change their current business plans. Finally, the
costs of poor data quality are spread widely around the organisation, so no
single department sees the full bill.
Organisations depend on data. Regardless of industry, revenue size or the market
it serves, every company relies on data to produce information for business
decision-making. Meanwhile, information is all about integration and interaction
of data points. Inaccuracies in a single data column can ultimately affect the
results of business decisions and may directly affect the cost of doing business.
Preventive measures to ensure data quality are usually more economical and less
painful than repairs made later. Delaying the data cleansing process dramatically
increases the cost and time it requires.
In line with this, cleansing data at the source is a significant way to enhance
the success of, say, a data warehouse or CRM project. Thus, it becomes a proactive
rather than a reactive model. As we have seen, simply collecting data is no
longer sufficient. It is more important to make proper sense of the data and
ensure its accuracy. As the amount of data escalates, so does the amount of
inaccurate information obtained from it. Data should be cleansed at the source
in order to detect and address problems early in the process so that quality
issues are prevented further down the line.
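
As a rough illustration of cleansing at the source, the sketch below checks records as they arrive and sets problem rows aside before anything is loaded into a warehouse or CRM system. The function name `cleanse_at_source`, the field names and the rejection rules (a missing required field, or a repeat of a row already seen) are assumptions chosen for the example. A real pipeline would apply its own business rules.

```python
def cleanse_at_source(records, required_fields):
    """Split incoming records into clean rows and rejects before anything is loaded."""
    clean, rejected = [], []
    seen = set()
    for record in records:
        # Trim stray whitespace so values are consistently represented.
        row = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
        missing = [f for f in required_fields if not row.get(f)]
        key = tuple(row.get(f) for f in required_fields)
        if missing:
            rejected.append((row, "missing: " + ", ".join(missing)))
        elif key in seen:
            rejected.append((row, "duplicate"))
        else:
            seen.add(key)
            clean.append(row)
    return clean, rejected


# Hypothetical feed from a data entry system.
incoming = [
    {"customer_id": "C1", "email": " a@example.com "},
    {"customer_id": "C1", "email": "a@example.com"},
    {"customer_id": "C2", "email": ""},
]
clean, rejected = cleanse_at_source(incoming, ["customer_id", "email"])
```

Only the rows in `clean` go on to the downstream system. Each rejected row carries a reason, so it can be sent back to its source for correction rather than discovered months later in a report.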
In the current scenario, it is encouraging that although data quality may not
be the most important problem today, it is certainly near the top of the list.
The underlying factor is the rapid acceleration of data problems, fuelled
by the escalating interest in application-to-application integration and
business-to-business exchanges.
In fact, this growing trend will make data problems worse in the short run.
The reason is that, as always, organisations will gravitate quickly to the interesting
new technologies (XML, for example), and ignore the more complicated and messy
problems of data quality. This, in turn, may bring data quality problems to
the forefront, potentially bringing long-term solutions. This is especially true
because bombarding your business trading partners (both customers and suppliers)
with poor data will be harder to sustain than bombarding one's own management
and knowledge workers with poor data. Trading partners will desert you for somebody
else.
Data should be treated as a strategic asset, and ensuring its quality is
imperative. If data is of poor quality, then knowledge workers who query the
data warehouse and decision-makers who receive the information cannot trust
the results. Quality data is the building block of an intelligent enterprise. Data quality
is a business management as well as an IT management issue. Although it may
be the IT department's job to raise the issue, solutions will emerge from
the users of the data in the business.