Friday, 19 September 2014

Searching the PinCode for your sister's residence? IBM InfoSphere Reference Data Management

Why is delivery delayed when the Pincode written on a letter is incorrect?
The Postal Department holds data which says that mail marked with Pincode xxxxxx is to be delivered through the post office for the ABCDE area. A letter carrying the wrong Pincode therefore reaches the wrong area first.

The data held at the Post Office is a good example of reference data. An area's Pincode is the same no matter who uses it or where, and it is hardly ever modified. New Pincodes are also added only rarely. These attributes are what let us classify Pincode as reference data.

Other commonly used examples of reference data are gender, telephone codes for cities and states, and the currencies of countries.

In practice, reference data is stored with a code type and a code value. Other required values can also be stored in each row. So we could have a code table for Country, Locale, City and so on.
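As a rough illustration of the code type and code value layout described above, here is a minimal Python sketch. The `CodeRow` class, the sample rows and the helper functions are hypothetical names chosen for this post; they are not part of any IBM product API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CodeRow:
    """One row of a generic code table (hypothetical layout)."""
    code_type: str    # which code table the row belongs to, e.g. "COUNTRY"
    code_value: str   # the code itself, e.g. "IN"
    description: str  # an extra value stored alongside the code


# Illustrative sample rows only.
CODE_TABLE = [
    CodeRow("COUNTRY", "IN", "India"),
    CodeRow("COUNTRY", "US", "United States"),
    CodeRow("CURRENCY", "INR", "Indian Rupee"),
    CodeRow("CURRENCY", "USD", "US Dollar"),
]


def lookup(code_type: str, code_value: str) -> Optional[str]:
    """Return the description for a code, or None if the code is unknown."""
    for row in CODE_TABLE:
        if row.code_type == code_type and row.code_value == code_value:
            return row.description
    return None


def values_for(code_type: str) -> List[str]:
    """List every code value registered under one code type."""
    return [row.code_value for row in CODE_TABLE if row.code_type == code_type]
```

Calling `lookup("COUNTRY", "IN")` returns the stored description for that code, and `values_for("CURRENCY")` lists the currency codes held in the table.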

The types of reference data required vary between industries. The train number and its route could be essential reference data for our Railways. A department code may be important in many other industries.

While some codes, like the telephone country code, are common across the globe, others are specific to an industry or organization.

The IBM InfoSphere Reference Data Management (RDM) Hub helps in managing reference data by providing centralized management and secure data stewardship features. The solution also provides capabilities for creating data set maps and for hierarchy mapping of data. The system is built on IBM InfoSphere Custom Domain Hub (CDH) and can therefore use the features that CDH provides.

A simple example of how RDM improves data quality is entering our city name on a web site. Assuming the company uses reference data, a table will be maintained internally for countries, with a code and a name for each country. Another table will hold a code and a name for each state. This table will also have a country code value, pointing to the country in which the state lies. Similarly, another table will be maintained for the cities in each state.

When a user enters his or her city, the user will first select a country from a drop-down. Based on the country selected, the set of states in that country will be populated. After the user chooses the state, the cities in that state will be populated and the user will select his or her city. All users residing in Bangalore will then provide the same value, whereas typed in by hand it could arrive as BANGALORE, Bangalore or Bengaluru.

The advantage of following this method is that the city value is captured and stored accurately. A country code, state code and city code will be stored for each entity. The country - state - city relationship is an example of hierarchy mapping of data.
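The country - state - city tables and the cascading drop-downs described above can be sketched in a few lines of Python. This is only an illustration under assumed data: the codes ("KA", "BLR" and so on), the dictionaries and the function names are made up for this post and do not come from the RDM Hub.

```python
# Each child table stores its own code, a name, and the code of its parent.
COUNTRIES = {"IN": "India"}
STATES = {  # state code -> (name, country code)
    "KA": ("Karnataka", "IN"),
    "TN": ("Tamil Nadu", "IN"),
}
CITIES = {  # city code -> (name, state code)
    "BLR": ("Bengaluru", "KA"),
    "MYS": ("Mysuru", "KA"),
    "MAA": ("Chennai", "TN"),
}


def states_in(country_code):
    """State codes to show once a country has been chosen."""
    return sorted(c for c, (_, parent) in STATES.items() if parent == country_code)


def cities_in(state_code):
    """City codes to show once a state has been chosen."""
    return sorted(c for c, (_, parent) in CITIES.items() if parent == state_code)


def resolve(country_code, state_code, city_code):
    """Check that the three codes form a valid chain and return their names."""
    if country_code not in COUNTRIES:
        raise ValueError(f"unknown country code: {country_code}")
    if state_code not in states_in(country_code):
        raise ValueError(f"state {state_code} is not in country {country_code}")
    if city_code not in cities_in(state_code):
        raise ValueError(f"city {city_code} is not in state {state_code}")
    return COUNTRIES[country_code], STATES[state_code][0], CITIES[city_code][0]
```

Because only the three codes are stored for each entity, every user who picks Bengaluru ends up with the same city code, whatever spelling he or she would have typed by hand.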

Managing Reference Data with IBM InfoSphere Reference Data Management
