Friday, 19 September 2014

Searching the PinCode for your sister's residence? IBM InfoSphere Reference Data Management

Why is there a delay in delivery when the Pincode mentioned in a post is incorrect?
The Postal Department holds data which says that mail carrying Pincode xxxxxx is to be delivered through the Post Office in area ABCDE.  With an incorrect Pincode, the post therefore first reaches the wrong area.

The data held at the Post Office is one good example of Reference Data.  An area's Pincode is the same no matter where in the world the post is sent from, and it hardly ever gets modified.  The addition of new Pincodes is also rare.  It is these attributes that help us classify Pincode as reference data.

Other commonly used examples of reference data are gender, telephone codes for cities and states, and currencies of countries.

In usage, reference data is stored with a code type and a code value.  Other required values could also be stored for each row.  So, we could have a code table for Country, Locale, City etc.
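
As a rough illustration of the code type and code value idea, here is a minimal Java sketch of an in-memory code table.  The class name CodeTable, its methods and the sample rows are assumptions made for this example; they are not taken from any IBM product.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory code table: each row is identified by a code type
// (for example "COUNTRY") and a code value (for example "IN").
class CodeTable {
    // Key is "codeType|codeValue"; value is the description stored for that row.
    private final Map<String, String> rows = new HashMap<>();

    // Adds (or overwrites) one row of reference data.
    void add(String codeType, String codeValue, String description) {
        rows.put(key(codeType, codeValue), description);
    }

    // Returns the description for the row, or null when no such row exists.
    String describe(String codeType, String codeValue) {
        return rows.get(key(codeType, codeValue));
    }

    private static String key(String codeType, String codeValue) {
        return codeType + "|" + codeValue;
    }
}
```

After calling add("COUNTRY", "IN", "India"), a call to describe("COUNTRY", "IN") returns the stored description, while the same code value under a different code type is treated as a separate row.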

The types of reference data required vary between industries.  The Train Number and its route could be essential reference data for our Railways.  A Department code may be important for many other industries.

While some codes like the telephone country code are common across the globe, some of them are specific to the industry or organization.

The IBM InfoSphere Reference Data Management Hub helps in managing reference data by providing centralized management and secure data stewardship features.  The solution also provides capabilities for creating data set maps and hierarchy mapping of data.  The system is built on IBM InfoSphere Custom Domain Hub and hence can use the features provided by CDH.

A simple example of how RDM improves data quality is entering our city name on a web site.  Assuming the company uses reference data, a table will be maintained internally for countries, with a code and a name for each country.  Another table will hold a code and a name for each state.  This table will also have a country code value, pointing to the country in which the state is.  Similarly, another table holding the cities in each state will be maintained.  When a user enters his or her city, the user will first select a country from a drop-down.  Based on the country selected, the set of states in that country will be populated.  After the user chooses the state, the cities in that state will be populated and the user will select his or her city.  All users residing in Bangalore will therefore provide the same value (had we typed it in, it could have been BANGALORE, Bangalore or Bengaluru).

The advantage of following this method is that the city value is obtained and stored accurately.  A country code, state code and city code will be stored for each entity.  The country - state - city is an example of hierarchy mapping of data. 
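
The country - state - city lookup described above can be sketched in Java as follows.  This is only an illustration of the idea: the class LocationHierarchy and the sample codes used with it (such as "IN-KA" or "BLR") are made up for this example and do not reflect how RDM stores its data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hierarchy of reference data: country -> state -> city.
// Every entry has a code and a name, and each child points to its parent's code.
class LocationHierarchy {
    private final Map<String, String> names = new LinkedHashMap<>();   // code -> display name
    private final Map<String, String> parents = new LinkedHashMap<>(); // child code -> parent code

    // Adds a country, state or city; parentCode is null for a country.
    void add(String code, String name, String parentCode) {
        names.put(code, name);
        if (parentCode != null) {
            parents.put(code, parentCode);
        }
    }

    // Codes to show in the next drop-down once the parent has been selected,
    // in the order in which they were added.
    List<String> childrenOf(String parentCode) {
        List<String> children = new ArrayList<>();
        for (Map.Entry<String, String> entry : parents.entrySet()) {
            if (entry.getValue().equals(parentCode)) {
                children.add(entry.getKey());
            }
        }
        return children;
    }

    // Display name for a code, or null if the code is unknown.
    String nameOf(String code) {
        return names.get(code);
    }
}
```

Because the user picks a code from the list returned by childrenOf() instead of typing free text, every record for the same city ends up holding the same city code.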

Links:
Managing Reference Data with IBM InfoSphere Reference Data Management





Tuesday, 9 September 2014

Three Customer Ids with One Insurance Company !!!! - IBM MDM Workflows with Data Governance

Believe me, I have three customer Ids for three policies with one insurance company.

I must have filled three forms at the same time, and in each one, I would have mentioned that I have no existing Customer Id with that company.

Could I have been given a single Customer Id?


Yes, with the use of IBM Master Data Management, Business Process Manager and Data Governance.

When a new customer's details are being entered, the system could check whether there is an existing customer with the same name, date of birth and PAN (a customer's master data).  If all of them match, the system could directly map that customer to the existing customer id (overcoming data replication).  Supposing one of them does not match, say the name, the system could inform the data operator that there are chances that this customer could be the same as the existing customer.  The data operator could then decide on whether the customer is the same person or not.
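
The check described above can be sketched in a few lines of Java.  This is not how IBM MDM performs matching; the class CustomerMatcher, the Decision values and the rule that fewer than two matching fields means a new customer are assumptions made only to illustrate the idea.

```java
import java.util.Objects;

// Hypothetical sketch of the duplicate check: compare name, date of birth and PAN.
class CustomerMatcher {
    enum Decision { SAME_CUSTOMER, NEEDS_REVIEW, NEW_CUSTOMER }

    static final class Customer {
        final String name;
        final String dateOfBirth; // kept as text to keep the sketch short
        final String pan;

        Customer(String name, String dateOfBirth, String pan) {
            this.name = name;
            this.dateOfBirth = dateOfBirth;
            this.pan = pan;
        }
    }

    // Assumes name and pan are non-null on both records.
    static Decision compare(Customer incoming, Customer existing) {
        int matches = 0;
        if (incoming.name.equalsIgnoreCase(existing.name)) matches++;
        if (Objects.equals(incoming.dateOfBirth, existing.dateOfBirth)) matches++;
        if (incoming.pan.equalsIgnoreCase(existing.pan)) matches++;

        if (matches == 3) return Decision.SAME_CUSTOMER; // map to the existing customer id
        if (matches == 2) return Decision.NEEDS_REVIEW;  // let the data operator decide
        return Decision.NEW_CUSTOMER;                    // assumption: treat as a new customer
    }
}
```

A NEEDS_REVIEW result corresponds to the case where the data operator is asked to decide whether the two records belong to the same person.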

Here, storing a single version of the customer details (the master data about the customer) has to be done by the IBM Master Data Management system.  Checking whether the customer already exists, and then taking the required action based on whether a matching record is found, is a process (workflow).  This workflow is designed and executed using the IBM Business Process Manager.  Data Governance is the interaction with the master data by the system or by a person.

A workflow that has proper governance ensures that the master data is accurate and that it is the single version of truth.  This helps companies in making efficient business decisions.

The IBM MDM Application Toolkit is a plugin that can be used with IBM Business Process Manager to create processes.  With these processes, it is possible to perform search, create and update operations on data in MDM.

IBM MDM also provides tools for monitoring data and user interfaces to maintain data quality.

Links
InfoSphere MDM for Master Data Governance with MDM Workflow
Integrate BPM with MDM

Monday, 8 September 2014

Bluemix - Liberty for Java with SQLDB

I wanted to work with a database on Bluemix, preferably a NoSQL one.  After getting to know that SQLDB, which is powered by DB2, is available on Bluemix, my attention turned towards it.

The steps used to create an application using Liberty for Java with SQLDB are given below.

1. Log in to Bluemix (register yourself as a new user if needed; the registration process is simple).

2. Select Liberty for Java (available in the Catalog tab) as the Runtime and provide a name and URL for the application.  I have named it SQLJava and will refer to it by this name hereafter.

3. Select SQLDB in the Data Management section and create an instance.  Associate this instance with the Liberty for Java application.

4. Go to the Dashboard and expand SQLJava.  In the top left corner, you will find a link to View Guide.  Click on this link.

5. Follow the steps given in the Guide.  The Guide asks you to download the Cloud Foundry command line tool so that you can make use of the cf command.

6. Download the code that has been generated for this application.  The link to download it is available in the Guide.  The downloaded code will be in the form of a zip file.  Extract the zip file.

7. I am assuming here that Eclipse has been installed (the Eclipse Luna version can be downloaded and installed) and that the WebSphere Liberty server is installed on Eclipse.

8. Import the code that has been downloaded and extracted into an Eclipse workspace.

9. Click on the src folder and create a new Servlet.

10. Use the annotation @WebServlet("/MyFirstServlet") above the servlet class declaration.  This path will be used to connect to the Servlet from a browser.

11.  Within the class, declare a DataSource and annotate it with @Resource.  This resource has to point to the SQL database we created.  Ensure that the name given in the lookup property matches the name of the SQL database instance that was created.
    @Resource(lookup = "jdbc/chitradb")
    private DataSource myDataSource;

12.  This DataSource can then be used within the doGet() method to create a Connection object.
    Connection connection = myDataSource.getConnection();

13. Complete the code to perform required tasks within the Servlet.

14. To compile the Servlet and run the build, the jar javax.servlet_*.jar within the dep-jar folder is needed.  Add this jar to the classpath and also to the build.xml file, under the classpathDir.

15. Delete the war file webStarterApp.war, since a new war file with our changes has to be created.

16. Right click on the build.xml file and select Run As, Ant Build.  Ensure that there are no errors during the build process.

17. Refresh the project to find the new war created.

18.  Now, log in to Bluemix using the cf command
cf login -a <bluemix url>
Provide the username and password.

19.  I went to the directory in which the war file is present and directly gave
cf push
The new application was uploaded.  Ensure that there are no errors during this process.

The application is now up and running and can be accessed using <Application URL>/<Servlet Name>.
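
Steps 11 to 13 leave the body of doGet() open.  As one possible way to fill it in, the sketch below reads the first column of a query result through a DataSource using only standard JDBC (java.sql and javax.sql).  The class FirstColumnReader and its method are hypothetical names chosen for this example, and the SQL statement is left to the caller because the tables in the database are not described in this post.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

// Hypothetical helper: runs a query through a DataSource and collects
// the first column of every row as text.
class FirstColumnReader {
    static List<String> readFirstColumn(DataSource dataSource, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        // try-with-resources closes the ResultSet, Statement and Connection
        // in reverse order, even when the query fails.
        try (Connection connection = dataSource.getConnection();
             Statement statement = connection.createStatement();
             ResultSet rows = statement.executeQuery(sql)) {
            while (rows.next()) {
                values.add(rows.getString(1)); // JDBC columns are numbered from 1
            }
        }
        return values;
    }
}
```

Inside doGet(), the injected myDataSource from step 11 could be passed to readFirstColumn() and the returned values written to the response.  Closing the Connection in this way returns it to the server's connection pool instead of leaving it open after the request ends.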

Links Referred:

http://ryanjbaxter.com/2014/03/24/pushing-a-java-app-to-bluemix/
https://developer.ibm.com/bluemix/2014/02/07/java-db2-10-minutes/