Posted by Muneeswara C Pandian
December 18th, 2007

The following design aspects help make a DI system dynamic:

  1. Avoiding hard-coded references by using parameter variables
  2. Using lookup tables for code conversion
  3. Setting and managing threshold values through tables
  4. Segregating data processing logic into common reusable components
  5. Ensuring that the required processes are controllable by the Business team with the required checks built in

We covered the first two aspects in the earlier post. Let us now look at the scenarios and approach for the other three items.

Setting and managing threshold values through tables

During data validation we also verify the incoming data in terms of a count or a sum of a variable. The derived count or sum is checked against a predefined number, usually called the ‘Threshold Value’. Some typical validations of this kind are listed below:

  1. The number of new accounts created should not be more than 10% (Threshold Value) of the total records
  2. The number of records received today and the number of records received yesterday cannot differ by more than 250 records
  3. The sum of the credit amount should not be greater than 100000

The threshold value differs across data sources, but in many cases the metric to be derived is similar across them. We can store these ‘threshold values’ in a relational table and integrate this ‘threshold’ table into the DI process as a lookup table. This allows the same threshold-based data validation code to be implemented across different data sources while still applying the threshold value specific to each data source.
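As a minimal sketch of this idea, the threshold table can be read as a lookup keyed by data source, so that one validation routine serves every source. The table name `threshold`, its columns, the source names (`crm`, `billing`) and the metric names are all assumptions made up for illustration; an in-memory SQLite database stands in for whatever relational store the DI process actually uses.

```python
import sqlite3


def load_thresholds(conn, source):
    """Return {metric: threshold} for one data source from the lookup table."""
    rows = conn.execute(
        "SELECT metric, threshold FROM threshold WHERE source = ?", (source,)
    ).fetchall()
    return dict(rows)


def validate(metrics, thresholds):
    """Return the names of metrics whose derived value exceeds its threshold."""
    return [
        name
        for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    ]


# Hypothetical threshold table: one row per (source, metric) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE threshold ("
    "source TEXT, metric TEXT, threshold REAL, "
    "PRIMARY KEY (source, metric))"
)
conn.executemany(
    "INSERT INTO threshold VALUES (?, ?, ?)",
    [
        ("crm", "new_account_pct", 10.0),     # new accounts as % of total records
        ("crm", "daily_count_diff", 250.0),   # today vs. yesterday record count
        ("billing", "credit_sum", 100000.0),  # sum of the credit amount
    ],
)

# The same validation code runs for any source; only the lookup differs.
crm_metrics = {"new_account_pct": 12.5, "daily_count_diff": 100}
failed = validate(crm_metrics, load_thresholds(conn, "crm"))
```

Here `failed` lists the metrics that breached their threshold for the `crm` source. Changing a threshold, or adding a new data source, then means changing rows in the table rather than changing the validation code.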

Segregating Data Processing Logic into Common Reusable Components

Having many reusable components in itself makes a DI system dynamic or adaptable. The reason is that reusable components rest on parameterizing the inputs and outputs of an existing process, and parameterization is key to making a DI system dynamic. Some of the key characteristics to look for in a DI system that would help carve out a reusable component are:

  1. Multiple data sources providing data for a particular subject area like HR data coming from different HR systems
  2. Same set of data being shared with multiple downstream systems or a data hub system
  3. Existence of an industry standard format like SWIFT or HIPAA, either as source or target
  4. Integration with third party systems or their data, like D&B or Fair Isaac
  5. Changing data layouts of the incoming data structure
  6. Systems that capture survey data
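To illustrate the first and fifth characteristics, here is a rough sketch of a reusable component whose input layout is a parameter rather than part of the code. The source names (`hr_system_a`, `hr_system_b`) and all field names are hypothetical; in practice the layouts would more likely live in a table or configuration file than in the code itself.

```python
# Hypothetical layouts: common field name -> field name in each source system.
LAYOUTS = {
    "hr_system_a": {"employee_id": "emp_no", "name": "full_name"},
    "hr_system_b": {"employee_id": "id", "name": "employee_name"},
}


def standardize(record, layout):
    """Map one source record onto the common layout.

    Fields missing from the source record come through as None.
    """
    return {
        target: record.get(source_field)
        for target, source_field in layout.items()
    }


# The same component handles both HR systems; only the layout parameter changes.
row_a = standardize({"emp_no": 7, "full_name": "Ann"}, LAYOUTS["hr_system_a"])
row_b = standardize({"id": 9, "employee_name": "Raj"}, LAYOUTS["hr_system_b"])
```

When an incoming layout changes, or a new HR system is added, only the layout entry needs to change; the processing component stays as it is.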

Ensuring that the required processes are controllable by the Business team with the required checks built in

We now often see requirements in which the business provides regular inputs to the IT team that runs the DI systems. In these situations we can design the DI system so that some of its parameters are placed under business control. Typical examples of such scenarios are:

  • In ‘threshold value’ based data validation, the values are business driven, i.e., the ‘threshold table’ can be managed by the business team, who can change it without code changes and without IT support
  • In many scenarios invalid data undergoes multiple passes and needs to be validated by the business at different passes by starting a BI session; the input from the business could be simply starting the process, or it could also include providing input data
  • Data is pulled from a warehouse based on a feed from an online application, a typical web service problem and solution
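For the first scenario, the "required checks built in" could look roughly like the sketch below: the business team changes a threshold value, but the change is accepted only if it falls within bounds agreed in advance. The `ALLOWED_RANGE` bounds, the metric names and the dictionary standing in for the threshold table are all assumptions for illustration, not part of any particular DI tool.

```python
# Hypothetical guard rails agreed between business and IT: metric -> (low, high).
ALLOWED_RANGE = {
    "new_account_pct": (0.0, 100.0),
    "credit_sum": (0.0, 1_000_000.0),
}


def update_threshold(table, source, metric, new_value):
    """Apply a business-entered threshold only if it passes the built-in checks."""
    if metric not in ALLOWED_RANGE:
        raise ValueError(f"unknown metric: {metric}")
    low, high = ALLOWED_RANGE[metric]
    if not (low <= new_value <= high):
        raise ValueError(
            f"{metric} must be between {low} and {high}, got {new_value}"
        )
    table[(source, metric)] = new_value
    return table


# A dictionary stands in for the threshold table, keyed by (source, metric).
thresholds = {("crm", "new_account_pct"): 10.0}
update_threshold(thresholds, "crm", "new_account_pct", 15.0)  # accepted
```

A value outside the agreed range, such as 150 for a percentage, raises `ValueError` and leaves the table unchanged, so the business team can manage the values without being able to put the validation into a nonsensical state.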

The need for the business team to control or feed the DI systems is common in companies that handle more external data, such as market research firms and Software as a Service (SaaS) firms. The web service support from the leading DI vendors plays a major role in fulfilling these needs.
