What should a Data Quality Management Solution comprise?

Posted by Muneeswara C Pandian
February 14th, 2011

Data Quality Management (DQM) is a very important implementation component of an Enterprise Data Management (EDM) initiative. The following definitions are worth noting in the context of a DQM process.

  • Exceptions – Errors detected in the data
  • Checks – Rules applied to the data to capture the Exceptions
  • Data Quality Management – The end-to-end process of managing the Exceptions identified by applying the Checks to the data
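
To make these definitions concrete, here is a minimal Python sketch. The names `Check`, `DQException` and `apply_checks`, and the sample records, are illustrative assumptions and do not come from any particular DQM product.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict  # hypothetical: one row of a data feed


@dataclass(frozen=True)
class Check:
    """A rule that is applied over the data."""
    name: str
    passes: Callable[[Record], bool]


@dataclass(frozen=True)
class DQException:
    """An error detected in the data by a Check."""
    check_name: str
    record: Record


def apply_checks(records, checks):
    """Apply every Check to every record and collect the Exceptions."""
    return [
        DQException(check.name, record)
        for record in records
        for check in checks
        if not check.passes(record)
    ]


# Two example Checks (assumed business rules, for illustration only).
checks = [
    Check("customer_id is present", lambda r: bool(r.get("customer_id"))),
    Check("amount is non-negative", lambda r: r.get("amount", 0) >= 0),
]
records = [
    {"customer_id": "C1", "amount": 10},
    {"customer_id": "", "amount": 5},
    {"customer_id": "C3", "amount": -2},
]

exceptions = apply_checks(records, checks)
for exc in exceptions:
    print(exc.check_name, exc.record)
```

Everything that happens to the `exceptions` list afterwards (assignment, correction, reporting) is what the rest of this post refers to as Data Quality Management.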

The core solution components of a DQM setup that addresses data quality issues in an enterprise can be generalized across industries. They are:

  • Data Capture & Profiling
  • Exception Detection
    1. Checks Management
    2. Exception Checks
  • Exception Handling
    1. Exception Resolution
    2. Exception Reporting

The following table summarizes what these components deliver functionally and the technology platforms required to implement them.

Data Capture & Profiling
  Key Functionalities Addressed:
    • Monitors and extracts the data from the source feed systems and makes it available to the Exception Checks process
    • Captures data incrementally, supporting both Pull (data is pulled from the sources) and Push (data is pushed by the sources) mechanisms
    • Provides a platform to understand the data feeds and determine the Checks to be applied
  Technology:
    • Change Data Capture, data profiling and data integration tools
    • Enterprise schedulers
    • Unix scripts, Perl, etc. to manage flat-file feeds

Checks Management
  Key Functionalities Addressed:
    • Provides the functionality to create and modify Checks
    • Acts as the repository of Checks that is accessed during the Exception Checks process
    • Version management of Checks
    • Secured access and an audit process for the Checks
  Technology:
    • Custom web-based applications to manage the Checks
    • A rules engine like ILOG with a plug-in to data integration tools
    • RDBMS to store the Checks

Exception Checks
  Key Functionalities Addressed:
    • Applies Checks to the incoming data and logs the Exceptions
    • Stores both the good-quality data and the invalid data
    • Performs Checks such as data completeness, duplicate data, predefined business rules, and comparison of current and previous feed statistics
    • Enables reprocessing of the input data after Exception Resolution
  Technology:
    • Data integration platforms like Informatica and DataStage
    • RDBMS to store the Exception statistics and the valid and invalid business data

Exception Resolution
  Key Functionalities Addressed:
    • Supports processes to correct the Exceptions identified
    • Provides an interface for Data Stewards to view and explore the Exceptions logged, add comments to the observations and make corrections
    • Categorization, prioritization and notification of the Exceptions
    • Provides features like workflow management for Exception allocation and closure
  Technology:
    • Custom web-based applications to view, explore and correct the Exceptions
    • Integration with helpdesk ticketing tools like Remedy for automated assignment of Exceptions to owners and for workflow management
    • Data integration platform to reprocess the corrected data
    • RDBMS to store the Exception data and the current status of their Resolutions

Exception Reporting
  Key Functionalities Addressed:
    • Provides dashboards, reporting and ad-hoc query capability over the Exception data
    • Dashboards to monitor key data quality metrics
    • Trend analysis of data quality against the data sources
    • Audit reports
  Technology:
    • Standard reporting tools like Business Objects and Cognos
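
As a rough illustration of the Exception Checks component, the sketch below runs a completeness Check and a duplicate Check over a feed, keeps the valid and invalid rows apart, and compares the row count with the previous feed. The function names, field names and the 20% tolerance are assumptions made for this example; in practice this logic would usually live in a data integration tool rather than in hand-written code.

```python
from collections import Counter


def split_feed(rows, required_fields, key_field):
    """Partition rows into (valid, invalid); each invalid row carries its reasons."""
    key_counts = Counter(row.get(key_field) for row in rows)
    valid, invalid = [], []
    for row in rows:
        reasons = []
        # Completeness: every required field must be present and non-empty.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            reasons.append("incomplete: missing " + ", ".join(missing))
        # Duplicates: every row sharing a key with another row is flagged.
        if key_counts[row.get(key_field)] > 1:
            reasons.append("duplicate " + key_field)
        if reasons:
            invalid.append((row, reasons))
        else:
            valid.append(row)
    return valid, invalid


def row_count_drift(current_count, previous_count, tolerance=0.2):
    """Feed-level Check: True when the row count changed by more than
    `tolerance` (an assumed 20% default) relative to the previous feed."""
    if previous_count == 0:
        return current_count != 0
    return abs(current_count - previous_count) / previous_count > tolerance


feed = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": ""},
    {"id": 3, "name": "Raj"},
    {"id": 3, "name": "Raj"},
]
valid, invalid = split_feed(feed, required_fields=["id", "name"], key_field="id")
print("valid:", valid)
print("invalid:", invalid)
print("row count drift:", row_count_drift(len(feed), previous_count=10))
```

Keeping the invalid rows together with their reasons is what makes reprocessing possible once a Data Steward has corrected them.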

The solution components and functional requirements above can serve as a checklist for a DQM solution. Note also that meeting these requirements means integrating multiple technology platforms covering data integration, ticketing, reporting and custom applications; no single platform meets all the requirements.
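
Because Exception Resolution and Exception Reporting end up spanning several of these platforms, it can help to agree on a simple Exception lifecycle first. The sketch below is one possible shape for it, with prioritization, allocation, closure and a basic open-Exceptions metric per data source. The class, the statuses and the feed names are all hypothetical; a ticketing tool would normally own this workflow.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    OPEN = "open"
    ASSIGNED = "assigned"
    RESOLVED = "resolved"


@dataclass
class ExceptionTicket:
    source: str                 # data source that produced the Exception
    description: str
    priority: int               # assumed convention: 1 = highest
    status: Status = Status.OPEN
    owner: Optional[str] = None
    comments: list = field(default_factory=list)

    def assign(self, steward):
        """Allocate the Exception to a Data Steward."""
        self.owner = steward
        self.status = Status.ASSIGNED

    def resolve(self, comment):
        """Close the Exception; only assigned Exceptions can be closed."""
        if self.status is not Status.ASSIGNED:
            raise ValueError("only an assigned Exception can be resolved")
        self.comments.append(comment)
        self.status = Status.RESOLVED


def open_exceptions_by_source(tickets):
    """Reporting metric: number of unresolved Exceptions per data source."""
    return Counter(t.source for t in tickets if t.status is not Status.RESOLVED)


tickets = [
    ExceptionTicket("crm_feed", "missing customer name", priority=2),
    ExceptionTicket("crm_feed", "duplicate customer id", priority=1),
    ExceptionTicket("billing_feed", "negative amount", priority=1),
]

# Prioritization: work on the highest-priority Exceptions first.
worklist = sorted(tickets, key=lambda t: t.priority)
worklist[0].assign("steward_a")
worklist[0].resolve("merged the duplicate records")

print(dict(open_exceptions_by_source(tickets)))
```

Recording the same counts for every feed run is enough to build the trend analysis mentioned under Exception Reporting.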

Thanks for reading. Let me know your thoughts on the DQM requirements and solution components…

Comments (1)

kaushik - March 1st, 2011

Your blog on Data Quality Management Solution helped me a lot in planning some of my strategy. Apart from the table you provided on what the components deliver functionally and the technology platforms, I would like to know more in this regard, especially about the implementation of these components. Regards, K.kaushik
