2012 Wish List for Data Integration Platforms
Data Integration (DI) platforms like Informatica, DataStage, BODS, ODI, SSIS and others are the backbone of a well-designed Business Intelligence application. DI platforms ensure that all the required data is provisioned at the right time and with the right quality.
In the process of configuring and running DI platforms, we come across a few scenarios where the DI platform needs to be extended with tools or utilities. Examples include analysis of the metadata captured by the DI platform and integration with other applications such as a helpdesk. These scenarios could be covered by product features in the DI platform itself.
The following are features that Data Integration platform vendors should consider introducing in their products.
- A network-style view for understanding the dependencies between the tables used in DI process components (for example, jobs in DataStage or mappings in Informatica). The network view should let us drag a table into focus and see its dependencies on other DI processes and tables
- An interface to configure the default name assigned to an object, such as a widget or transformation (as in Informatica) or a stage (as in DataStage). The configured name should be assigned when the developer drags the icon into the design interface
- Prebuilt integration with helpdesk applications like Remedy, so that data errors and job failures are logged as tickets in the helpdesk system
- ‘Data Element Aware’: This feature should recognize what a column’s data values represent, such as people’s names, and then suggest the checks and validations that need to be performed on such columns
- ‘Domain Data Aware’: This feature should understand business areas like investment banking and have prebuilt data validation rules to handle external data from sources like Bloomberg
- Enable calculations like aggregation and metric derivation to be interoperable with the semantic layer metadata of reporting tools. For example, if a Business Objects Universe is given as input to a DI product, the DI product should be able to suggest which calculations in the Universe can be moved to the DI layer, along with a data structure to store them
- A data correction interface for analysts, with a workflow to review error data, correct it, approve it and re-process it into the target tables
- Design templates for scenarios like data archival or data cleanup from large tables, comparison of two data sets (a data set can be a table or a file), and automatic data type conversion, which would be useful when migrating data from one database to another
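To make the first item on the list a little more concrete, here is a minimal Python sketch of the data behind such a network view. It assumes the DI platform can export, for each job or mapping, the tables it reads and writes. The `JOBS` dictionary, the table names and the `focus` helper are all hypothetical and not part of any vendor's product.

```python
from collections import defaultdict, deque

# Hypothetical metadata export: each DI process with the tables it reads and writes.
JOBS = {
    "load_customers": {"reads": ["stg_customers"], "writes": ["dim_customer"]},
    "load_orders": {"reads": ["stg_orders", "dim_customer"], "writes": ["fact_orders"]},
    "build_sales_summary": {"reads": ["fact_orders"], "writes": ["agg_sales"]},
}

def build_graph(jobs):
    """Build directed edges: table -> job for reads, job -> table for writes."""
    edges = defaultdict(set)
    for job, io in jobs.items():
        for table in io["reads"]:
            edges[("table", table)].add(("job", job))
        for table in io["writes"]:
            edges[("job", job)].add(("table", table))
    return edges

def downstream(edges, table):
    """Breadth-first walk from a table to every job and table that depends on it."""
    seen = set()
    queue = deque([("table", table)])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def focus(jobs, table):
    """Return the jobs and tables affected by the given table, sorted by name."""
    reached = downstream(build_graph(jobs), table)
    return {
        "jobs": sorted(name for kind, name in reached if kind == "job"),
        "tables": sorted(name for kind, name in reached if kind == "table"),
    }

if __name__ == "__main__":
    print(focus(JOBS, "dim_customer"))
```

Focusing on `dim_customer` in this toy example lists `load_orders` and `build_sales_summary` as dependent jobs, and `fact_orders` and `agg_sales` as dependent tables. A real feature would draw this as a graph and let the user drag any table to the center. The sketch only covers the dependency lookup underneath.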
Thanks for reading. Let me know what other features you feel should be considered in a DI platform.