Data Integration Checklist – Environment Setup & Process Design

March 19, 2009

The following are the key points that need to be considered when setting up a Data Integration (DI) environment.

Data Integration Environment Setup

  • Repository setup and folder structures to hold the development objects (code) such as transformations, mappings, and jobs.
  • Coding standards and development process.
  • Document templates for low-level design specifications and for capturing test cases and results.
  • Version management process of the objects.
  • Backup and restore process of the repository.
  • Code migration process to move objects from one environment to another, for example from development to production.
  • Recommended configuration variables such as commit interval, buffer size, log file path, etc.
  • User group and security definition
  • Integration of the database metadata with the DI metadata, and of the DI metadata with the reporting environment
  • Process for impact analysis of change requests
  • Data security requirements for accessing production data, and the process of sampling data for testing
  • Roles and responsibilities of the environment users such as Administrator, Designer, etc.
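
The configuration item in the list above can be sketched as a small, validated settings object kept per environment. This is only an illustration: the class name `DIEnvironmentConfig`, its field names, and every default value are assumptions made for the example, not recommendations from any particular DI tool.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DIEnvironmentConfig:
    """Hypothetical per-environment settings; all defaults are placeholders."""

    name: str                                  # e.g. "dev", "test", "prod"
    commit_interval: int = 10_000              # rows written between commits
    buffer_size_bytes: int = 64 * 1024 * 1024  # memory buffer per session
    log_dir: Path = Path("logs")               # root folder for job logs

    def __post_init__(self) -> None:
        # Reject values that could never be valid before any job runs.
        if self.commit_interval <= 0:
            raise ValueError("commit_interval must be positive")
        if self.buffer_size_bytes <= 0:
            raise ValueError("buffer_size_bytes must be positive")

    def log_file_for(self, job_name: str) -> Path:
        """Build the log path for a job: <log_dir>/<environment>/<job>.log."""
        return self.log_dir / self.name / f"{job_name}.log"


# One object per environment keeps the differences explicit and reviewable.
dev = DIEnvironmentConfig(name="dev", commit_interval=1_000)
prod = DIEnvironmentConfig(name="prod")
```

Keeping these values in one versioned place, rather than scattered across individual jobs, also makes them easier to carry through the code migration process.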

Data Integration Process Design

  • What are the different data sources, and how are they to be accessed?
  • How is the data provided by the source systems: as an incremental or a full feed? How are the incremental records to be determined?
  • What are the different target systems, and how is the data to be loaded into them?
  • Validation and reconciliation process for the incoming source data
  • Handling late arriving dimension records
  • Handling late arriving fact records
  • Building flexibility into the validation and transformation process, for example through parameter-driven rules
  • Error handling process definition
  • Table structures for holding the error data and the error messages
  • Process control or audit information gathering process definition
  • Table structures for holding the process control data
  • Identifying reusable objects and how they are to be used
  • Template creation for commonly used logic such as error handling, SCD handling, etc.
  • Data correction and reentry process
  • Metadata capture during the development process
  • Means of scheduling
  • Initial data load plan
  • Job failure and restartability methods
