Software systems, in general, need good comprehensive documentation. This need becomes a “must-have” in case of BI systems, as these systems evolve over a period of time. My recommendation (a best practice, if you will) is to create a “Gold Copy” documentation of the enterprise BI landscape. The “Gold Copy” is distinct from the individual project documentation (this would continue to exist) and is updated with appropriate version controls over a period of time.
The “Gold Copy” would comprise of the following documents related to the Business Intelligence environment:
1) Architecture and Environment
This document has two broad sections. The first section explores the physical architecture and then attempts to explore each of the architectural components in more detail. The second section explores the environmental and infrastructure requirements for development, production and operations.
2) Data Sources
This document explores the various source systems from which data is acquired by the datawarehouse. The document attempts to provide a layman’s view of what data is being extracted and maps it to the source system data structures where they data originate from.
3) Data Models
This document builds on the two previous documents. The document attempts to map the source data onto the datawarehouse and the data mart models and provides a high-level picture of the subject orientation.
4) Reporting and OLAP
The document takes a closer look at information delivery processes and technology to the end user from the datawarehouse. The initial section explores the architectural components by way of middleware server software and hardware as well as the end user desktop tools and utilities used for this purpose. The second section looks at some of the more important reports and describes the purpose of each of them.
This document takes a process view of data movement within the enterprise datawarehouse (EDW) starting from extraction, staging, loading the EDW and subsequently the data mart. It explores the various related aspects like mode-of-extraction, extract strategy, control-structures, re-start and recovery. The document also looks at naming conventions followed for all objects within the datawarehouse environment.
The above set of documents cover the BI landscape with its focus on 3 critical themes – Architecture Track, Data Track and Process Track. Each of these tracks have a suggested reading sequence of above mentioned documents
Architecture Track – This theme focuses entirely on components, mechanisms and modes from an architectural angle. The suggested reading sequence for this track is – Architecture and Environment, Data Models, Reporting and OLAP
Data Track – This theme focuses on data – the methods of its sourcing, its movement across the datawarehouse, the methods of its storage and logistics of its delivery to the business users. The suggested reading sequence for this track is -Sources, Data Models, Reporting and OLAP.
Process Track – This theme focuses on datawarehouse from a process perspective and explores the different aspects related to it. The suggested reading sequence for this track is – Architecture and Environment, Process, Reporting and OLAP.
I have found it extremely useful to create such documentation for enterprise wide BI systems to ensure a level of control as functional complexity increases over a period of time.
Thanks for reading. Please do share your thoughts.