Development of a Configurable Data Generator for Enterprise Architecture Repositories
Motivation
Enterprise Architecture Management (EAM) is an area of growing importance for medium-sized to large enterprises. In a fast-changing and highly competitive economic environment, it is vital for an enterprise to be flexible and to adapt quickly to changes in the market. To respond to these changes, the business strategy and the supporting IT must be aligned efficiently and effectively. The complex relations and entanglements between artifacts and entities in the company need to be understood, clearly displayed, and managed. The Enterprise Architecture (EA) is the result of this work and gives a holistic overview of the enterprise. The concrete manifestation of the EA is stored in the Enterprise Architecture Repository (EAR).
The data in the EAR reflects the company in detail from different views. To prevent competitors from obtaining confidential information about the enterprise, access to this data is restricted. For computer scientists working on new approaches or algorithms in the area of EAM, this poses a problem. Once an algorithm has been designed and implemented, it needs to be tested and evaluated against suitable data. Since access to real data in this area is restricted, scientists are left with two options: not publishing their results, or constructing their own test data, which is time-consuming and may be biased towards the presented approach.
Goals
The aim of this thesis is to address the lack of a publicly available reference data set for EARs by presenting a concept for constructing such a data set and by developing a configurable data generator for EARs.
Approach
Constructing a whole EA from scratch seems neither feasible nor likely to yield a realistic EA. To generate a realistic data set, a Reference Architecture (RA) [1] for EAM may be an appropriate basis for the data model. To allow the data set to be deliberately distorted, configuration parameters need to be defined. A possible candidate for the parametrization is a set of Key Performance Indicators (KPIs) [2], which measure how far a concrete EA fulfills certain EAM goals. The parameters can then be adjusted to create a data set of higher or lower quality according to the selected KPIs. A prototype of the data set should demonstrate the usability of the KPIs as a means of parametrization. To keep the generator flexible, the parametrization criterion should be implemented so that it is exchangeable. In the next step, a mapping needs to be defined between the layers of the RA and the KPIs. Once this is done, the data model should be constructed automatically from the RA. The data generator can then fill the model with data that respects the parameter settings.
Figure 1: Conceptual Overview
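The idea of an exchangeable, KPI-driven parametrization can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the `Application` entity, the `Criterion` interface, and the `OwnerCoverageKpi` (share of applications with a documented owner) are hypothetical names and are not taken from the RA [1] or the KPI catalog [2].

```python
from __future__ import annotations

import random
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Application:
    """Hypothetical entity type; a real data model would be derived from the RA."""
    name: str
    owner: Optional[str] = None


class Criterion(Protocol):
    """Hypothetical interface for an exchangeable parametrization criterion."""

    def apply(self, apps: list[Application], rng: random.Random) -> None: ...

    def measure(self, apps: list[Application]) -> float: ...


@dataclass
class OwnerCoverageKpi:
    """Hypothetical KPI: share of applications with a documented owner."""
    target: float  # desired KPI value between 0 and 1

    def apply(self, apps: list[Application], rng: random.Random) -> None:
        # Give an owner to exactly round(target * n) randomly chosen applications.
        count = round(self.target * len(apps))
        for app in rng.sample(apps, count):
            app.owner = f"owner-{rng.randrange(10)}"

    def measure(self, apps: list[Application]) -> float:
        if not apps:
            return 0.0
        return sum(app.owner is not None for app in apps) / len(apps)


def generate(n: int, criteria: list[Criterion], seed: int = 0) -> list[Application]:
    """Create n applications and let each criterion shape the data set."""
    rng = random.Random(seed)  # seeded so that a data set can be reproduced
    apps = [Application(name=f"app-{i}") for i in range(n)]
    for criterion in criteria:
        criterion.apply(apps, rng)
    return apps
```

Under these assumptions, `generate(100, [OwnerCoverageKpi(target=0.8)])` would yield a data set of higher quality with respect to this KPI than a call with `target=0.25`. A different criterion could be swapped in by providing another class with the same two methods, without changing `generate`.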
Finally, the generator must be evaluated. This could be done by running already published EAM algorithms on the generated data set and checking whether they behave the same way as described in the respective publications.
In future work, the data generator and its output could form the basis of a more complex system for testing and evaluating EAM algorithms. For instance, it is conceivable to construct an EAR simulator that predicts the evolution of a concrete EA over time [3].
References
[1]. Initial Experiences in Developing a Reference Architecture for Small and Medium-Sized Utilities - Timm, Köpp, Sandkuhl, Wißotzki - 2015.
[2]. EAM KPI Catalog v 1.0 - Mattes et al. - 2012.
[3]. A Survival Analysis of Application Life Spans based on Enterprise Architecture Models - Aier et al. - 2009.