Applying Bayesian Networks for Automated Modelling of the Archimate Technology Layer

Organizations face increasing challenges in managing their IT architecture and landscape and in coordinating a multitude of IT projects. Because modern businesses rely heavily on their IT infrastructure, aligning business and IT has become especially important. Enterprise architecture management (EAM) is an approach to managing an organization’s IT infrastructure using modeling languages such as Archimate. Because these models have traditionally been created by hand, efforts to manage an organization’s IT architecture have been both time-consuming and error-prone. Recently, some approaches that achieve at least partial automation of these processes have been presented. The paper by Johnson et al., “Automatic Probabilistic Enterprise IT Architecture Modeling: A Dynamic Bayesian Networks Approach”, represents one such approach and suggests modeling the nodes and relations of an observed network using Dynamic Bayesian Networks (DBN). Using this DBN as input, one can then model the technology layer (and in part the application layer) in the Archimate EA modeling language.

In this thesis I will build such an observable sample network, which I will then attempt to model in Archimate using the approach presented by Johnson et al. This network should consist of several computers and applications that are commonly used in organizations, such as web servers and mail servers. I will evaluate different active and passive network sniffers and scanners and choose a fitting tool for the task. Passively measuring traffic between network addresses and logging the source address, target address, and port makes it possible to make assumptions about the machines and applications used in the network. In addition, an active scanner such as nmap can be employed to probe the network and find machines and open ports, from which the applications used on these machines can again be inferred.

Using those network scanners I will then capture the network traffic generated by the example applications and analyze and process it. I will inspect not only each packet’s source and destination address but also its source and destination port to learn about the applications used on these machines. The result will then be used as input for a Dynamic Bayesian Network, which is continually updated based on the most recent measurements so that eventually all devices and applications that generate traffic in the network can be found and included in the DBN.
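The processing step described above can be sketched roughly as follows. This is only an illustration, not the tooling chosen in the thesis: the `Packet` record, the `summarize` function, and the sample addresses are all hypothetical, and a real capture would come from a sniffer rather than a hard-coded list.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified record of one captured packet."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def summarize(packets: Iterable[Packet]):
    """Count how often each (address, port) endpoint and each directed flow is seen."""
    endpoints = Counter()
    flows = Counter()
    for p in packets:
        src = (p.src_ip, p.src_port)
        dst = (p.dst_ip, p.dst_port)
        endpoints[src] += 1
        endpoints[dst] += 1
        flows[(src, dst)] += 1
    return endpoints, flows


# Made-up sample capture: a client talking to a web server and a mail server.
capture = [
    Packet("10.0.0.5", 49152, "10.0.0.2", 80),
    Packet("10.0.0.2", 80, "10.0.0.5", 49152),
    Packet("10.0.0.5", 49153, "10.0.0.3", 25),
]
endpoints, flows = summarize(capture)
```

The endpoint counts would serve as observations for the per-endpoint variables of the DBN, and the flow counts as observations for the transmission between pairs of endpoints.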

In the DBN each boolean variable corresponds to a pair of network address and port. For each network address and port (and each time step) we have a probability that this address is being actively used by a host machine. For each pair of such address-port endpoints we also have a probability, resulting from our observations, that a message is being transmitted from one to the other and vice versa. In the DBN previous beliefs influence current beliefs, so these probabilities are updated with each new observation.
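To make the idea of previous beliefs influencing current beliefs concrete, the following is a minimal sketch of one update step for a single boolean variable, written as a two-state forward filter. It is not the model from Johnson et al.: the function name `update_belief` and all four probabilities (`p_stay`, `p_appear`, `p_detect`, `p_false`) are assumed values chosen purely for illustration.

```python
def update_belief(prior: float, observed: bool,
                  p_stay: float = 0.95, p_appear: float = 0.05,
                  p_detect: float = 0.7, p_false: float = 0.01) -> float:
    """Return the updated probability that an address-port pair is in use.

    All parameters are illustrative assumptions:
      p_stay   - probability an active endpoint is still active one step later
      p_appear - probability an inactive endpoint becomes active
      p_detect - probability of observing traffic if the endpoint is active
      p_false  - probability of observing traffic if it is not active
    """
    # Prediction: carry the previous belief forward one time step.
    predicted = prior * p_stay + (1.0 - prior) * p_appear

    # Likelihood of the current observation under each hypothesis.
    if observed:
        like_active, like_inactive = p_detect, p_false
    else:
        like_active, like_inactive = 1.0 - p_detect, 1.0 - p_false

    # Correction: Bayes' rule on the predicted belief.
    numerator = like_active * predicted
    return numerator / (numerator + like_inactive * (1.0 - predicted))


# Starting from an uninformed belief, feed in a sequence of observations.
belief = 0.5
history = []
for seen in [True, True, False, False, False]:
    belief = update_belief(belief, seen)
    history.append(belief)
```

With these assumed parameters the belief rises while traffic is observed and decays again over the time steps in which the endpoint stays silent.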

The Dynamic Bayesian Network will be implemented in a graph database such as Neo4j. From the resulting graph I will then create a model of the technology layer in the Archimate modeling language. Each network address and its respective ports from the Dynamic Bayesian Network are included in the model if the current belief that the address is occupied by a host machine surpasses a certain threshold, and excluded otherwise. Furthermore, I will be able to infer some of the applications used on those machines based on the port. The result should be an accurate Archimate model of the technology layer of the observed network.
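The thresholding and port-based inference could look roughly like the sketch below. It works on a plain dictionary instead of a graph database, and the threshold of 0.8, the `build_model` function, the sample beliefs, and the treatment of each address as one node offering one service per port are all simplifying assumptions. The port table lists only a few well-known default port assignments.

```python
# A few well-known default ports; a service may of course run on another port.
WELL_KNOWN_PORTS = {
    22: "SSH server",
    25: "SMTP mail server",
    80: "HTTP web server",
    443: "HTTPS web server",
}


def build_model(beliefs, threshold=0.8):
    """Map each sufficiently likely address to the services inferred from its ports.

    `beliefs` maps (address, port) pairs to the current probability that the
    pair is in use; entries at or below `threshold` are left out of the model.
    """
    model = {}
    for (address, port), probability in sorted(beliefs.items()):
        if probability <= threshold:
            continue
        service = WELL_KNOWN_PORTS.get(port, f"unknown service on port {port}")
        model.setdefault(address, []).append(service)
    return model


# Made-up beliefs for illustration.
beliefs = {
    ("10.0.0.2", 80): 0.97,
    ("10.0.0.2", 443): 0.85,
    ("10.0.0.3", 25): 0.91,
    ("10.0.0.9", 8080): 0.12,
}
model = build_model(beliefs)
```

Each key of `model` would then correspond to a node in the technology layer, and each inferred service to an element attached to that node.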

Project information

Status: Finished

Thesis for degree: Bachelor

Student: Björn Bebensee

Supervisor:

Id: 2019-004