Managing complexity

The whole history of application development with computers has been a history of managing complexity. The focal point has shifted over time as the computing power of the hardware has grown exponentially.

This complexity can be divided into two aspects: domain model complexity and full application complexity, of which the former is clearly a subset of the latter.

In this post I will analyze object model complexity; I will return to the other aspects in later posts.

The applications we have developed for the business needs of companies have always been aimed at the automation and management of operations, and this is still the main trend. The model at the core of such an application has to be a model of the operational reality; in other words, it has to reflect the reality that the business is part of.

The purpose of this model is to capture the deepest flow of business events in that reality. Every reality can be modelled as a set of objects, their relationships, their internal behaviour and their collaboration with other objects.

The number of ways in which we can limit or hide complexity is very small. Our only tools are abstraction, which can only hide away a chosen amount of detail, and the identification of natural structures that limit the potential combinatorial explosion of behaviour. As we have seen many times, abstraction is a double-edged sword: an overdose of it will definitely kill.

Atomic objects (even though this is a relative concept) are always simple. The complexity of the world is mostly manifested in the relationships between the objects.

By identifying and understanding the structures of the reality we can limit the combinatorial explosion in the abstraction of behaviour. This requires distributing the behaviour into the structure, which is the most effective way to minimise the behavioural description. This is why we need objects. An optimal domain model complexity yields the least complex system.

Complexity and 3-tier architecture

We can decrease the overall complexity of an application further by applying structure of a "second order": isolating all business structures and behaviour in one separate business domain layer and connecting it as loosely as possible to the other layers of the application. The key point is that there are no references out of this business domain layer, so the whole layer "doesn't know anything about its surroundings": how, from where or why it is used. This principle was already clearly realized in the introduction of the MVC pattern by the Smalltalk developers in the late 1980s.

I have called this extension of MVC to a 3-tier architecture the (MVC)^2 pattern.
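As a minimal sketch of the dependency rule above (the class names here are hypothetical, not from the original text): the domain class knows nothing about the layers that use it, while the outer layer holds a reference inward.

```python
# --- business domain layer: no imports from UI, persistence or application code ---
class Account:
    def __init__(self, balance: float = 0.0):
        self.balance = balance

    def deposit(self, amount: float) -> None:
        # pure domain behaviour; the class does not know who calls it, from where, or why
        self.balance += amount


# --- application / presentation layer: references point inward, never outward ---
class AccountView:
    def __init__(self, account: Account):
        self.account = account  # the view knows the domain object ...

    def render(self) -> str:
        return f"Balance: {self.account.balance:.2f}"  # ... the domain object never knows the view
```

The domain layer can be built and tested without any of the surrounding layers, which is exactly the loose connection the (MVC)^2 idea asks for.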

Business domain model complexity

Strictly speaking, this deals with object model complexity, but that fulfils our needs, as domain models are a genuine subset of object models.

I wrote the following at least five years ago, but it seems to me as valid now as at the time of writing:

The complexity of an object model

Background

The complexity of the domain model is very important. As we all intuitively know, the complexity grows non-linearly with the number of classes in the model; the growth rate may well be greater than n^2. To have any practical use, a real measure of complexity needs to be more precise than that.

It is evident that the things that constitute the complexity are at least:

  • number of classes
  • number of associations between classes
  • number of attributes
  • number of services

NK complexity, and the biological derivations that Stuart Kauffman made from it in At Home in the Universe: The Search for the Laws of Self-Organization and Complexity (at Amazon:

http://www.amazon.com/exec/obidos/tg/detail/-/0195111303/103-2499940-3404643

http://www.amazon.com/exec/obidos/ASIN/0195079515/qid=1059554173/sr=2-2/ref=sr_2_2/103-2499940-3404643

), inspired the idea that the baseline of complexity comes from the product of the number of classes and the number of associations.

On the other hand, it has long been a well-known fact that an even distribution of both attributes and services is a sign of good design. According to my intuition and experience this holds for the distribution of connections between classes as well.

The measure

The source of complexity is twofold. The primary and genuine source is the selected domain itself; the other is the creation of the model itself. The former cannot be influenced, but the latter we can try to minimize.

The complexity measure is the following:

Cm = c * avgr(r_i) * (stdev(r_i) + 1) * ( avgr(a_i) * (stdev(a_i) + 1) + avgr(m_i) * (stdev(m_i) + 1) )

c = number of classes in the model

r_i = number of relations in class i

a_i = number of attributes in class i

m_i = number of methods in class i

avgr(r_i) = average number of relations per class

avgr(a_i) = average number of attributes per class

avgr(m_i) = average number of methods per class

stdev(r_i) = standard deviation of relations over the classes

stdev(a_i) = standard deviation of attributes over the classes

stdev(m_i) = standard deviation of methods over the classes

(Here a relation means any kind of connection to another class, i.e. an association or inheritance.)
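To make the formula concrete, here is a small sketch of how Cm could be computed, assuming the grouping given above (class count times the relation term times the sum of the attribute and method terms). The helper names and the example model are illustrative only; population standard deviation is used.

```python
from statistics import mean, pstdev

def cm(classes):
    """Complexity measure Cm for a model.

    `classes` is a list of (relations, attributes, methods) counts per class,
    where a relation is any connection to another class (association or inheritance).
    """
    c = len(classes)
    r = [cl[0] for cl in classes]
    a = [cl[1] for cl in classes]
    m = [cl[2] for cl in classes]
    rel_term  = mean(r) * (pstdev(r) + 1)
    attr_term = mean(a) * (pstdev(a) + 1)
    meth_term = mean(m) * (pstdev(m) + 1)
    return c * rel_term * (attr_term + meth_term)

# a tiny, evenly distributed model ...
even   = [(2, 3, 3), (2, 3, 3), (2, 3, 3)]
# ... and a lopsided one with the same totals
skewed = [(6, 9, 9), (0, 0, 0), (0, 0, 0)]

print(cm(even))    # lower Cm: relations and behaviour are spread evenly
print(cm(skewed))  # higher Cm: the standard-deviation terms penalise the "god class"
```

The example also shows how the (stdev + 1) factors encode the "even distribution is good design" observation above: the same totals cost more when they are concentrated in one class.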

Theoretical basis

Let’s start with a given set of behaviour within a given domain.

This can always be modelled with an object model. In this object model the required and sufficient artefacts are:

  • a class diagram with class descriptions
  • a set of collaboration diagrams
  • a set of state diagrams (optional)

This is indeed all we need (actually, even without the state diagrams). It is a precise simulation model of the reality within the domain scope. Everything else is explanatory, redundant or, in the worst case, conflicting. Even state diagrams overlap with the rest, but they give a neat additional view of time within one object, the object's essential life cycle, which is also described in an indirect way in the collaboration diagrams.

Within an object model the total behaviour is always reached with a set of services that have the predefined behaviour. One service is composed of the object's own actions and a set of messages that it sends to initiate services on other objects. So each service is the object's own actions combined with collaboration with the set of other objects that contribute to fulfilling the initial service request.

The object's own actions are described with algorithms that operate on the object's attributes.

The total business behaviour is the transitive closure of these services. In this way we can create an abstract model of any total behaviour. This model has a homomorphic mapping to any object-oriented programming language; we call this mapping the implementation of the model.
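A hedged sketch of the transitive-closure idea: given which services a service sends messages to, the total behaviour reached from one initial request is a simple reachability set over the call graph. The graph below is invented for illustration.

```python
# Hypothetical service call graph: service -> services it sends messages to
calls = {
    "Order.confirm":          ["Stock.reserve", "Invoice.create"],
    "Stock.reserve":          ["Item.decrease"],
    "Invoice.create":         ["Customer.credit_check"],
    "Item.decrease":          [],
    "Customer.credit_check":  [],
}

def reachable(service, graph):
    """All services transitively initiated by one service request."""
    seen, stack = set(), [service]
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(graph.get(s, []))
    return seen

print(sorted(reachable("Order.confirm", calls)))
# ['Customer.credit_check', 'Invoice.create', 'Item.decrease', 'Order.confirm', 'Stock.reserve']
```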

Now the degrees of freedom lie in the structure of the model and in the distribution and placement of the behaviour (namely the services) across the web of classes. All the classes (the name space) can be freely chosen. Thus the whole static structure is completely in the hands of the modeller.

Considerations

Finding 1: Number of classes in the model

The number of classes can be freely chosen. We can always start with one class; that structure is isomorphic with a non-object-oriented implementation.

We can expand the class structure by adding new classes. At first I thought that there is an upper limit to this, but at the moment I am not so sure. We could of course argue that (at least in most of our cases) there is an absolutely finite collection of actions in the domain scope, and that this would also be the upper limit on the number of classes, but the fact (which I cannot prove, though) is that in order to distribute the behaviour over the classes we need an increasing number of control or manager classes, and this can be a source of class explosion.

There is a correlation between the number of classes and the complexity minimum.

Finding 2: Ruggedness of the fitness landscape of the class structure

From NK complexity, Kauffman (see Stuart Kauffman, At Home in the Universe) has developed the notion of dependent complexity. His studies show that when K denotes the number of objects that an object depends on, increasing K increases the ruggedness of the fitness landscape. This effectively removes the possibility of reaching the highest peak (which in our case is actually the lowest valley of complexity). This urges the need for loose coupling between objects, which can of course conflict with the required behaviour.

The best way to achieve most of this is to follow the next two simple rules (a small sketch follows the list):

  1. Create the service in the class of the web that 'knows' the most about the matter at hand (e.g. has the most attributes associated with the action).
  2. If one object cannot complete the task, it should delegate the responsibility for those parts completely to other objects and collaborate with them.
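A small sketch of the two rules, with invented classes: the service lives on the class that holds most of the relevant attributes, and the parts it cannot complete are delegated whole to its collaborators.

```python
class LineItem:
    def __init__(self, unit_price: float, quantity: int):
        self.unit_price = unit_price
        self.quantity = quantity

    def subtotal(self) -> float:
        # rule 1: LineItem "knows the most" about its own price and quantity,
        # so the subtotal service is placed here
        return self.unit_price * self.quantity


class Order:
    def __init__(self, items):
        self.items = items

    def total(self) -> float:
        # rule 2: Order cannot complete the task alone, so it delegates the
        # per-line part completely to its LineItem collaborators
        return sum(item.subtotal() for item in self.items)


order = Order([LineItem(9.90, 2), LineItem(4.50, 1)])
print(order.total())   # 24.3
```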
Finding 3: Patching of the class structure

Kauffman also finds that it is wise to divide the whole into subgroups to increase adaptability. The analogy here, in the domain model context, could be packaging. This could be true (see Grady Booch, Object Solutions), but I have not seen it and I doubt it.

One kind of patching, of course, is the division into aspects: domain and application.

Finding 4: Number of services

Here the arguments for including the number of services in a class in the complexity measure are somewhat conflicting. We all know from experience and from published recommendations that the number of methods behaves in the same way as the number of classes. The complexity optimum lies somewhere between t1 and t2, where t1 is the number of attributes and t2 is perhaps around 10, but at least bigger than t1. Currently the formula does not treat the method count quite correctly. When we consider the implementation (not actually known yet in the analysis phase), we know that there is also a correlation between the size of a method and the number of methods in a class (or at least in the model).

Finding 5: Counting the collaboration complexity as well

Actually it is quite evident that some of the total complexity of the model is also encoded in the collaboration diagrams, but that knowledge is not utilized in any way, at least not yet!