Remarks on abstract OO domain model: inheritance

There are some issues that I think need clarification and discussion. The one I start with is perhaps not the most important, but it is important enough: inheritance in abstract domain models. As everyone who has programmed in any OO language knows very well, inheritance is a very prominent difference between 3G and OO languages. This feature shortens the code in OO considerably. However, when the model is trying to describe (i.e. help to understand) a complex domain reality, things get trickier.

The first purpose of an abstract domain model is to create a simplified picture of reality. This picture contains both the structure and the behavior of the world. In this context some features become very important. First of all, the model should capture as much semantics from reality as possible. On the other hand, we have a conflicting demand: to limit the size of the model to something comprehensible. My experience is that we can manage in our minds a network of from 1 to at most about 150 classes. From the current point of view the low end is no problem, but the high end is.

I have to say that the first symptoms of a high number of classes (or at least the fear of it) can be seen indirectly. Novice modelers tend to draw several (rather) small class diagrams of their model. The ideal number of class diagrams is of course one! Then the people around the model have the possibility to see, feel and understand the complexity of the domain at hand. I have been quite comfortable with 60 to 80 classes. Using Peter Coad’s four color categories makes the diagram still easier to read and understand.

So the richness of semantics would call for as much detail as possible. On the other hand, we really want simplification in order to “see the wood for the trees”. These two conflicting forces drive the analysis. This means that every class included must be very carefully thought out. This has two consequences. First, the naming of each class is very important. All class names should come from the language of the business domain. This means that every class name will directly and intuitively carry some content for any business person, even if that person doesn’t know the model and it hasn’t been explained to them. There is a tendency to choose selling names that are very general or loosely used. For example, Customer is the most typical class name whose semantic meaning is far too vague. Event, Thing and Role (these are Peter Coad’s color categories) are far too general (i.e. abstract) and thus do not have enough semantic substance. By the way, when color coding is used in diagrams, every class can, in addition to its name, be tagged as one of these types.

But let’s return to inheritance. It seems to be quite difficult for beginners, but the difficulty is an unusual one: inheritance is mostly overused! Our tendency to classify things is misinterpreted to mean that OO models need a lot of inheritance. So the advice is: never use inheritance just to enumerate a class’s objects by some attribute values. The minimum requirement for adding a specialization class to an abstract model is that it has behavior that objects of the general class cannot have, or that its behavior is so remarkably different that it requires its own class. Another case is when several classes (typically from 2 to a few, fewer than 10) all have many associations to many other classes. Then the modeler may be able to bring in a generalization class that collects the associations, so that each one is drawn only once.

So instead of this:

[Figure: perintäAssos2 (image not available)]

It is better to draw this:

[Figure: perintäAssos1 (image not available)]

In most domain models there are at most 2 to 4 inheritance structures in the whole domain. If there are significantly more, then either the domain is very exceptional or a very novice modeler has been coaching the team.
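The two rules above can be sketched in code. This is only an illustration under my own assumptions: the names `Party`, `Person`, `Company`, `Address`, `Contract` and `Segment` are hypothetical and do not come from any real model. The segment is kept as a plain attribute value (no subclass per value), while `Party` is a generalization that collects the associations so they are declared only once.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Segment(Enum):
    """Plain attribute values; no GoldParty or SilverParty subclasses needed."""
    REGULAR = "regular"
    GOLD = "gold"


@dataclass
class Address:
    street: str


@dataclass
class Contract:
    number: str


@dataclass
class Party:
    """Hypothetical generalization: shared associations are declared once here."""
    name: str
    segment: Segment = Segment.REGULAR  # classification kept as an attribute
    addresses: List[Address] = field(default_factory=list)
    contracts: List[Contract] = field(default_factory=list)


class Person(Party):
    """Specialization kept only because its behavior differs."""

    def salutation(self) -> str:
        return f"Dear {self.name}"


class Company(Party):
    """Specialization kept only because its behavior differs."""

    def salutation(self) -> str:
        return f"To the management of {self.name}"
```

Without `Party`, both `Person` and `Company` would each need their own associations to `Address` and `Contract`, which is exactly the duplication the generalization removes.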

Finally, adding a class to an abstract OO domain model is an expensive decision. That is why it has to be carefully considered before taking the step. On the other hand, the model should not be too general. Very general models are neat, but as they lack the necessary semantics they are good for nothing. If you take the class names from a model and give the list to a domain expert, and he or she doesn’t know where the terms come from, then your model is pollution.

My history of IT

1971

I was accepted to study Natural Sciences at Turku University

1972

I started my studies of computer science (Approbatur I)

http://en.wikipedia.org/wiki/IBM_1130

The computer that we worked with was an IBM 1130 with a card reader, a line printer and one hard disk drive.

The address space was 15 bits, limiting the 1130 to 32K 16-bit words (64 Kbytes) of core memory. Both direct and indirect addressing capabilities were implemented.


The console of the IBM 1130 computer


IBM26 Card punch machine

My first programming language was FORTRAN, and pretty soon after that COBOL (which was quite exceptional in university circles). At that time we (of course) used card punch machines, and we got 2 computer runs (batch jobs) done a day. The students were not allowed to enter the computer room at all. We had to leave our jobs in a box in a corridor outside the computer room.

1975

We got a new “real” computer: Digital’s DEC 10

From Wikipedia again:

The KA10 had a maximum main memory capacity (both virtual and physical) of 256 kilowords (equivalent to 1152 kilobytes).

It came with a timesharing operating system: the TOPS-10 System (Timesharing / Total OPerating System) was a computer operating system from Digital Equipment Corporation (DEC) for the PDP-10 (or DECsystem-10) mainframe computer launched in 1967.

At the same time we got rid of the card punch machines and got our first real terminals.

This was a really cool new thing (we didn’t call it cool in those days). The orange thing on the left side of the machine is a paper tape punch/reader. One could direct the output from the computer to that tape as holes instead of as print on paper. By this time the programming language had changed to Pascal. At a later stage we also got display terminals, but this was about all the computer technology I saw during my university years, in practice 1973 to 1977. Officially I graduated in 1980, but by that time I had actually been working full time for a couple of years.

1976-77

My first part-time employment was as a COBOL programmer. My office was just a couple of hundred meters from the university, so it was quite convenient to mix work and studies this way. The project that I worked on was a nightmare!

1979

From the beginning of February I started my first full-time job as a computer programmer at Yle (Finnish Broadcasting Company). My first assignment was maintaining a new payroll system. It was a huge set of batch programs. It calculated the payroll for Yle’s employees, who numbered 4000 at that time.

The system was running on an IBM 360 mainframe.

Yle didn’t at the time have its own machine so we bought time from a local authority.

We worked mainly with punched cards, but the department with its 10 programmers already had 2 IBM 3270 online terminals. Here are pictures of the device and a screenshot

Early 1980s: the time of APL

In 1980 I graduated with a master’s degree in Computer Science from Turku University. In the early 1980s I learned the APL programming language, running on an IBM mainframe. Yle used it to create the domestic results service for political elections (parliamentary, local-authority and presidential). APL was just the right tool for that in those days. It was highly effective for rather small, calculation-intensive tasks. I attended my first international IT conference, on APL, in San Francisco in 1981. We were also shown Smalltalk, but I did not understand anything about it then.

In 1982 I changed jobs and became an APL specialist. The company was a joint venture of local authorities. It was an extremely bureaucratic organisation. The main thing, of course, was that I realised that APL is NOT a language for organisations.

1984 Systems engineer at HP

So I did a full U-turn and joined HP on 24.10.1983. I became a systems engineer for the HP3000 commercial minicomputer.

HP3000

Code (reentrant) and data reside in separate variable-length segments, which are 32,768 “halfwords” (16-bit words) (or, 65,536 bytes). The operating system, known as MPE (for Multi-Programming Executive), loads code segments from program files and segmented Library (SL) files as needed, up to 256 segments in one process.

There could be as much as 64KB of memory in a code segment, but calling a routine was based on segment number and routine number within a segment, so a program could theoretically have about 32,385 routines. This was compared to most 16 bit computers that had 64KB of address space for everything. The bigger limitation was the data segment and stack segment, which were also 64KB.

I learned a lot of new things. Two of them were important. The first was the operating system MPE (Unix-like, but perhaps even a little bit better than Unix). The second was that contemporary ways of working with a computer can differ in effectiveness by one or two orders of magnitude (10 to 100 times)!

HP3000

Pretty soon our dumb terminals were replaced by HP 150 micros. I started to use the email system at HP in 1984. During my last year at HP I was a full-time trainer, teaching almost everything about the HP3000.

HP150 PC

1988 Finnair

I moved to Finnair to assist in moving their Videotex system from MP1000 to HP9000 equipment. Soon I created a group, Methods and Tools, with my boss. Finnair was (and still is) an IBM customer, so in a way this meant a return to my roots. Now my main environment was to be Windows 3.11 on an IBM AT PC.

Do you still remember these?

1989 my OO time starts

In late October 1989 I attended an OOA seminar by Peter Coad. His book Object-Oriented Analysis was not out yet, but he had a blueprint with him. This changed my thinking completely. Something irreversible had happened.

In summer 1992, after I had attended ObjectExpo in London, I got my first Smalltalk: Digitalk’s Smalltalk V. This was a horrible experience of a completely unfinished, poor product. In January 1994 I got the chance to get the real (and only) thing: ParcPlace’s VisualWorks Smalltalk.


ParcPlace sold VisualWorks to Cincom, and the product is still alive today under the name Cincom Smalltalk VisualWorks (see: http://www.cincom.com/us/eng/solutions/application-development/object-oriented/index.jsp?loc=usa)

Timo Salo gave me and Jyrki Niekkamaa a few days training and then we started to experiment and create applications. Finnair’s car leasing application was the first one to try out. That was a total failure, because we had only one user and he was too incompetent to use a computer application at all!

The first real OO-application development from January 1995

The visioning of a Smalltalk application started in October 1994. Our domain model of Finnair domestic sales was ready at the beginning of February 1995. Then we started the implementation of the legendary Finnair SalePlus. The period from 1995 until 1998, when I resigned from Finnair, was the most productive and inspiring in my career. In 1996 we acquired GemStone, a Smalltalk database. This environment boosted our Smalltalk application development considerably. The product is still sold and in use (see: http://www.gemstone.com/products/smalltalk/)

The software was very sound and reliable. In 1997 we sadly switched to Java. It was at that time about 5 years behind Smalltalk.

1998 to 2002 Aware and Entra

Java had been developing, and the first EJB implementation was released. Though it was terrible, ruined all OO aspects of application development and filled it with all kinds of tricks, it was a step towards real distributed server-side Java. It offered an RMI solution, which was real distributed OO with elegant proxy structures on the client side. The sad thing was that people with very little OO skill didn’t understand how to use it the right way and spoiled its reputation.

Another very sad thing was the shift to HTML clients in distributed computing. HTML was not (and is not even today) designed for application development and was the lousiest possible choice of technology.

My contribution to ADDD.

The start of OO

Classical OO analysis and design is the cornerstone of OO application development. From those early days of the 90s come the fundamental ideas: domain classes with private attributes and private and public methods, placing business behavior in the domain layer, and splitting the implementation into a logical 3-tier architecture. All of this is extremely vital today as well. Actually almost nothing has changed in these respects that would affect their importance or value.

Popularity and expansion

During the 1990s, when OO gathered popularity and more people got involved, things started to go wrong. Bigger projects were started with less trained people. To ease the confusion Ivar Jacobson introduced use cases. The result turned out to be almost the opposite. Many people were confused about the meaning and role of use cases. The difficulty with them was (and still is) that they are not actually part of OO, neither as notation nor as concept. In fact, wrongly used they can completely overlap with the real OO notation. The heart of the problem is very fundamental. When people tried to do more complex things with OO they ran into the problem of abstraction levels (there are more detailed posts on both of these subjects in this blog). The consequence of all this was that people started development at too low an abstraction level and were hit badly by the combinatorial explosion. And, as always when a team starts to use a new method or discipline without enough training, the method gets the blame (not the people, of course)!

My findings are:

First I studied the relationships between the layers in the logical 3-tier architecture. It was evident (and often said aloud) that the responsibilities of the layers got mixed in real implementations, and the more inexperienced the implementers were, the fuzzier the layers became. In the early days all the gurus advised starting from application requirements and working through application controls to the domain layer. In most cases this ended in a catastrophe due to rapid complexity growth.

My analysis came about with the following corrections:

1) Tight couplings:

The domain is always tightly coupled within itself. This is actually the pure meaning of the domain. Theoretically (and in practice too) it forms a transitive closure.

Both outer tiers are tightly but at the same time asymmetrically coupled with a subset of the domain. Asymmetry means that the outer tier is tightly coupled with the domain but the reverse is not true. The domain is not coupled with the application tier at all, and each domain class has exactly one hook to the persistence tier. The domain doesn’t know anything about the rest of the world. See the next diagram:

Coupling in 3-tier implementation
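The asymmetric coupling can also be sketched in code. This is a minimal sketch under my own assumptions, not an actual implementation: `Account`, `InMemoryStore` and `DepositScreen` are hypothetical names, and I interpret the "one hook" to the persistence tier as a single `oid` attribute that the store fills in. The point to notice is the direction of the references: the outer tiers mention the domain class, but the domain class mentions neither of them.

```python
from typing import Dict, Optional


# --- Domain tier: knows nothing about the application or the storage technology ---
class Account:
    def __init__(self, owner: str, balance: int = 0) -> None:
        self.owner = owner
        self.balance = balance
        # Assumed single hook to the persistence tier: an identity set by the store.
        self.oid: Optional[int] = None

    def deposit(self, amount: int) -> None:
        # Business rule lives in the domain, not in the screen.
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount


# --- Persistence tier: coupled to domain objects, not the other way round ---
class InMemoryStore:
    def __init__(self) -> None:
        self._objects: Dict[int, Account] = {}
        self._next_oid = 1

    def save(self, account: Account) -> int:
        if account.oid is None:
            account.oid = self._next_oid
            self._next_oid += 1
        self._objects[account.oid] = account
        return account.oid

    def load(self, oid: int) -> Account:
        return self._objects[oid]


# --- Application tier: coupled to the domain and uses the store ---
class DepositScreen:
    def __init__(self, store: InMemoryStore) -> None:
        self._store = store

    def deposit(self, oid: int, amount: int) -> str:
        account = self._store.load(oid)
        account.deposit(amount)  # delegate the behavior to the domain object
        self._store.save(account)
        return f"{account.owner}: balance {account.balance}"
```

A quick usage example: `store = InMemoryStore(); oid = store.save(Account("Ann", 100)); DepositScreen(store).deposit(oid, 50)`. The `Account` class could be tested, or reused under a different screen or store, without changing a line of it.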

2) Timing

All the old gurus suggested starting from application requirements. I think this is wrong! Thus my first modification concerns timing: start from the abstract OO domain model without even touching the application part. This means that the business behavior (rules) must be modeled with object collaborations. This implies that the domain OO model models the subset of reality in which the organization is interested and where it operates. Here the emphasis is on by whom, not how. Of course the where also includes how, but the point of view is that the organization is not acting alone; what matters is where and how it interacts with others. So this model is actually a subset of reality, not of the organization. Starting here implies that domain modeling is about understanding a complex reality and is not in the least interested in any current or future IT system whatsoever.

Start from a “clean table”: first analyze the concepts and work out your initial class structure. By the way, UML class diagrams are end results, but their versioning is an instrument. Without drawing things wrong you cannot draw them right.
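To make "business behavior modeled with object collaborations" a bit more concrete, here is a small sketch. The classes `Flight`, `Passenger` and `Reservation` and the no-overbooking rule are my own hypothetical example, not taken from any real model. There is no application code at all: the rule is carried by the domain objects talking to each other.

```python
from typing import List


class Reservation:
    def __init__(self, flight: "Flight", passenger: "Passenger") -> None:
        self.flight = flight
        self.passenger = passenger


class Flight:
    def __init__(self, number: str, seats: int) -> None:
        self.number = number
        self.seats = seats
        self.reservations: List[Reservation] = []

    def has_free_seat(self) -> bool:
        return len(self.reservations) < self.seats

    def reserve_for(self, passenger: "Passenger") -> Reservation:
        # Hypothetical business rule kept in the domain: no overbooking.
        if not self.has_free_seat():
            raise ValueError(f"flight {self.number} is full")
        reservation = Reservation(self, passenger)
        self.reservations.append(reservation)
        passenger.reservations.append(reservation)
        return reservation


class Passenger:
    def __init__(self, name: str) -> None:
        self.name = name
        self.reservations: List[Reservation] = []

    def book(self, flight: Flight) -> Reservation:
        # The passenger collaborates with the flight; who does it matters, not how.
        return flight.reserve_for(self)
```

The behavior can be exercised directly on the objects, for example `Passenger("Ann").book(Flight("AY1", seats=1))`, long before any screen, workflow or database exists.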

3) Control over the level of abstraction

The second thing is to control the level of abstraction continuously and knowingly. This is very important as well. The main goal of the initial domain model is to understand and get a grip on the target reality. Thus the most important task of the modeler is to keep the amount of detail as low as possible and still capture as much structure and semantics as possible. These goals evidently contradict each other, so you have to try to balance them. In the initial abstract domain modeling phase the number of classes increases rapidly to close to the implementation-phase level, but the number of attributes should stay quite low during the whole abstract analysis. An average amount is around 5 % to 25 %. It should never (even in very small models) exceed 65 %. The final set of attributes should be discovered through application (UI) designs.

4) Abstract analysis phase

The responsibility for creating the abstract domain model always belongs to the people with the best operational knowledge of the business. If the group also includes top-management decision power, this will ease and speed up development, but it is not in any way necessary. This is always a group activity. It is extremely important that most of the important interest groups are represented. The starting phase is conceptual clarification and language calibration. Actually this yields the language of the business. Of course there has been a language before this, but this is an excellent opportunity to create crisp, sharp-edged concepts and their relationships. The name for all of this is an abstract OO domain model. This phase typically takes 3 to 10 working days of the group. It should be intensive enough; an optimal timing is three 6-hour sessions a week for a couple of weeks. It should never exceed 15 working days. During this process 3 to 10 versions of the domain model will be created.

5) Application design & implementation

I am convinced that agile methods are the best way to implement applications. But I think it is wise to have this abstract analysis phase before Scrum development with its sprint cycles begins. The iteration starts from the analysis domain model, which is generated into some OO programming language, for instance Java. The development of the different use cases of the application can happen almost independently. Of course there are common parts (GUIs and control) that can be shared, but even this is not necessary. Each iteration will add new attributes to the domain. The need for these arises from the GUI and workflow control.

Considerations

Is this difficult? This subject divides people. Some of us think it is difficult, others don’t. I have considered it easy (all my life). The basic thing is to grasp the world as a stage on which individual actors collaborate. If one sees the world in this way, as collaborations of autonomous agents, then there should be no obstacle to finding OO models easy. I actually get support for this view when I create an object diagram, instead of a class diagram, to illustrate some important or complex single case.

One of the professional skills here is the ability to read class diagrams. Every professional OO modeler has to be able to “see” through a given class diagram all the possible cases that follow from it.

*ADDD (Abstract Domain Driven Development)